Width of Minima Reached by Stochastic Gradient Descent Is Influenced by Learning Rate to Batch Size Ratio
Lecture Notes in Computer Science - Germany
doi 10.1007/978-3-030-01424-7_39
Date
January 1, 2018
Authors
Publisher
Springer International Publishing