One cycle paper
- Smith, L. N. (2017, March). Cyclical learning rates for training neural networks. In 2017 IEEE Winter Conference on Applications of Computer Vision (WACV) (pp. 464-472). IEEE.
- Practically eliminates the need to experimentally find the best values and schedule for the global learning rate.
- The learning rate varies cyclically between reasonable boundary values.
- The paper also describes a simple way to estimate "reasonable bounds": linearly increase the learning rate of the network for a few epochs.
- increasing the learning rate might have a short term negative effect and yet achieve a longer term beneficial effect.
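- The learning rate varies within a range of values rather than adopting a stepwise fixed or exponentially decreasing value.
- Triangular window: the learning rate rises linearly from the lower bound to the upper bound, then falls linearly back, and the cycle repeats.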
- Saddle points have small gradients that slow the learning process. However, increasing the learning rate allows more rapid traversal of saddle point plateaus.
- It is likely that the optimum learning rate lies between the bounds, so near-optimal learning rates are used throughout training.
- Experiments show that it is often good to set stepsize (the number of iterations in half a cycle) to 2 to 10 times the number of iterations in an epoch.
- A single LR range test provides both a good LR value and a good range.
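
A minimal sketch of the LR range test described above: ramp the learning rate linearly from a low value to a high value and record a metric at each step. The names `lr_range_test`, `train_step`, and `make_toy_step` are hypothetical, and the toy quadratic only stands in for a real network. How the bounds are read off the recorded curve is left to inspection.

```python
def lr_range_test(train_step, min_lr, max_lr, num_iterations):
    """Call train_step once per iteration while the LR ramps linearly.

    train_step is a hypothetical callable: it takes a learning rate,
    performs one update, and returns a metric such as the loss.
    Returns a list of (lr, metric) pairs to plot or inspect.
    """
    history = []
    for i in range(num_iterations):
        # Fraction of the ramp completed, from 0.0 to 1.0 inclusive.
        frac = i / max(1, num_iterations - 1)
        lr = min_lr + frac * (max_lr - min_lr)
        history.append((lr, train_step(lr)))
    return history


def make_toy_step(w0=1.0):
    """Toy stand-in for a model: gradient descent on f(w) = 0.5 * w**2."""
    state = {"w": w0}

    def step(lr):
        state["w"] -= lr * state["w"]  # the gradient of 0.5 * w**2 is w
        return 0.5 * state["w"] ** 2

    return step


# Illustrative values only; a real test would ramp over a few epochs.
history = lr_range_test(make_toy_step(), min_lr=0.0, max_lr=3.0, num_iterations=50)
```

In a real run, `train_step` would wrap one mini-batch update and return accuracy or loss, and the bounds would be chosen from a plot of `history`.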
Note: Don't forget to use a range of learning rates instead of just one.
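
The triangular policy from the notes above can be sketched as a function of the iteration count. This follows the formula given in the paper; the function name `triangular_lr` and the numeric values in the usage lines are illustrative assumptions, not recommendations.

```python
import math


def triangular_lr(iteration, step_size, base_lr, max_lr):
    """Learning rate at a given iteration under the triangular policy.

    step_size is the number of iterations in half a cycle, so one full
    cycle (up and back down) takes 2 * step_size iterations.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    # x runs 1 -> 0 -> 1 across each cycle.
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Illustrative setup: 500 iterations per epoch, stepsize of 4 epochs.
iterations_per_epoch = 500  # assumed value
step_size = 4 * iterations_per_epoch
schedule = [
    triangular_lr(it, step_size, base_lr=0.001, max_lr=0.006)
    for it in range(2 * step_size + 1)  # one full cycle
]
```

The schedule starts at `base_lr`, peaks at `max_lr` after `step_size` iterations, and returns to `base_lr` at the end of the cycle.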