Learning rate annealing recommends starting with a relatively high learning rate and then gradually lowering it during training. The intuition behind this approach is that we'd like to traverse quickly from the initial parameters to a range of "good" parameter values, but then we'd like a learning rate small enough to explore the "deeper, but narrower parts of the loss function."

Ref: https://cs231n.github.io/neural-networks-3/#annealing-the-learning-rate

Ref: https://www.jeremyjordan.me/nn-learning-rate/
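The annealing idea above can be sketched as a simple step-decay schedule. This is a minimal illustration, not taken from the references; the initial rate, drop factor, and drop interval are arbitrary example values.

```python
def step_decay_lr(initial_lr, epoch, drop_factor=0.5, epochs_per_drop=10):
    """Anneal the learning rate: multiply by `drop_factor`
    every `epochs_per_drop` epochs."""
    return initial_lr * (drop_factor ** (epoch // epochs_per_drop))

# Start high to move quickly through parameter space,
# then anneal to settle into narrower regions of the loss surface.
for epoch in (0, 10, 20, 30):
    print(epoch, step_decay_lr(0.1, epoch))
```

Other common variants anneal exponentially every epoch or follow a cosine curve; the key shared property is a high rate early and a small rate late in training.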
