====== 2019-12 LARC ======

Links on LARC (Layer-wise Adaptive Rate Control), a per-layer learning-rate scaling technique for large-batch training, with PyTorch implementations:

  * https://developer.nvidia.com/blog/pretraining-bert-with-layer-wise-adaptive-learning-rates/
  * https://github.com/NVIDIA/apex/blob/master/apex/parallel/LARC.py
  * https://github.com/kakaobrain/torchlars
  * https://www.kakaobrain.com/blog/113

{{tag>larc pytorch optimizer large_batch}}
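The core of LARC is a per-layer "trust ratio": each layer's learning rate is rescaled by ''||w|| / (||g|| + wd·||w||)'' times a trust coefficient, optionally clipped so it never exceeds the global rate. A minimal dependency-free sketch of that computation, assuming the clipping-mode formulation used in the NVIDIA apex implementation (the function name ''larc_local_lr'' and its defaults are illustrative, not the apex API):

```python
import math


def larc_local_lr(param, grad, global_lr,
                  trust_coeff=0.001, weight_decay=0.0,
                  eps=1e-8, clip=True):
    """Compute a LARC-style adjusted learning rate for one layer.

    param, grad: flat lists of floats standing in for a layer's
    weights and gradients (a real implementation works on tensors).
    """
    p_norm = math.sqrt(sum(x * x for x in param))
    g_norm = math.sqrt(sum(x * x for x in grad))
    if p_norm == 0.0 or g_norm == 0.0:
        # Degenerate layer (e.g. zero-initialized bias): keep the global rate.
        return global_lr
    # Trust ratio: large weights / small gradients -> larger step, and vice versa.
    adaptive_lr = trust_coeff * p_norm / (g_norm + weight_decay * p_norm + eps)
    if clip:
        # "Clipping" mode: the layer rate never exceeds the global rate.
        return min(adaptive_lr, global_lr)
    # "Scaling" mode: multiply the global rate by the trust ratio.
    return adaptive_lr * global_lr


# Gradients large relative to weights -> the layer's rate is pulled well below 0.1.
lr = larc_local_lr([1.0, 2.0], [10.0, 10.0], global_lr=0.1)
```

In practice one would not reimplement this: apex's ''LARC'' and kakaobrain's ''torchlars'' (linked above) wrap an existing ''torch.optim'' optimizer and apply the scaling per parameter group at each step.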