2019-12 LARC
https://developer.nvidia.com/blog/pretraining-bert-with-layer-wise-adaptive-learning-rates/
https://github.com/NVIDIA/apex/blob/master/apex/parallel/LARC.py
https://github.com/kakaobrain/torchlars
https://www.kakaobrain.com/blog/113
larc, pytorch, optimizer, large batch