Large Batch
2020-11 Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour
2019-12 LARC
2017-06 Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour