SliceOut: Training Transformers and CNNs faster while using less memory
https://arxiv.org/abs/2007.10909
Transformer, CNN, Memory, Memory saving, Dropout, Regularization