2024-11 Transformers are Multi-State RNNs
https://arxiv.org/abs/2401.06104
LLM, GPT, memory optimization, 2024