2020-06 Conservative Q-Learning for Offline Reinforcement Learning
https://arxiv.org/abs/2006.04779
CQL, offline RL, RL, Aviral Kumar, Sergey Levine, 2020