2018-06 [MPO] Maximum a Posteriori Policy Optimisation
https://arxiv.org/abs/1806.06920
https://paperswithcode.com/paper/maximum-a-posteriori-policy-optimisation
https://github.com/theogruner/rl_pro_telu
MPO, DeepMind, Abbas Abdolmaleki, Martin Riedmiller, 2018