The Value Function Polytope in Reinforcement Learning

https://arxiv.org/abs/1901.11524v3

Marc G. Bellemare, Google