Topic | Reference |
---|---|
Deep Q-Learning | V. Mnih et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, Feb. 2015. |
Double DQN | H. van Hasselt, A. Guez, and D. Silver, “Deep Reinforcement Learning with Double Q-learning,” arXiv:1509.06461 [cs], Sep. 2015. |
DDPG | T. P. Lillicrap et al., “Continuous control with deep reinforcement learning,” arXiv:1509.02971 [cs, stat], Sep. 2015. |
Async. Deep RL | V. Mnih et al., “Asynchronous Methods for Deep Reinforcement Learning,” arXiv:1602.01783 [cs], Feb. 2016. |
Bootstrapped DQN | I. Osband, C. Blundell, A. Pritzel, and B. Van Roy, “Deep Exploration via Bootstrapped DQN,” arXiv:1602.04621 [cs, stat], Feb. 2016. |
Prioritized Replay | T. Schaul, J. Quan, I. Antonoglou, and D. Silver, “Prioritized Experience Replay,” arXiv:1511.05952 [cs], Nov. 2015. |
Dueling Network | Z. Wang, N. de Freitas, and M. Lanctot, “Dueling Network Architectures for Deep Reinforcement Learning,” arXiv:1511.06581 [cs], Nov. 2015. |
ViZDoom | M. Kempka, M. Wydmuch, G. Runc, J. Toczek, and W. Jaśkowski, “ViZDoom: A Doom-based AI Research Platform for Visual Reinforcement Learning,” arXiv:1605.02097 [cs], May 2016. |
OpenAI Gym | https://gym.openai.com/ |
Universe | https://universe.openai.com/ |
DeepMind Lab | https://deepmind.com/blog/open-sourcing-deepmind-lab/ |
Malmo | https://github.com/Microsoft/malmo |
AlphaGo | D. Silver et al., “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, Jan. 2016. |
Arnold | G. Lample and D. S. Chaplot, “Playing FPS Games with Deep Reinforcement Learning,” arXiv:1609.05521 [cs], Sep. 2016. |