drl:start — Differences

This page shows the differences between the two selected revisions of the document.

Previous revision: drl:start [2016/11/30 06:18] – [Application] rex8312
Current revision: drl:start [2024/03/23 02:42] – external edit by 127.0.0.1
===== General Video Game Playing =====

==== Learning Algorithm ====

  * [[https://

==== Reward Shaping ====

  * [[https://
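The reward-shaping entry above links related work only by URL. As background, a standard formulation in this area is potential-based shaping (Ng, Harada, and Russell, 1999), which adds a term that provably leaves the optimal policy unchanged. The sketch below is illustrative only; the potential function `phi` is a hypothetical example, not taken from any listed paper.

```python
# Illustrative sketch of potential-based reward shaping:
# F(s, s') = gamma * phi(s') - phi(s) is added to the raw reward,
# and the optimal policy is unchanged by this addition.
GAMMA = 0.99  # discount factor

def phi(state: int) -> float:
    # Hypothetical potential: negative distance to a goal at position 10.
    return -abs(10 - state)

def shaped_reward(r: float, s: int, s2: int) -> float:
    return r + GAMMA * phi(s2) - phi(s)

# Moving toward the goal yields a positive shaping bonus,
# moving away yields a penalty.
print(shaped_reward(0.0, 3, 4), shaped_reward(0.0, 4, 3))
```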
==== Simulation ====

  * [[https://
  * [[https://
  *

==== Learning by Instruction ====

  * [[https://

==== ?? ====

  * ??
  * Z. C. Lipton, J. Gao, L. Li, X. Li, F. Ahmed, and L. Deng, “Efficient Exploration for Dialogue Policy Learning with BBQ Networks & Replay Buffer Spiking,” arXiv:
  * Adaptive Normalization
    * H. van Hasselt et al., “Learning values across many orders of magnitude,”
  * Progressive Network
    * A. A. Rusu et al., “Progressive neural networks,”
  * [[https://
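The adaptive-normalization entry above (van Hasselt et al., “Learning values across many orders of magnitude”) can be sketched roughly as follows. This is a simplified illustration only: it keeps running estimates of the targets' first and second moments and trains on normalized targets so the loss scale stays stable; the full Pop-Art algorithm also rescales the output layer so unnormalized predictions are preserved, which is omitted here.

```python
import numpy as np

# Simplified sketch of adaptive target normalization: track running
# moments of the targets and normalize before computing the loss.
class TargetNormalizer:
    def __init__(self, beta: float = 0.01):
        self.beta = beta              # step size for the moment estimates
        self.mean, self.sq_mean = 0.0, 1.0

    def update(self, targets) -> None:
        t = np.asarray(targets, dtype=float)
        self.mean += self.beta * (t.mean() - self.mean)
        self.sq_mean += self.beta * ((t ** 2).mean() - self.sq_mean)

    @property
    def std(self) -> float:
        return float(np.sqrt(max(self.sq_mean - self.mean ** 2, 1e-8)))

    def normalize(self, targets):
        return (np.asarray(targets, dtype=float) - self.mean) / self.std

norm = TargetNormalizer()
for _ in range(1000):
    norm.update([1e4, 2e4])           # targets spanning a large scale
print(norm.normalize([1e4, 2e4]))     # roughly unit-scale values
```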
==== Deep Reinforcement Learning ====
^Deep Q-Learning |V. Mnih et al., “Human-level control through deep reinforcement learning,” |
^ | |
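The Deep Q-Learning entry above (Mnih et al.) can be illustrated with a toy sketch of the two ingredients DQN adds to plain Q-learning: an experience-replay buffer and a periodically synced target network. A tabular Q is used instead of a deep network purely to keep the example small, and the 5-state chain environment is a made-up example, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2
q = np.zeros((n_states, n_actions))   # online Q estimate
q_target = q.copy()                    # frozen target copy
replay = []                            # stores (s, a, r, s2, done)
gamma, lr, sync_every = 0.9, 0.1, 50

def step_env(s, a):
    # Toy chain MDP: action 1 moves right; the last state gives reward 1.
    s2 = min(s + a, n_states - 1)
    return s2, float(s2 == n_states - 1), s2 == n_states - 1

s = 0
for t in range(2000):
    # epsilon-greedy behaviour policy
    a = int(rng.integers(n_actions)) if rng.random() < 0.2 else int(q[s].argmax())
    s2, r, done = step_env(s, a)
    replay.append((s, a, r, s2, done))
    s = 0 if done else s2

    # minibatch sampled uniformly from replay; bootstrap from q_target
    for i in rng.integers(len(replay), size=8):
        bs, ba, br, bs2, bdone = replay[i]
        target = br if bdone else br + gamma * q_target[bs2].max()
        q[bs, ba] += lr * (target - q[bs, ba])

    if t % sync_every == 0:
        q_target = q.copy()            # periodic target-network sync

print(q[0].argmax())                   # learned greedy action in state 0
```

Sampling from replay breaks the correlation between consecutive transitions, and bootstrapping from the frozen copy keeps the regression target from chasing itself — the two stabilization tricks the paper is known for.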
==== Architecture ====
^Dueling Network |Z. Wang, N. de Freitas, and M. Lanctot, “Dueling network architectures for deep reinforcement learning,” |
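The dueling architecture listed above splits the network head into a state value V(s) and per-action advantages A(s, a), recombined as Q(s, a) = V(s) + A(s, a) − mean over a' of A(s, a'); subtracting the mean makes the V/A split identifiable. The numbers below are made up for illustration.

```python
import numpy as np

# Combine a state value and advantages into Q-values using the
# mean-advantage baseline from the dueling-network paper.
def dueling_q(value: float, advantages) -> np.ndarray:
    adv = np.asarray(advantages, dtype=float)
    return value + adv - adv.mean()

print(dueling_q(2.0, [1.0, -1.0, 0.0]))  # -> [3. 1. 2.]
```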
==== AI Platform ====
^ViZDoom |M. Kempka, M. Wydmuch, G. Runc, J. Toczek, and W. Jaśkowski, “ViZDoom: A Doom-based AI Research Platform for Visual Reinforcement Learning,” |
^OpenAI Gym |https:// |
^Universe |https:// |
^DeepMind Lab |https:// |
^Malmo |https:// |
==== Application ====
^AlphaGo |D. Silver et al., “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, Jan. 2016. |
^Arnold |G. Lample and D. S. Chaplot, “Playing FPS Games with Deep Reinforcement Learning,” |
===== Implementations =====
  * https://
drl/start.1480486707.txt.gz · Last modified: (external edit)