====== Reinforcement Learning ======

  * [[https://arxiv.org/pdf/1912.00167.pdf|2019-12, IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks]]
    * IMPALA, V-trace
  * [[https://arxiv.org/pdf/1909.11583.pdf|2019-11, Off-Policy Actor-Critic with Shared Experience Replay]]
    * IMPALA, experience replay (ER)
  * [[https://arxiv.org/pdf/1905.02363.pdf|2019-05, Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning]]
    * IMPALA, V-trace, GAE