====== Action Space ======

===== Large Action Space =====

  * [[https://arxiv.org/pdf/2001.08116.pdf|Q-Learning in Enormous Action Spaces via Amortized Approximate Maximization, 2020-01]]
    * DeepMind, Volodymyr Mnih
  * action space restriction
    * masked softmax
      * https://torchcraft.github.io/TorchCraftAI/docs/bptut-rl.html
      * https://gist.github.com/kaniblu/94f3ede72d1651b087a561cf80b306ca
      * https://discuss.pytorch.org/t/apply-mask-softmax/14212/12
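
A minimal sketch of the masked softmax idea, in plain Python so it runs without dependencies. It is not taken from any of the linked pages; the function name ''masked_softmax'' and the example logits and mask are assumptions made for illustration. Invalid actions are given probability exactly 0 and the remaining probabilities are renormalized over the valid actions only. In a tensor library the same effect is commonly obtained by setting the masked logits to negative infinity before an ordinary softmax.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over `logits`, restricted to entries where `mask` is True.

    Hypothetical helper for illustration: masked-out (invalid) actions
    receive probability exactly 0.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    valid = [x for x, m in zip(logits, mask) if m]
    if not valid:
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    peak = max(valid)
    # Masked entries contribute 0 and are never exponentiated.
    exps = [math.exp(x - peak) if m else 0.0 for x, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Example: four actions, the last two are not allowed in the current state.
probs = masked_softmax([2.0, 1.0, 0.5, 3.0], [True, True, False, False])
```

Note that the action with the largest raw logit (3.0) is masked out here, so it cannot be sampled no matter how strongly the network prefers it. That is the point of restricting the action space at the policy output rather than penalizing invalid actions through the reward.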