Proximal Policy Optimization with Mixed Distributed Training

https://arxiv.org/abs/1907.06479v3

PPO, Distributed Computing, PBT, LASER, 2019