Proximal Policy Optimization with Mixed Distributed Training