Does your PPO agent fail to learn?

One hyperparameter could improve the stability of learning and help your agent explore! We investigate how to make training more reliable when using the Stable Baselines3 library with ViZDoom, the PyTorch deep learning library, and Python 3.