weixin_39963830
2021-01-07 04:35

NaN problem in PPO1 and PPO2

I'm trying to train PPO1/PPO2 agents on my custom environment, but after some epochs the policy_loss, policy_entropy, and approxkl all become NaN. If I use the default policy network (two hidden layers of size 64), training is fine, but it fails with a bigger network (e.g. two layers of size 256). Is there any good idea or solution to this problem?
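For reference, a minimal sketch of how such a larger network might be specified in stable-baselines, assuming PPO2 with the standard MlpPolicy and default hyperparameters; the environment ID is a placeholder for my custom environment:

```python
import gym
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy

# "CustomEnv-v0" is a placeholder for my custom environment.
env = gym.make("CustomEnv-v0")

# The default network (two shared hidden layers of 64 units) trains
# without NaNs; the bigger variant below eventually produces NaN losses.
model = PPO2(
    MlpPolicy,
    env,
    policy_kwargs=dict(net_arch=[256, 256]),  # two shared layers of 256 units
    verbose=1,
)
model.learn(total_timesteps=1_000_000)
```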

[Screenshot: training log where policy_loss, policy_entropy, and approxkl become nan]

This question comes from the open-source project: hill-a/stable-baselines

8 replies
