I defined my own environment in code and, after training for a small number of steps, I want to evaluate the saved model. In my_env.py I set 20 steps as the end condition. During evaluation, truncated does become tensor(false) after step 20 in my_env.py, which matches the settings in my env, yet the render function in evaluator.py runs 234 times before it stops. Why is that?
I also printed the five values from "obs, rew, cost, terminated, truncated, _ = self._env.step(act)" in the render function of evaluator.py, and they are not the same as the values returned by the step function in my_env.py. Is it because they are different entities? Any help would be appreciated, thank you!
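To make the comparison concrete, here is a minimal sketch of the kind of check I am doing. It does not use my real code: `ToyEnv`, `LoggingWrapper`, and `run_episode` are hypothetical stand-ins, and plain Python values replace the tensors my env returns. It records what the env's step function returns and what the calling loop receives, so the two can be compared step by step.

```python
class ToyEnv:
    """Hypothetical stand-in for my_env.py: truncates after max_steps steps."""

    def __init__(self, max_steps=20):
        self.max_steps = max_steps
        self.count = 0

    def reset(self):
        self.count = 0
        return 0.0, {}

    def step(self, act):
        self.count += 1
        obs, rew, cost = float(self.count), 1.0, 0.0
        terminated = False
        # Plain bool here; the real env returns a tensor.
        truncated = self.count >= self.max_steps
        return obs, rew, cost, terminated, truncated, {}


class LoggingWrapper:
    """Hypothetical wrapper that records exactly what the inner env returned."""

    def __init__(self, env):
        self.env = env
        self.log = []

    def reset(self):
        return self.env.reset()

    def step(self, act):
        out = self.env.step(act)
        self.log.append(out[:5])  # obs, rew, cost, terminated, truncated
        return out


def run_episode(env, step_cap=1000):
    """Evaluation-style loop (an assumption, not the real render function)."""
    env.reset()
    seen = []
    for _ in range(step_cap):
        obs, rew, cost, terminated, truncated, _ = env.step(0)
        seen.append((obs, rew, cost, terminated, truncated))
        if terminated or truncated:
            break
    return seen


env = LoggingWrapper(ToyEnv(max_steps=20))
seen = run_episode(env)
print(len(seen), seen == env.log)
```

In this toy setup the loop stops at step 20 and both logs agree. In my real setup they do not, so my guess (unconfirmed) is that something between my_env.py and the loop in evaluator.py changes the values.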