Hi, I had the same question. You can check the output file named like 'xxx_eval_scores.npy'. The 'steps' recorded there are environment steps, which should equal the 'S' in the log multiplied by the action repeat.
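To make the relationship concrete, here is a minimal sketch of the conversion, assuming 'S' in the log counts policy steps. The helper names and the action repeat of 4 are illustrative assumptions, not taken from the repository; substitute the action repeat you actually ran with.

```python
def env_steps(policy_steps: int, action_repeat: int) -> int:
    """Environment steps consumed after `policy_steps` agent decisions."""
    if action_repeat < 1:
        raise ValueError("action_repeat must be a positive integer")
    # Each policy step repeats the chosen action `action_repeat` times.
    return policy_steps * action_repeat


def policy_steps_for(env_step_budget: int, action_repeat: int) -> int:
    """Value of 'S' at which an environment-step budget is reached."""
    if action_repeat < 1:
        raise ValueError("action_repeat must be a positive integer")
    return env_step_budget // action_repeat


ACTION_REPEAT = 4  # assumed value for illustration only

for budget in (100_000, 500_000):
    s = policy_steps_for(budget, ACTION_REPEAT)
    print(f"{budget} environment steps -> S = {s} in the log")
```

Under that assumption, the 100K and 500K environment-step benchmarks would correspond to S = 25000 and S = 125000 in the log.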
Hi, @MishaLaskin. Thanks for sharing your code.
After reading your code and running two examples on Cheetah Run, I am confused about the definition of "step" used in your code. In each "step", the agent interacts with the environment once, so a step should be a "policy step". However, your README says that the step "S" is the total number of environment steps. After running your code myself, I think the scores reported in the RAD paper correspond to the "S" in your log, which is not consistent with the definition of 100K/500K environment steps.
Could you tell me whether I have misunderstood something?
The attached logs are from two eval.log files.