-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fixes DQN run_n_episodes using the wrong environment variable #525
Conversation
Codecov Report
@@ Coverage Diff @@
## master #525 +/- ##
=======================================
Coverage 78.95% 78.95%
=======================================
Files 105 105
Lines 6121 6121
=======================================
Hits 4833 4833
Misses 1288 1288
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
@@ -171,7 +171,7 @@ def run_n_episodes(self, env, n_epsiodes: int = 1, epsilon: float = 1.0) -> List | |||
while not done: | |||
self.agent.epsilon = epsilon | |||
action = self.agent(episode_state, self.device) | |||
next_state, reward, done, _ = self.env.step(action[0]) | |||
next_state, reward, done, _ = env.step(action[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we also assign the env back, self.env = env
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we assign the test env to self.env
in test_step
, when training is done after testing, it'll use the test env (without seed) instead of what it was initialized with for training (env with seed)
@sid-sundrani it seems that the macOS test is hanging, mind have look at it? |
happening on 3.8. sure I'm looking into it |
it seems that it is hanging on master so most likely unrelated to this PR, but very welcome to find the hanging cases... |
yes, likely unrelated as the the hanging cases are unusual- in this PR, the pytest gets canceled after test_simclr, 3 more tests later than where the same pytest was canceled in the previous pr that was merged. i remember there was an issue in the past with perhaps OOM in a self_supervised failing test (#409) |
What does this PR do?
DQN's
run_n_episodes
method, which is used bytest_step
for testing the agent, uses the object's environmentself.env
and not its argumentenv
to run the simulation steps. It resetsenv
but takes simulation steps inself.env
.As a result, the testing stats are wrong because the wrong environment is used by the agent and a different environment is being reset after each episode
Fixes #516
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃