fixes DQN run_n_episodes using the wrong environment variable #525

Merged
merged 2 commits into from
Jan 18, 2021

sidhantls
Contributor

@sidhantls sidhantls commented Jan 18, 2021

What does this PR do?

DQN's run_n_episodes method, which test_step uses to evaluate the agent, runs the simulation steps in the object's own environment, self.env, instead of in its env argument. It resets env but steps self.env.

As a result, the testing stats are wrong: the agent acts in one environment while a different environment is reset after each episode.
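The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the pl_bolts implementation: `CountingEnv`, `Runner`, `run_buggy`, and `run_fixed` are hypothetical names, and the environment is a stand-in that ends every episode after three steps and returns a reward of 1.0 per step.

```python
from typing import List, Tuple


class CountingEnv:
    """Hypothetical stand-in for a gym-style environment; counts its calls."""

    def __init__(self, episode_length: int = 3) -> None:
        self.episode_length = episode_length
        self.resets = 0
        self.steps = 0
        self._t = 0

    def reset(self) -> int:
        self.resets += 1
        self._t = 0
        return self._t

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        self.steps += 1
        self._t += 1
        done = self._t >= self.episode_length
        return self._t, 1.0, done, {}


class Runner:
    """Hypothetical, stripped-down model that owns a training environment."""

    def __init__(self, train_env: CountingEnv) -> None:
        self.env = train_env

    def run_buggy(self, env: CountingEnv, n_episodes: int = 1) -> List[float]:
        rewards = []
        for _ in range(n_episodes):
            env.reset()  # resets the argument ...
            done, total = False, 0.0
            while not done:
                # ... but steps self.env, which is never reset here
                _, reward, done, _ = self.env.step(0)
                total += reward
            rewards.append(total)
        return rewards

    def run_fixed(self, env: CountingEnv, n_episodes: int = 1) -> List[float]:
        rewards = []
        for _ in range(n_episodes):
            env.reset()
            done, total = False, 0.0
            while not done:
                # resets and steps now hit the same environment
                _, reward, done, _ = env.step(0)
                total += reward
            rewards.append(total)
        return rewards


buggy_train_env, buggy_test_env = CountingEnv(), CountingEnv()
buggy_rewards = Runner(buggy_train_env).run_buggy(buggy_test_env, n_episodes=2)
print(buggy_rewards)         # [3.0, 1.0]: the second episode ends after one step
print(buggy_test_env.steps)  # 0: the test environment was never stepped

fixed_train_env, fixed_test_env = CountingEnv(), CountingEnv()
fixed_rewards = Runner(fixed_train_env).run_fixed(fixed_test_env, n_episodes=2)
print(fixed_rewards)         # [3.0, 3.0]
```

In this toy setup the buggy variant reports a shortened second episode because the stepped environment is still in its finished state, which is one way the reported stats can go wrong.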

Fixes #516

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests? [not needed for typos/docs]
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@codecov

codecov bot commented Jan 18, 2021

Codecov Report

Merging #525 (55f650d) into master (58aa93a) will not change coverage.
The diff coverage is 0.00%.

@@           Coverage Diff           @@
##           master     #525   +/-   ##
=======================================
  Coverage   78.95%   78.95%           
=======================================
  Files         105      105           
  Lines        6121     6121           
=======================================
  Hits         4833     4833           
  Misses       1288     1288           
Flag        Coverage Δ
cpu         25.66% <0.00%> (ø)
pytest      25.66% <0.00%> (ø)
unittests   78.48% <0.00%> (ø)

Impacted Files                     Coverage Δ
pl_bolts/models/rl/dqn_model.py    78.98% <0.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 58aa93a...e43ff17.

@@ -171,7 +171,7 @@ def run_n_episodes(self, env, n_epsiodes: int = 1, epsilon: float = 1.0) -> List
             while not done:
                 self.agent.epsilon = epsilon
                 action = self.agent(episode_state, self.device)
-                next_state, reward, done, _ = self.env.step(action[0])
+                next_state, reward, done, _ = env.step(action[0])
Member

Shall we also assign the env back, i.e. self.env = env?

Contributor Author

@sidhantls sidhantls Jan 18, 2021

If we assign the test env to self.env in test_step, any training done after testing will use the test env (created without a seed) instead of the seeded env the model was initialized with for training.
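The side effect described in this reply can be sketched as follows. This is an illustration only: `SeededEnv`, `Model`, `test_step_reassigning`, and `test_step_local` are hypothetical names, and the assumption that the model keeps a separate unseeded `test_env` attribute is made for the sake of the example.

```python
from typing import Optional


class SeededEnv:
    """Hypothetical environment that only records the seed it was built with."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self.seed = seed


class Model:
    """Hypothetical model keeping separate training and test environments."""

    def __init__(self, seed: int) -> None:
        self.env = SeededEnv(seed)   # seeded training environment
        self.test_env = SeededEnv()  # unseeded test environment (assumed attribute)

    def test_step_reassigning(self) -> None:
        # Variant raised in review: point self.env at the test environment.
        self.env = self.test_env

    def test_step_local(self) -> SeededEnv:
        # Variant kept in this PR: hand the test environment to the caller
        # and leave self.env untouched.
        return self.test_env


reassigned = Model(seed=123)
reassigned.test_step_reassigning()
print(reassigned.env.seed)  # None: later training would run on the unseeded env

untouched = Model(seed=123)
untouched.test_step_local()
print(untouched.env.seed)   # 123: the training environment is preserved
```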

@Borda
Member

Borda commented Jan 18, 2021

@sid-sundrani it seems that the macOS test is hanging, mind having a look at it?

@sidhantls
Contributor Author

> @sid-sundrani it seems that the macOS test is hanging, mind having a look at it?

It's happening on 3.8. Sure, I'm looking into it.

@Borda Borda added this to the v0.3 milestone Jan 18, 2021
@Borda
Member

Borda commented Jan 18, 2021

> > @sid-sundrani it seems that the macOS test is hanging, mind having a look at it?
>
> It's happening on 3.8. Sure, I'm looking into it.

It seems that it is hanging on master as well, so it is most likely unrelated to this PR, but you are very welcome to track down the hanging cases...

@Borda Borda merged commit da35d3d into Lightning-Universe:master Jan 18, 2021
@sidhantls
Contributor Author

sidhantls commented Jan 18, 2021

> > > @sid-sundrani it seems that the macOS test is hanging, mind having a look at it?
> >
> > It's happening on 3.8. Sure, I'm looking into it.
>
> It seems that it is hanging on master as well, so it is most likely unrelated to this PR, but you are very welcome to track down the hanging cases...

Yes, likely unrelated, as the TestValueAgent tests pass.

The hanging cases are unusual: in this PR, the pytest run gets canceled after test_simclr, three tests later than where the same run was canceled in the previously merged PR. I remember a past issue, possibly an OOM, with a failing self_supervised test (#409).

Development

Successfully merging this pull request may close these issues.

DQN run_n_episodes ignores environment parameter
2 participants