fixes DQN run_n_episodes using the wrong environment variable #525

sidhantls · 2021-01-18T06:42:40Z

What does this PR do?

DQN's run_n_episodes method, which is used by test_step for testing the agent, uses the object's environment self.env and not its argument env to run the simulation steps. It resets env but takes simulation steps in self.env.

As a result, the testing stats are wrong: the agent steps through the wrong environment, while a different environment is the one being reset after each episode.
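To make the failure mode concrete, here is a minimal sketch that uses neither pl_bolts nor Gym. `CountingEnv`, `ToyModel` and `demo` are hypothetical stand-ins invented for illustration; only the pattern of resetting one env while stepping another mirrors the bug described above.

```python
class CountingEnv:
    """Hypothetical Gym-style env (4-tuple step API) that counts its calls."""

    def __init__(self, episode_length=3):
        self.episode_length = episode_length
        self.resets = 0
        self.steps = 0
        self._t = 0

    def reset(self):
        self.resets += 1
        self._t = 0
        return 0  # dummy state

    def step(self, action):
        self.steps += 1
        self._t += 1
        done = self._t >= self.episode_length
        return self._t, 1.0, done, {}


class ToyModel:
    """Hypothetical model that keeps its training env in self.env."""

    def __init__(self, train_env):
        self.env = train_env

    def run_n_episodes(self, env, n_episodes=1, buggy=False):
        # The buggy variant steps self.env while resetting the argument env.
        step_env = self.env if buggy else env
        total_rewards = []
        for _ in range(n_episodes):
            env.reset()
            done = False
            episode_reward = 0.0
            while not done:
                _, reward, done, _ = step_env.step(0)
                episode_reward += reward
            total_rewards.append(episode_reward)
        return total_rewards


def demo(buggy):
    train_env, test_env = CountingEnv(), CountingEnv()
    model = ToyModel(train_env)
    rewards = model.run_n_episodes(test_env, n_episodes=2, buggy=buggy)
    return rewards, train_env.steps, test_env.steps


buggy_result = demo(buggy=True)   # ([3.0, 1.0], 4, 0)
fixed_result = demo(buggy=False)  # ([3.0, 3.0], 0, 6)
print(buggy_result)
print(fixed_result)
```

In the buggy run the second episode ends after a single step, because the training env is never reset and still reports `done`, while the test env is reset twice but never stepped.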

Fixes #516

Before submitting

Was this discussed/approved via a GitHub issue? (not needed for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests? [not needed for typos/docs]
Did you verify new and existing tests pass locally with your changes?
If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

codecov · 2021-01-18T06:44:39Z

Codecov Report

Merging #525 (55f650d) into master (58aa93a) will not change coverage.
The diff coverage is 0.00%.

@@           Coverage Diff           @@
##           master     #525   +/-   ##
=======================================
  Coverage   78.95%   78.95%           
=======================================
  Files         105      105           
  Lines        6121     6121           
=======================================
  Hits         4833     4833           
  Misses       1288     1288
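As a quick check on the table above, the coverage figure can be reproduced from the hit and line counts. This is only a sketch: the truncation to two decimals is an assumption inferred from the numbers, not something the report states.

```python
# Figures copied from the Codecov table above.
lines, hits, misses = 6121, 4833, 1288

# Every tracked line is either hit or missed.
assert hits + misses == lines

# Coverage in hundredths of a percent, truncated via integer division
# (assumption: the report truncates rather than rounds).
coverage_hundredths = hits * 10_000 // lines
coverage_pct = coverage_hundredths / 100
print(coverage_pct)  # 78.95

# Rounding to nearest would give a slightly different figure.
rounded_pct = round(hits / lines * 100, 2)
print(rounded_pct)  # 78.96
```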

Flag	Coverage Δ
cpu	`25.66% <0.00%> (ø)`
pytest	`25.66% <0.00%> (ø)`
unittests	`78.48% <0.00%> (ø)`

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
pl_bolts/models/rl/dqn_model.py	`78.98% <0.00%> (ø)`

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 58aa93a...e43ff17.

Borda · 2021-01-18T07:53:16Z

pl_bolts/models/rl/dqn_model.py

@@ -171,7 +171,7 @@ def run_n_episodes(self, env, n_epsiodes: int = 1, epsilon: float = 1.0) -> List
            while not done:
                self.agent.epsilon = epsilon
                action = self.agent(episode_state, self.device)
-                next_state, reward, done, _ = self.env.step(action[0])
+                next_state, reward, done, _ = env.step(action[0])


shall we also assign the env back, i.e. self.env = env?

sidhantls · 2021-01-18

if we assign the test env to self.env in test_step, then any training done after testing will use the test env (without a seed) instead of the env it was initialized with for training (with a seed)
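The concern in this reply can be sketched with a toy example. Everything here is hypothetical (`NoisyEnv`, `ToyModel`, the `test_env` attribute, the `reassign` flag and the seed value are invented for illustration); it assumes only what the reply states, namely that the training env is seeded and the test env is not.

```python
import random


class NoisyEnv:
    """Hypothetical env whose reward stream comes from its own RNG."""

    def __init__(self, seed=None):
        self.seed = seed
        self.rng = random.Random(seed)  # seed=None gives a non-reproducible stream

    def step(self, action):
        return self.rng.random()


class ToyModel:
    def __init__(self, seed):
        self.env = NoisyEnv(seed)   # training env, seeded
        self.test_env = NoisyEnv()  # test env, unseeded

    def test_step(self, reassign=False):
        env = self.test_env
        if reassign:
            self.env = env  # the reassignment raised in review
        return env.step(0)

    def training_step(self):
        return self.env.step(0)


# First value a freshly seeded training env would produce.
expected = random.Random(123).random()

kept = ToyModel(seed=123)
kept.test_step(reassign=False)
kept_matches = kept.training_step() == expected  # True: training stays reproducible

swapped = ToyModel(seed=123)
swapped.test_step(reassign=True)
swapped_uses_test_env = swapped.env is swapped.test_env  # True: seeded env is gone
```

Passing the test env as an argument, as the diff above does, keeps `self.env` untouched, so training after a test run continues on the seeded env.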

Borda · 2021-01-18T09:33:30Z

@sid-sundrani it seems that the macOS test is hanging, mind having a look at it?

sidhantls · 2021-01-18T09:50:38Z

> @sid-sundrani it seems that the macOS test is hanging, mind having a look at it?

happening on 3.8. Sure, I'm looking into it

Borda · 2021-01-18T14:44:19Z

> > @sid-sundrani it seems that the macOS test is hanging, mind having a look at it?
>
> happening on 3.8. Sure, I'm looking into it

it seems that it is also hanging on master, so it is most likely unrelated to this PR, but you are very welcome to track down the hanging cases...

sidhantls · 2021-01-18T16:46:23Z

> > > @sid-sundrani it seems that the macOS test is hanging, mind having a look at it?
> >
> > happening on 3.8. Sure, I'm looking into it
>
> it seems that it is also hanging on master, so it is most likely unrelated to this PR, but you are very welcome to track down the hanging cases...

yes, likely unrelated, as the TestValueAgent tests pass

the hanging cases are unusual: in this PR, the pytest run gets canceled after test_simclr, 3 tests later than where the same run was canceled in the previously merged PR. I remember there was an issue in the past, perhaps an OOM, with a failing self_supervised test (#409)

sidhantls requested review from akihironitta, ananyahjha93 and Borda as code owners January 18, 2021 06:42

github-actions bot added the model label Jan 18, 2021

Borda approved these changes Jan 18, 2021

use argument variable instead of self variable

55f650d

Borda force-pushed the dqn_bug branch from 31218bd to 55f650d January 18, 2021 07:53

Borda added the ready label Jan 18, 2021

Borda added this to the v0.3 milestone Jan 18, 2021

chlog

e43ff17

Borda merged commit da35d3d into Lightning-Universe:master Jan 18, 2021

This was referenced Mar 13, 2021

[tune](deps): Bump pytorch-lightning-bolts from 0.2.5 to 0.3.0 in /python/requirements suquark/ray#8

Closed

[tune](deps): Bump pytorch-lightning-bolts from 0.2.5 to 0.3.0 in /python/requirements sven1977/ray#8

Closed
