Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clear replay buffer after trajectory collection #425

Merged
merged 1 commit into from
Dec 7, 2020

Conversation

sidhantls
Copy link
Contributor

@sidhantls sidhantls commented Dec 3, 2020

What does this PR do?

clears replay buffer (states, actions, qvals lists) after each training batch.

the num_batch_episodes parameter in reinforce controls the number of episodes to rollout in each training batch. However, since the buffer is never cleared, each training batch has all previous episodes instead of just num_batch_episodes . in consequence even the training time is abnormally large as all trajectories from start are accumulating in the training dataset, instead of just having num_batch_episodes trajectories under the current policy.

The script cited in reinforce_model.py also does clear the replay buffer as in the fix.

Fixes #399

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@codecov
Copy link

codecov bot commented Dec 3, 2020

Codecov Report

Merging #425 (2c60d3d) into master (7c2e651) will decrease coverage by 0.09%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #425      +/-   ##
==========================================
- Coverage   81.09%   80.99%   -0.10%     
==========================================
  Files         100      100              
  Lines        5722     5725       +3     
==========================================
- Hits         4640     4637       -3     
- Misses       1082     1088       +6     
Flag Coverage Δ
cpu 25.22% <0.00%> (-0.02%) ⬇️
pytest 25.22% <0.00%> (-0.02%) ⬇️
unittests 80.36% <0.00%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pl_bolts/models/rl/reinforce_model.py 87.90% <0.00%> (-2.18%) ⬇️
...l_bolts/models/rl/vanilla_policy_gradient_model.py 91.96% <0.00%> (-2.68%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7c2e651...2c60d3d. Read the comment docs.

@akihironitta akihironitta added fix fixing issues... model labels Dec 4, 2020
Copy link
Member

@Borda Borda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@akihironitta akihironitta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sid-sundrani LGTM. Thank you for your contribution!

@akihironitta akihironitta merged commit cbeb143 into Lightning-Universe:master Dec 7, 2020
chris-clem pushed a commit to chris-clem/pytorch-lightning-bolts that referenced this pull request Dec 9, 2020
chris-clem pushed a commit to chris-clem/pytorch-lightning-bolts that referenced this pull request Dec 16, 2020
Borda added a commit that referenced this pull request Jan 18, 2021
* Add DCGAN module

* Undo black on conf.py

* Add tests for DCGAN

* Fix flake8 and codefactor

* Add types and small refactoring

* Make image sampler callback work

* Upgrade DQN to use .log (#404)

* Upgrade DQN to use .log

* remove unused

* pep8

* fixed other dqn

* fix loss test case for batch size variation (#402)

* Decouple DataModules from Models - CPCV2 (#386)

* Decouple dms from CPCV2

* Update tests

* Add docstrings, fix import, and update changelog

* Update transforms

* bugfix: batch_size parameter for DataModules remaining (#344)

* bugfix: batch_size for DataModules remaining

* Update sklearn datamodule tests

* Fix default_transforms. Keep internal for every data module

* fix typo on binary_mnist_datamodule

thanks @akihironitta

Co-authored-by: Akihiro Nitta <[email protected]>

Co-authored-by: Akihiro Nitta <[email protected]>

* Fix a typo/copy paste error (#415)

* Just a Typo (#413)

missing a ' at the end of dataset='stl10

* Remove unused arguments (#418)

* tests: Use cached datasets in LitMNIST and the doctests (#414)

* Use cached datasets

* Use cached datasets in doctests

* clear replay buffer after trajectory (#425)

* stale: update label

* bugfix: Add missing imports to pl_bolts/__init__.py (#430)

* Add missing imports

* Add missing imports

* Apply isort

* Fix CIFAR num_samples (#432)

* Add static type checker mypy to the tests and pre-commit hooks (#433)

* Add mypy check to GitHub Actions

* Run mypy on pl_bolts only

* Add mypy check to pre-commit

* Add an empty line at the end of files

* Update mypy config

* Update mypy config

* Update mypy config

* show

Co-authored-by: Jirka Borovec <[email protected]>

* missing logo

* Add type annotations to pl_bolts/__init__.py (#435)

* Run mypy on pl_bolts only

* Update mypy config

* Add type hints to pl_bolts/__init__.py

* mypy

Co-authored-by: Jirka Borovec <[email protected]>

* skip hanging (#437)

* Option to normalize latent interpolation images (#438)

* add option to normalize latent interpolation images

* linspace

* update

Co-authored-by: ananyahjha93 <[email protected]>

* 0.2.6rc1

* Warnings fix (#449)

* Revert "Merge pull request #1 from ganprad/warnings_fix"

This reverts commit 7c5aaf0.

* Fixes warning related np.integer in SklearnDataModule

Fixes this warning:
```DeprecationWarning: Converting `np.integer` or `np.signedinteger` to a dtype is deprecated. The current result is `np.dtype(np.int_)` which is not strictly correct. Note that the result depends on the system. To ensure stable results use may want to use `np.int64` or `np.int32````

* Refactor datamodules/datasets (#338)

* Remove try: ... except: ...

* Fix experience_source

* Fix imagenet

* Fix kitti

* Fix sklearn

* Fix vocdetection

* Fix typo

* Remove duplicate

* Fix by flake8

* Add optional packages availability vars

* binary_mnist

* Use pl_bolts._SKLEARN_AVAILABLE

* Apply isort

* cifar10

* mnist

* cityscapes

* fashion mnist

* ssl_imagenet

* stl10

* cifar10

* dummy

* fix city

* fix stl10

* fix mnist

* ssl_amdim

* remove unused DataLoader and fix docs

* use from ... import ...

* fix pragma: no cover

* Fix forward reference in annotations

* binmnist

* Same order as imports

* Move vars from __init__ to utils/__init__

* Remove vars from __init__

* Update vars

* Apply isort

* update min requirements - PL 1.1.1 (#448)

* update min requirements

* rc0

* imports

* isort

* flake8

* 1.1.1

* flake8

* docs

* Add missing optional packages to `requirements/*.txt` (#450)

* Import matplotlib at the top

* Add missing optional packages

* Update wandb

* Add mypy to requirements

* update Isort (#457)

* Adding flags to datamodules (#388)

* Adding flags to datamodules

* Finishing up changes

* Fixing syntax error

* More syntax errors

* More

* Adding drop_last flag to sklearn test

* Adding drop_last flag to sklearn test

* Updating doc for reflect drop_last=False

* Adding flags to datamodules

* Finishing up changes

* Fixing syntax error

* More syntax errors

* More

* Adding drop_last flag to sklearn test

* Adding drop_last flag to sklearn test

* Updating doc for reflect drop_last=False

* Cleaning up parameters and docstring

* Fixing syntax error

* Fixing documentation

* Hardcoding shuffle=False for val and test

* Add DCGAN module

* Small fixes

* Remove DataModules

* Update docs

* Update docs

* Update torchvision import

* Import gym as optional package to build docs successfully (#458)

* Import gym as optional package

* Fix import

* Apply isort

* bugfix: batch_size parameter for DataModules remaining (#344)

* bugfix: batch_size for DataModules remaining

* Update sklearn datamodule tests

* Fix default_transforms. Keep internal for every data module

* fix typo on binary_mnist_datamodule

thanks @akihironitta

Co-authored-by: Akihiro Nitta <[email protected]>

Co-authored-by: Akihiro Nitta <[email protected]>

* Option to normalize latent interpolation images (#438)

* add option to normalize latent interpolation images

* linspace

* update

Co-authored-by: ananyahjha93 <[email protected]>

* update min requirements - PL 1.1.1 (#448)

* update min requirements

* rc0

* imports

* isort

* flake8

* 1.1.1

* flake8

* docs

* Apply suggestions from code review

* Apply suggestions from code review

* Add docs

* Use LSUN instead of CIFAR10

* Update TensorboardGenerativeModelImageSampler

* Update docs with lsun

* Update test

* Revert TensorboardGenerativeModelImageSampler changes

* Remove ModelCheckpoint callback and nrow=5 arg

* Apply suggestions from code review

* Fix test_dcgan

* Apply yapf

* Apply suggestions from code review

Co-authored-by: Teddy Koker <[email protected]>
Co-authored-by: Sidhant Sundrani <[email protected]>
Co-authored-by: Akihiro Nitta <[email protected]>
Co-authored-by: Héctor Laria <[email protected]>
Co-authored-by: Bartol Karuza <[email protected]>
Co-authored-by: Happy Sugar Life <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: ananyahjha93 <[email protected]>
Co-authored-by: Pradeep Ganesan <[email protected]>
Co-authored-by: Brian Ko <[email protected]>
Co-authored-by: Christoph Clement <[email protected]>
@Borda Borda added this to the v0.3 milestone Jan 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix fixing issues... model
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replay buffer not cleared after each epoch in reinforce
3 participants