Add Gym 0.26 support #780

Closed
wants to merge 201 commits into from
Changes from 80 commits
Commits
ee71299
Remove references to GoalEnv
carlosluis Feb 19, 2022
65343f5
Fix env tests
carlosluis Feb 19, 2022
4d794f3
Fix bug in test creating invalid box space
carlosluis Feb 19, 2022
513ed08
Add classic_control extra packages from gym
carlosluis Feb 21, 2022
861b612
Merge branch 'master' into fix_tests
araffin Feb 21, 2022
2277012
Merge branch 'master' into fix_tests
carlosluis Feb 28, 2022
e5195a0
Change back to gym 0.22 for testing
carlosluis Feb 28, 2022
435f5fb
Fix failing set_env test
carlosluis Feb 28, 2022
f64346a
Fix test failing due to deprecation of env.seed
carlosluis Mar 3, 2022
9efc37b
Merge branch 'master' into fix_tests
carlosluis Mar 7, 2022
daaa84c
Adjust mean reward threshold in failing test
carlosluis Mar 8, 2022
e62edde
Fix her test failing due to rng
carlosluis Mar 8, 2022
9a41c51
Change seed and revert reward threshold to 90
carlosluis Mar 8, 2022
41f260b
Merge branch 'master' into fix_tests
carlosluis Mar 16, 2022
9c73732
Pin gym version
carlosluis Mar 16, 2022
110be78
Make VecEnv compatible with gym seeding change
carlosluis Mar 16, 2022
dc9c645
Revert change to VecEnv reset signature
carlosluis Mar 17, 2022
e1c6e1b
Change subprocenv seed cmd to call reset instead
carlosluis Mar 17, 2022
29bd222
Fix type check
carlosluis Mar 17, 2022
b1730f4
Add backward compat
araffin Mar 17, 2022
3297091
Merge branch 'master' into fix_tests
araffin Mar 25, 2022
00e7946
Add `compat_gym_seed` helper
araffin Mar 25, 2022
ecf02ce
Merge branch 'master' into fix_tests
araffin Mar 28, 2022
f0e1ccb
Merge branch 'master' into fix_tests
carlosluis Apr 11, 2022
a116a1a
Add goal env checks in env_checker
carlosluis Apr 11, 2022
8780919
Add docs on HER requirements for envs
carlosluis Apr 11, 2022
e0db1ed
Merge branch 'master' into fix_tests
carlosluis Apr 11, 2022
ff44a2f
Merge branch 'master' into fix_tests
araffin Apr 13, 2022
c2ab5cd
Capture user warning in test with inverted box space
carlosluis Apr 13, 2022
cb50e9e
Update ale-py version
araffin Apr 18, 2022
de3f086
Fix randint
araffin Apr 19, 2022
1980db2
Allow noop_max to be zero
araffin Apr 19, 2022
85087bc
Merge branch 'master' into fix_tests
araffin Apr 19, 2022
776217a
Merge branch 'master' into fix_tests
araffin Apr 25, 2022
fa40152
Merge branch 'master' into fix_tests
araffin May 8, 2022
90adf8f
Update changelog
araffin May 8, 2022
d0300c8
Merge branch 'master' into fix_tests
araffin May 8, 2022
25136d7
Merge branch 'master' into fix_tests
araffin May 10, 2022
ec9d50f
Merge branch 'master' into fix_tests
araffin May 10, 2022
0f240fa
Update docker image
araffin May 10, 2022
a6766bd
Merge branch 'fix_tests' of github.com:carlosluis/stable-baselines3 i…
araffin May 10, 2022
3087d58
Update doc conda env and dockerfile
araffin May 13, 2022
7629016
Custom envs should not have any warnings
araffin May 13, 2022
7bb643b
Fix test for numpy >= 1.21
araffin May 13, 2022
706f072
Add check for vectorized compute reward
araffin May 13, 2022
af8e51b
Merge branch 'master' into fix_tests
araffin May 29, 2022
77d188f
Bump to gym 0.24
araffin May 29, 2022
68cec40
Fix gym default step docstring
araffin May 29, 2022
0072b77
Test downgrading gym
araffin May 29, 2022
07a85b8
Revert "Test downgrading gym"
araffin May 29, 2022
d755cc6
Fix protobuf error
araffin May 29, 2022
99b91eb
Fix in dependencies
araffin May 29, 2022
1d7da08
Fix protobuf dep
araffin May 29, 2022
f205865
Merge branch 'master' into fix_tests
araffin May 29, 2022
cf7e438
Use newest version of cartpole
araffin May 29, 2022
3d6f28a
Merge branch 'master' into fix_tests
araffin May 31, 2022
626db1d
Update gym
araffin Jun 10, 2022
f864501
Fix warning
araffin Jun 14, 2022
3af1488
Merge branch 'master' into fix_tests
araffin Jun 23, 2022
5fa3cd9
Loosen required scipy version
araffin Jun 23, 2022
e79148b
Scipy no longer needed
araffin Jun 23, 2022
ebf73d6
Merge branch 'master' into fix_tests
araffin Jul 18, 2022
1ad5f78
Try gym 0.25
araffin Jul 18, 2022
3f0b531
Silence warnings from gym
araffin Jul 25, 2022
2fc09f7
Merge branch 'master' into fix_tests
araffin Jul 25, 2022
b27e555
Filter warnings during tests
araffin Jul 25, 2022
c1e01c9
Merge branch 'master' into fix_tests
araffin Jul 30, 2022
78ef1e4
Merge branch 'master' into fix_tests
araffin Aug 22, 2022
c4bf066
Merge branch 'master' into fix_tests
araffin Aug 22, 2022
ed0f79e
Merge branch 'master' into fix_tests
araffin Aug 25, 2022
ee52936
Merge branch 'master' into fix_tests
araffin Sep 7, 2022
5d3b07f
Merge branch 'master' into fix_tests
araffin Sep 16, 2022
99d2155
Merge branch 'master' into fix_tests
araffin Sep 26, 2022
a0d5c79
Merge branch 'master' into fix_tests
qgallouedec Sep 28, 2022
75fd27e
Merge branch 'master' into fix_tests
araffin Sep 30, 2022
8c65748
Update doc
araffin Oct 1, 2022
0874da1
Update requirements
araffin Oct 1, 2022
26ceefc
Add gym 26 compat in vec env
araffin Oct 1, 2022
a8c579a
Fixes in envs and tests for gym 0.26+
araffin Oct 1, 2022
6ed3079
Enforce gym 0.26 api
araffin Oct 1, 2022
0851440
Merge pull request #1 from carlosluis/fix_tests
tlpss Oct 2, 2022
95bb4d6
format
tlpss Oct 2, 2022
c4517f2
Fix formatting
araffin Oct 2, 2022
9ac7592
Fix dependencies
araffin Oct 2, 2022
d2e6873
Fix syntax
araffin Oct 2, 2022
969c1cf
Cleanup doc and warnings
araffin Oct 2, 2022
2fcd072
Faster tests
araffin Oct 2, 2022
dd67a20
Higher budget for HER perf test (revert prev change)
araffin Oct 2, 2022
ae04b20
Merge branch 'master' into fix_tests
araffin Oct 3, 2022
9ae6fa2
Fixes and update doc
araffin Oct 5, 2022
056454b
Fix doc build
araffin Oct 5, 2022
f9bbb29
Merge branch 'master' into fix_tests
araffin Oct 6, 2022
29dffed
Merge branch 'master' into fix_tests
araffin Oct 10, 2022
cf565d9
Merge branch 'master' into fix_tests
araffin Oct 13, 2022
ed191b0
Merge branch 'master' of https://github.com/DLR-RM/stable-baselines3 …
qgallouedec Oct 17, 2022
04aa926
Merge branch 'master' into fix_tests
araffin Oct 24, 2022
734e19f
Merge branch 'master' into fix_tests
araffin Oct 26, 2022
9ad927b
Fix breaking change
araffin Oct 31, 2022
3ca1b73
Fixes for rendering
araffin Oct 31, 2022
0f5374f
Rename variables in monitor
araffin Oct 31, 2022
9cf2c3d
Merge pull request #2 from carlosluis/fix_tests
tlpss Nov 3, 2022
3320e78
update render method for gym 0.26 API
tlpss Nov 3, 2022
1596ea4
update tests and docs to new gym render API
tlpss Nov 3, 2022
b4b911c
Merge branch 'master' into fix_tests
araffin Nov 7, 2022
008fdce
undo removal of render modes metadata check
tlpss Nov 7, 2022
93bd988
set rgb_array as default render mode for gym.make
tlpss Nov 7, 2022
c9a29b9
Merge branch 'master' into fix_tests
araffin Nov 15, 2022
720317f
Merge branch 'master' into fix_tests
araffin Nov 16, 2022
53da2d0
undo changes & raise warning if not 'rgb_array'
tlpss Nov 20, 2022
8675011
Merge branch 'master' into fix_tests
araffin Nov 22, 2022
db81278
Merge branch 'master' into fix_tests
araffin Nov 25, 2022
54fb37e
Fix type check
araffin Nov 28, 2022
96793b9
Merge branch 'fix_tests' into fix_tests
araffin Nov 28, 2022
e30c117
Merge branch 'master' into fix_tests
qgallouedec Nov 28, 2022
07ca271
Remove recursion and fix type checking
araffin Nov 28, 2022
acd0420
Merge branch 'master' into fix_tests
araffin Nov 29, 2022
c0a6a18
Remove hacks for protobuf and gym 0.24
araffin Nov 29, 2022
b954703
Fix type annotations
araffin Nov 29, 2022
870139c
reuse existing render_mode attribute
tlpss Dec 3, 2022
bc5335f
return tiled images for 'human' render mode
tlpss Dec 3, 2022
c718106
Merge branch 'fix_tests' of https://github.com/carlosluis/stable-base…
tlpss Dec 3, 2022
11cf07f
Allow to use opencv for human render, fix typos
araffin Dec 6, 2022
9d91ea3
Merge pull request #4 from tlpss/tlss/fix_tests
araffin Dec 6, 2022
3f75a8a
Add warning when using non-zero start with Discrete (fixes #1197)
araffin Dec 7, 2022
08a4712
Merge branch 'master' into fix_tests
araffin Dec 10, 2022
93c86f2
Merge branch 'master' into fix_tests
araffin Dec 13, 2022
d35cfad
Merge branch 'master' into fix_tests
araffin Dec 18, 2022
6251fdc
Fix type checking
araffin Dec 18, 2022
be998e8
Merge branch 'fix_tests' of github.com:carlosluis/stable-baselines3 i…
araffin Dec 18, 2022
d24d30d
Merge branch 'fix_tests' into tlpss/fix_tests
araffin Dec 18, 2022
6b80c93
Bug fixes and handle more cases
araffin Dec 18, 2022
c09fa74
Throw proper warnings
araffin Dec 18, 2022
480a793
Update test
araffin Dec 18, 2022
e03b885
Fix new metadata name
araffin Dec 18, 2022
e4248df
Ignore numpy warnings
araffin Dec 18, 2022
16bb26b
Merge branch 'fix_tests' into tlpss/fix_tests
araffin Dec 18, 2022
1f8ccbe
Fixes in vec recorder
araffin Dec 18, 2022
f4e978a
Global ignore
araffin Dec 18, 2022
408e9c2
Merge branch 'fix_tests' into tlpss/fix_tests
araffin Dec 18, 2022
f98903a
Filter local warning too
araffin Dec 18, 2022
046c12b
Merge branch 'fix_tests' into tlpss/fix_tests
araffin Dec 18, 2022
595be86
Merge branch 'master' into fix_tests
araffin Dec 19, 2022
29086a5
Monkey patch not needed for gym 26
araffin Dec 19, 2022
254107a
Merge branch 'fix_tests' into tlpss/fix_tests
araffin Dec 19, 2022
6103962
Merge branch 'master' into fix_tests
araffin Dec 20, 2022
fbaa8ac
Merge branch 'fix_tests' into tlss/fix_tests
araffin Dec 20, 2022
20faf11
Merge branch 'master' into fix_tests
araffin Dec 22, 2022
01831b6
Merge branch 'master' into fix_tests
araffin Dec 22, 2022
3780476
Merge branch 'fix_tests' into tlpss/fix_tests
araffin Dec 22, 2022
085b4c8
Merge branch 'fix_tests' of github.com:tlpss/stable-baselines3 into t…
araffin Dec 22, 2022
6a3f45d
Add doc of VecEnv vs Gym API
araffin Dec 23, 2022
c645d49
Add render test
araffin Dec 23, 2022
3cf4d00
Merge pull request #3 from tlpss/fix_tests
araffin Dec 23, 2022
3f6413d
Fix return type
araffin Dec 23, 2022
ff609ad
Update VecEnv vs Gym API doc
araffin Dec 23, 2022
99589ce
Add note in the quickstart section
araffin Jan 2, 2023
cf75cdc
Merge branch 'master' into fix_tests
araffin Jan 2, 2023
3611f2c
Merge branch 'master' into fix_tests
araffin Jan 5, 2023
b428a27
Fix for custom render mode
araffin Jan 5, 2023
d98dc9e
Fix return type
araffin Jan 5, 2023
5c32861
Merge branch 'master' into fix_tests
araffin Jan 11, 2023
6eaecd2
Merge branch 'master' into fix_tests
araffin Jan 16, 2023
cd28666
Fix type checking
araffin Jan 16, 2023
4f796a0
Merge branch 'master' into fix_tests
araffin Jan 23, 2023
669ef02
Merge branch 'master' into fix_tests
araffin Jan 23, 2023
c9430ec
check test env test_buffer
qgallouedec Jan 24, 2023
546928c
skip render check
qgallouedec Jan 24, 2023
8462dbb
check env test_dict_env
qgallouedec Jan 24, 2023
205b987
test_env test_gae
qgallouedec Jan 24, 2023
1946082
check envs in remaining tests
qgallouedec Jan 24, 2023
7460782
Update tests
araffin Jan 25, 2023
0431c7a
Merge pull request #5 from DLR-RM/check_test_env
araffin Jan 25, 2023
e5575d8
Add warning for Discrete action space with non-zero (#1295)
araffin Jan 25, 2023
c951311
Merge branch 'master' into fix_tests
qgallouedec Jan 28, 2023
b787d98
Fix atari annotation
qgallouedec Jan 28, 2023
85bb0d4
ignore get_action_meanings [attr-defined]
qgallouedec Jan 28, 2023
ba5827e
Merge branch 'master' into fix_tests
araffin Feb 6, 2023
afa1c73
Fix mypy issues
araffin Feb 6, 2023
f03fd75
Merge branch 'master' into fix_tests
araffin Feb 11, 2023
40048ba
Merge branch 'master' into fix_tests
araffin Feb 11, 2023
75217fa
Fix undefined info
araffin Feb 13, 2023
cff332c
Merge branch 'fix_tests' of github.com:carlosluis/stable-baselines3 i…
araffin Feb 13, 2023
6da9b39
Merge branch 'master' into fix_tests
araffin Feb 25, 2023
06ad5a8
Rename done to terminated
araffin Feb 25, 2023
93c10cf
Fix pygame dependency for python 3.7
araffin Feb 25, 2023
c5a5d73
Merge branch 'master' into fix_tests
araffin Feb 27, 2023
51ffad2
Merge branch 'master' into fix_tests
araffin Mar 3, 2023
00b9bbb
Merge branch 'master' into fix_tests
araffin Mar 11, 2023
8a149f4
Merge branch 'fix_tests' of github.com:carlosluis/stable-baselines3 i…
araffin Mar 11, 2023
b82cacd
Forks don't have access to private variables
araffin Mar 11, 2023
07d2171
Merge branch 'master' into fix_tests
araffin Mar 12, 2023
d0f5e8a
Fix linter warnings
araffin Mar 12, 2023
d5c79b0
[ci skip] Merge branch 'master' into fix_tests
araffin Mar 14, 2023
e91b436
Merge branch 'master' into fix_tests
araffin Mar 20, 2023
5ec7e39
Merge branch 'fix_tests' of github.com:carlosluis/stable-baselines3 i…
araffin Mar 20, 2023
986e6c0
Fix env checker for GoalEnv
araffin Mar 20, 2023
2be728d
Merge branch 'master' into fix_tests
araffin Mar 29, 2023
331853a
Update env checker (more info) and fix dtype
araffin Mar 29, 2023
68861b6
Use micromamba for Docker
araffin Mar 29, 2023
9f0d5d8
Update dependencies
araffin Mar 29, 2023
6617e6e
Clarify VecEnv doc
araffin Mar 29, 2023
7 changes: 4 additions & 3 deletions .github/ISSUE_TEMPLATE/custom_env.md
Original file line number Diff line number Diff line change
@@ -44,19 +44,20 @@ from stable_baselines3.common.env_checker import check_env
class CustomEnv(gym.Env):

def __init__(self):
super(CustomEnv, self).__init__()
super().__init__()
self.observation_space = gym.spaces.Box(low=-np.inf, high=np.inf, shape=(14,))
self.action_space = gym.spaces.Box(low=-1, high=1, shape=(6,))

def reset(self):
return self.observation_space.sample()
return self.observation_space.sample(), {}

def step(self, action):
obs = self.observation_space.sample()
reward = 1.0
done = False
truncated = False
info = {}
return obs, reward, done, info
return obs, reward, done, truncated, info

env = CustomEnv()
check_env(env)
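
The updated template depends on the Gym 0.26 return shapes: `reset()` returns `(obs, info)` and `step()` returns five values. Below is a minimal, standard-library-only sketch of a shape check for those two calls. `ToyEnv`, `LegacyToyEnv` and `check_return_shapes` are hypothetical names used for illustration; they are not part of Gym or SB3, and this is not how `check_env` is implemented.

```python
import random
from typing import Any, List


class ToyEnv:
    """Hypothetical stand-in for a gym.Env that follows the Gym 0.26 return shapes."""

    def __init__(self, obs_dim: int = 14) -> None:
        self.obs_dim = obs_dim
        self._rng = random.Random(0)

    def _sample_obs(self) -> List[float]:
        return [self._rng.uniform(-1.0, 1.0) for _ in range(self.obs_dim)]

    def reset(self, seed=None, options=None):
        # Gym 0.26: reset() accepts seed/options and returns (obs, info)
        if seed is not None:
            self._rng = random.Random(seed)
        return self._sample_obs(), {}

    def step(self, action: Any):
        # Gym 0.26: step() returns (obs, reward, terminated, truncated, info)
        return self._sample_obs(), 1.0, False, False, {}


class LegacyToyEnv(ToyEnv):
    """Same toy env, but with the pre-0.26 return shapes."""

    def reset(self, seed=None, options=None):
        return self._sample_obs()

    def step(self, action: Any):
        return self._sample_obs(), 1.0, False, {}


def check_return_shapes(env) -> List[str]:
    """Return a list of problems; an empty list means the shapes look right."""
    problems: List[str] = []

    reset_out = env.reset()
    if not (isinstance(reset_out, tuple) and len(reset_out) == 2):
        problems.append("reset() should return (obs, info)")
    elif not isinstance(reset_out[1], dict):
        problems.append("reset() info should be a dict")

    step_out = env.step(None)
    if not (isinstance(step_out, tuple) and len(step_out) == 5):
        problems.append("step() should return (obs, reward, terminated, truncated, info)")
    else:
        _, _, terminated, truncated, info = step_out
        if not (isinstance(terminated, bool) and isinstance(truncated, bool)):
            problems.append("terminated and truncated should be booleans")
        if not isinstance(info, dict):
            problems.append("step() info should be a dict")
    return problems
```

Calling `check_return_shapes(LegacyToyEnv())` reports one problem for `reset()` and one for `step()`, which are the two places a pre-0.26 custom environment typically needs to change.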
2 changes: 1 addition & 1 deletion .gitlab-ci.yml
@@ -1,4 +1,4 @@
image: stablebaselines/stable-baselines3-cpu:1.4.1a0
image: stablebaselines/stable-baselines3-cpu:1.5.1a6

type-check:
script:
5 changes: 4 additions & 1 deletion Dockerfile
@@ -3,6 +3,9 @@ FROM $PARENT_IMAGE
ARG PYTORCH_DEPS=cpuonly
ARG PYTHON_VERSION=3.7

# for tzdata
ENV DEBIAN_FRONTEND="noninteractive" TZ="Europe/Paris"

RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
cmake \
@@ -20,7 +23,7 @@ RUN curl -o ~/miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest
~/miniconda.sh -b -p /opt/conda && \
rm ~/miniconda.sh && \
/opt/conda/bin/conda install -y python=$PYTHON_VERSION numpy pyyaml scipy ipython mkl mkl-include && \
/opt/conda/bin/conda install -y pytorch $PYTORCH_DEPS -c pytorch && \
/opt/conda/bin/conda install -y pytorch=1.11 $PYTORCH_DEPS -c pytorch && \
/opt/conda/bin/conda clean -ya
ENV PATH /opt/conda/bin:$PATH

3 changes: 2 additions & 1 deletion Makefile
@@ -29,7 +29,8 @@ check-codestyle:
commit-checks: format type lint

doc:
cd docs && make html
# Prevent weird error due to protobuf
cd docs && PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp make html

spelling:
cd docs && make spelling
6 changes: 3 additions & 3 deletions README.md
@@ -124,12 +124,12 @@ env = gym.make("CartPole-v1")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)

obs = env.reset()
obs, info = env.reset()
for i in range(1000):
action, _states = model.predict(obs, deterministic=True)
obs, reward, done, info = env.step(action)
obs, reward, done, truncated, info = env.step(action)
env.render()
if done:
if done or truncated:
obs, info = env.reset()

env.close()
6 changes: 3 additions & 3 deletions docs/conda_env.yml
@@ -4,11 +4,11 @@ channels:
- defaults
dependencies:
- cpuonly=1.0=0
- pip=21.1
- pip=22.1.1
- python=3.7
- pytorch=1.11=py3.7_cpu_0
- pytorch=1.11.0=py3.7_cpu_0
- pip:
- gym==0.21
- gym==0.26
- cloudpickle
- opencv-python-headless
- pandas
15 changes: 8 additions & 7 deletions docs/guide/examples.rst
@@ -94,11 +94,12 @@ In the following example, we will train, save and load a DQN model on the Lunar
mean_reward, std_reward = evaluate_policy(model, model.get_env(), n_eval_episodes=10)

# Enjoy trained agent
obs = env.reset()
vec_env = model.get_env()
obs = vec_env.reset()
for i in range(1000):
action, _states = model.predict(obs, deterministic=True)
obs, rewards, dones, info = env.step(action)
env.render()
obs, rewards, dones, info = vec_env.step(action)
vec_env.render()


Multiprocessing: Unleashing the Power of Vectorized Environments
@@ -470,19 +471,19 @@ The parking env is a goal-conditioned continuous control task, in which the vehi
# HER must be loaded with the env
model = SAC.load("her_sac_highway", env=env)

obs = env.reset()
obs, info = env.reset()

# Evaluate the agent
episode_reward = 0
for _ in range(100):
action, _ = model.predict(obs, deterministic=True)
obs, reward, done, info = env.step(action)
obs, reward, done, truncated, info = env.step(action)
env.render()
episode_reward += reward
if done or info.get("is_success", False):
if done or truncated or info.get("is_success", False):
print("Reward:", episode_reward, "Success?", info.get("is_success", False))
episode_reward = 0.0
obs = env.reset()
obs, info = env.reset()


Learning Rate Schedule
20 changes: 13 additions & 7 deletions docs/guide/quickstart.rst
@@ -14,18 +14,24 @@ Here is a quick example of how to train and run A2C on a CartPole environment:

from stable_baselines3 import A2C

env = gym.make('CartPole-v1')
env = gym.make("CartPole-v1")

model = A2C('MlpPolicy', env, verbose=1)
model = A2C("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000)

obs = env.reset()
# Note: Gym 0.26+ reset() returns a tuple (obs, info),
# whereas an SB3 VecEnv only returns an observation
obs, info = env.reset()
for i in range(1000):
action, _state = model.predict(obs, deterministic=True)
obs, reward, done, info = env.step(action)
# Note: Gym 0.26+ step() returns an additional boolean
# "truncated", whereas SB3 stores truncation information
# in info["TimeLimit.truncated"]
obs, reward, done, truncated, info = env.step(action)
env.render()
if done:
obs = env.reset()
# Note: reset is automated in SB3 VecEnv
if done or truncated:
obs, info = env.reset()

.. note::

@@ -40,4 +46,4 @@ the policy is registered:

from stable_baselines3 import A2C

model = A2C('MlpPolicy', 'CartPole-v1').learn(10000)
model = A2C("MlpPolicy", "CartPole-v1").learn(10000)
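
The notes in the quickstart snippet contrast the Gym 0.26 API with the SB3 VecEnv convention: a single `done` flag, truncation recorded in `info["TimeLimit.truncated"]`, and automatic reset. Below is a rough sketch of that translation for one environment, using only the standard library. `CountdownEnv` and `vec_style_step` are hypothetical names, and the `terminal_observation` key is included on the assumption that it matches the VecEnv convention; this is not the SB3 implementation.

```python
from typing import Any, Dict, Tuple


class CountdownEnv:
    """Hypothetical env with Gym 0.26 return shapes; truncates after max_steps."""

    def __init__(self, max_steps: int = 3) -> None:
        self.max_steps = max_steps
        self.t = 0

    def reset(self, seed=None, options=None) -> Tuple[int, Dict[str, Any]]:
        self.t = 0
        return self.t, {}

    def step(self, action: Any):
        self.t += 1
        terminated = False  # this toy task never reaches a terminal state
        truncated = self.t >= self.max_steps  # time limit reached
        return self.t, 1.0, terminated, truncated, {}


def vec_style_step(env, action: Any):
    """Translate a 5-tuple step into a VecEnv-style 4-tuple with auto-reset."""
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
    info = dict(info)  # copy so the env's own dict is not mutated
    # Truncation is kept in info, since the 4-tuple has no separate flag for it
    info["TimeLimit.truncated"] = truncated and not terminated
    if done:
        # Keep the last observation of the episode before resetting
        info["terminal_observation"] = obs
        obs, _reset_info = env.reset()
    return obs, reward, done, info
```

With this convention the caller never calls `reset()` inside the loop: the observation returned on the final step is already the first observation of the next episode.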
7 changes: 6 additions & 1 deletion docs/misc/changelog.rst
@@ -65,6 +65,7 @@ Release 1.6.0 (2022-07-11)

Breaking Changes:
^^^^^^^^^^^^^^^^^
- Switched minimum Gym version to 0.24 (@carlosluis)
- Changed the way policy "aliases" are handled ("MlpPolicy", "CnnPolicy", ...), removing the former
``register_policy`` helper, ``policy_base`` parameter and using ``policy_aliases`` static attributes instead (@Gregwar)
- SB3 now requires PyTorch >= 1.11
@@ -73,6 +74,7 @@ Breaking Changes:

New Features:
^^^^^^^^^^^^^
- ``noop_max`` and ``frame_skip`` are now allowed to be equal to zero when using ``AtariWrapper``

SB3-Contrib
^^^^^^^^^^^
@@ -98,6 +100,7 @@ Deprecations:
Others:
^^^^^^^
- Upgraded to Python 3.7+ syntax using ``pyupgrade``
- Updated docker base image to Ubuntu 20.04 and cuda 11.3
- Removed redundant double-check for nested observations from ``BaseAlgorithm._wrap_env`` (@TibiGG)

Documentation:
@@ -107,6 +110,7 @@ Documentation:
- Added link to PPO ICLR blog post
- Added remark about breaking Markov assumption and timeout handling
- Added doc about MLFlow integration via custom logger (@git-thor)
- Updated tutorials to work with Gym 0.23 (@arjun-kg)
- Updated Huggingface integration doc
- Added copy button for code snippets
- Added doc about EnvPool and Isaac Gym support
@@ -119,7 +123,7 @@ Release 1.5.0 (2022-03-25)

Breaking Changes:
^^^^^^^^^^^^^^^^^
- Switched minimum Gym version to 0.21.0.
- Switched minimum Gym version to 0.21.0

New Features:
^^^^^^^^^^^^^
@@ -1043,5 +1047,6 @@ And all the contributors:
@eleurent @ac-93 @cove9988 @theDebugger811 @hsuehch @Demetrio92 @thomasgubler @IperGiove @ScheiklP
@simoninithomas @armandpl @manuel-delverme @Gautam-J @gianlucadecola @buoyancy99 @caburu @xy9485
@Gregwar @ycheng517 @quantitative-technologies @bcollazo @git-thor @TibiGG @cool-RR @MWeltevrede
@carlosluis @arjun-kg @tlpss
@Melanol @qgallouedec @francescoluciano @jlp-ue @burakdmb @timothe-chaumont @honglu2875
@anand-bala @hughperkins @sidney-tio @AlexPasqua @dominicgkerr @Akhilez @Rocamonde
5 changes: 4 additions & 1 deletion docs/modules/her.rst
@@ -22,7 +22,10 @@ It creates "virtual" transitions by relabeling transitions (changing the desired

.. warning::

HER requires the environment to inherits from `gym.GoalEnv <https://github.com/openai/gym/blob/3394e245727c1ae6851b504a50ba77c73cd4c65b/gym/core.py#L160>`_
HER requires the environment to follow the legacy `gym.GoalEnv interface <https://github.com/openai/gym/blob/3394e245727c1ae6851b504a50ba77c73cd4c65b/gym/core.py#L160>`_.
In short, the ``gym.Env`` must have:
- a vectorized implementation of ``compute_reward()``
- a dictionary observation space with three keys: ``observation``, ``achieved_goal`` and ``desired_goal``
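
The two requirements above can be sketched as follows. This is a dependency-free illustration: real goal environments would typically operate on NumPy arrays, and the sparse reward and its `threshold` parameter are assumptions made for the example. `missing_goal_keys` is a hypothetical helper, not an SB3 function.

```python
import math
from typing import Any, Dict, List, Sequence, Union

REQUIRED_KEYS = ("observation", "achieved_goal", "desired_goal")


def missing_goal_keys(obs: Dict[str, Any]) -> List[str]:
    """Return the required goal-env keys that are absent from a dict observation."""
    return [key for key in REQUIRED_KEYS if key not in obs]


def compute_reward(
    achieved_goal: Sequence,
    desired_goal: Sequence,
    info: Any,
    threshold: float = 0.05,
) -> Union[float, List[float]]:
    """Sparse reward that accepts one goal or a batch of goals.

    A flat sequence is treated as a single goal; a sequence of sequences
    is treated as a batch, and one reward is returned per entry.
    """
    single = not isinstance(achieved_goal[0], (list, tuple))
    achieved = [achieved_goal] if single else achieved_goal
    desired = [desired_goal] if single else desired_goal
    rewards = [
        0.0 if math.dist(a, d) <= threshold else -1.0
        for a, d in zip(achieved, desired)
    ]
    return rewards[0] if single else rewards
```

The batched form matters because HER relabels many stored transitions at once and recomputes their rewards in a single call.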


.. warning::
6 changes: 3 additions & 3 deletions scripts/build_docker.sh
@@ -1,14 +1,14 @@
#!/bin/bash

CPU_PARENT=ubuntu:18.04
GPU_PARENT=nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
CPU_PARENT=ubuntu:20.04
GPU_PARENT=nvidia/cuda:11.3.1-base-ubuntu20.04

TAG=stablebaselines/stable-baselines3
VERSION=$(cat ./stable_baselines3/version.txt)

if [[ ${USE_GPU} == "True" ]]; then
PARENT=${GPU_PARENT}
PYTORCH_DEPS="cudatoolkit=10.1"
PYTORCH_DEPS="cudatoolkit=11.3"
else
PARENT=${CPU_PARENT}
PYTORCH_DEPS="cpuonly"
9 changes: 5 additions & 4 deletions setup.cfg
@@ -10,11 +10,12 @@ filterwarnings =
# Tensorboard warnings
ignore::DeprecationWarning:tensorboard
# Gym warnings
ignore:Parameters to load are deprecated.:DeprecationWarning
Review comment (Member): note: remove or update to gymnasium later

ignore:the imp module is deprecated in favour of importlib:PendingDeprecationWarning
; ignore:Parameters to load are deprecated.:DeprecationWarning
; ignore:the imp module is deprecated in favour of importlib:PendingDeprecationWarning
ignore::UserWarning:gym
ignore:SelectableGroups dict interface is deprecated.:DeprecationWarning
ignore:`np.bool` is a deprecated alias for the builtin `bool`:DeprecationWarning
; ignore:SelectableGroups dict interface is deprecated.:DeprecationWarning
; ignore:`np.bool` is a deprecated alias for the builtin `bool`:DeprecationWarning
ignore:.*step API:DeprecationWarning:gym
markers =
expensive: marks tests as expensive (deselect with '-m "not expensive"')

17 changes: 8 additions & 9 deletions setup.py
@@ -48,13 +48,13 @@
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)

obs = env.reset()
obs, info = env.reset()
for i in range(1000):
action, _states = model.predict(obs, deterministic=True)
obs, reward, done, info = env.step(action)
obs, reward, done, truncated, info = env.step(action)
env.render()
if done:
obs = env.reset()
if done or truncated:
obs, info = env.reset()
```

Or just train a model with a one liner if [the environment is registered in Gym](https://www.gymlibrary.ml/content/environment_creation/) and if [the policy is registered](https://stable-baselines3.readthedocs.io/en/master/guide/custom_policy.html):
@@ -73,7 +73,7 @@
packages=[package for package in find_packages() if package.startswith("stable_baselines3")],
package_data={"stable_baselines3": ["py.typed", "version.txt"]},
install_requires=[
"gym==0.21", # Fixed version due to breaking changes in 0.22
"gym==0.26",
"numpy",
"torch>=1.11",
# For saving models
@@ -100,11 +100,9 @@
"isort>=5.0",
# Reformat
"black",
# For toy text Gym envs
"scipy>=1.4.1",
],
"docs": [
"sphinx",
"sphinx~=4.5.0",
"sphinx-autobuild",
"sphinx-rtd-theme",
# For spelling
@@ -117,8 +115,9 @@
"extra": [
# For render
"opencv-python",
"pygame",
Review comment: pygame is not used directly, but I assume it is listed because some environments from gym(nasium) use pygame for rendering. Nevertheless, I think it is the responsibility of the environment to install all of its dependencies.

# For atari games,
"ale-py==0.7.4",
"ale-py~=0.8.0",
Review comment (Contributor): 0.8.1

Review comment (Member): fixed in the Gymnasium branch (where we rely on shimmy)

"autorom[accept-rom-license]~=0.4.2",
"pillow",
# Tensorboard support
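
The hunks above use two kinds of version specifier: an exact pin (`gym==0.26`) and compatible-release pins (`sphinx~=4.5.0`, `ale-py~=0.8.0`). Below is a simplified sketch of how the two differ. It handles only plain dotted numeric versions and is an illustration of the PEP 440 rules as I understand them, not a replacement for real tooling. The function names are hypothetical.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Parse a plain dotted numeric version such as '0.8.1'."""
    return tuple(int(part) for part in version.split("."))


def pad(release: Tuple[int, ...], length: int) -> Tuple[int, ...]:
    """Right-pad a release tuple with zeros up to the given length."""
    return release + (0,) * (length - len(release))


def matches_exact(version: str, spec: str) -> bool:
    """'==spec': equal after zero-padding, so '0.26' equals '0.26.0'."""
    v, s = parse(version), parse(spec)
    n = max(len(v), len(s))
    return pad(v, n) == pad(s, n)


def matches_compatible(version: str, spec: str) -> bool:
    """'~=spec': at least spec, with every component but the last held fixed."""
    v, s = parse(version), parse(spec)
    prefix = s[:-1]
    n = max(len(v), len(s))
    at_least = pad(v, n) >= pad(s, n)
    same_prefix = pad(v, len(prefix))[: len(prefix)] == prefix
    return at_least and same_prefix
```

Under these rules, the `0.8.1` mentioned in the review thread already satisfies `ale-py~=0.8.0`, while an exact pin such as `==0.26` would not accept a later patch release like `0.26.2`.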
4 changes: 4 additions & 0 deletions stable_baselines3/__init__.py
@@ -1,4 +1,5 @@
import os
import warnings

from stable_baselines3.a2c import A2C
from stable_baselines3.common.utils import get_system_info
@@ -14,6 +15,9 @@
with open(version_file) as file_handler:
__version__ = file_handler.read().strip()

# Silence Gym warnings due to new API
warnings.filterwarnings("ignore", message=r".*step API", module="gym")
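
The filter above ignores warnings whose message matches the regular expression `.*step API`. A minimal sketch of that matching behaviour with the standard `warnings` module follows. The warning texts are made up for the example, and the `module="gym"` argument is omitted because the warnings here are not raised from inside the `gym` package.

```python
import warnings
from typing import List


def collect_unfiltered_messages() -> List[str]:
    """Emit two warnings and return the messages that pass the filter."""
    with warnings.catch_warnings(record=True) as caught:
        # Record everything by default...
        warnings.simplefilter("always")
        # ...except messages matching the regex; later filters take precedence
        warnings.filterwarnings("ignore", message=r".*step API")
        warnings.warn("Core environment is written in old step API", DeprecationWarning)
        warnings.warn("Some unrelated deprecation", DeprecationWarning)
    return [str(item.message) for item in caught]
```

The `message` pattern is matched against the start of the warning text, which is why the leading `.*` is needed to match text in the middle of a message.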


def HER(*args, **kwargs):
raise ImportError(
Expand Down