Make RecordEpisodeStatistics work with VectorEnv #2296

Merged

vwxyzjn
Contributor

@vwxyzjn vwxyzjn commented Aug 5, 2021

Following up on #2279, this PR makes RecordEpisodeStatistics work with VectorEnv as well. That is, wrapping each sub-environment individually:

import gym
from gym.vector import SyncVectorEnv

def make_env(gym_id, seed):
    def thunk():
        env = gym.make(gym_id)
        env = gym.wrappers.RecordEpisodeStatistics(env)
        env.seed(seed)
        env.action_space.seed(seed)
        env.observation_space.seed(seed)
        return env
    return thunk

envs = SyncVectorEnv([make_env("CartPole-v1", 1 + i) for i in range(2)])
envs.reset()
for i in range(100):
    _, _, _, infos = envs.step(envs.action_space.sample())
    for info in infos:
        if "episode" in info.keys():
            print(f"{i}, episode_reward={info['episode']['r']}")
            break

will produce the same results as wrapping the vectorized environment as a whole:

import gym
from gym.vector import SyncVectorEnv

def make_env(gym_id, seed):
    def thunk():
        env = gym.make(gym_id)
        env.seed(seed)
        env.action_space.seed(seed)
        env.observation_space.seed(seed)
        return env
    return thunk

envs = SyncVectorEnv([make_env("CartPole-v1", 1 + i) for i in range(2)])
envs = gym.wrappers.RecordEpisodeStatistics(envs)
envs.reset()
for i in range(100):
    _, _, _, infos = envs.step(envs.action_space.sample())
    for info in infos:
        if "episode" in info.keys():
            print(f"{i}, episode_reward={info['episode']['r']}")
            break

This PR matters because some envs don't let you insert RecordEpisodeStatistics inside the make_env function. Procgen is one such example: ProcgenEnv is constructed directly as a vectorized environment. With this PR you can do something like this:

import gym
from procgen import ProcgenEnv

envs = ProcgenEnv(num_envs=2, env_name="coinrun", num_levels=0, start_level=0, distribution_mode='hard')
envs = gym.wrappers.RecordEpisodeStatistics(envs)
envs.reset()
for i in range(100):
    _, _, _, infos = envs.step(envs.action_space.sample())
    for info in infos:
        if "episode" in info.keys():
            print(f"{i}, episode_reward={info['episode']['r']}")
            break
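
For readers who want to see the bookkeeping this implies, here is a minimal sketch of per-environment episode tracking. It is not the code from this PR: `EpisodeStatsTracker` and its `update` method are hypothetical names, and it uses plain lists instead of NumPy arrays so it runs without gym installed. The idea is to keep one running return and length per sub-environment and to reset only the slots whose episode just ended.

```python
from typing import Dict, List, Sequence


class EpisodeStatsTracker:
    """Hypothetical helper: accumulates per-environment returns and lengths."""

    def __init__(self, num_envs: int) -> None:
        self.num_envs = num_envs
        self.episode_returns = [0.0] * num_envs
        self.episode_lengths = [0] * num_envs

    def update(self, rewards: Sequence[float], dones: Sequence[bool]) -> List[Dict]:
        """Record one vectorized step and return one info dict per sub-environment."""
        infos: List[Dict] = [{} for _ in range(self.num_envs)]
        for i, (reward, done) in enumerate(zip(rewards, dones)):
            self.episode_returns[i] += float(reward)
            self.episode_lengths[i] += 1
            if done:
                # Report the finished episode, then reset this slot only.
                infos[i]["episode"] = {
                    "r": self.episode_returns[i],
                    "l": self.episode_lengths[i],
                }
                self.episode_returns[i] = 0.0
                self.episode_lengths[i] = 0
        return infos


tracker = EpisodeStatsTracker(num_envs=2)
tracker.update(rewards=[1.0, 1.0], dones=[False, False])
infos = tracker.update(rewards=[1.0, 1.0], dones=[True, False])
print(infos)  # [{'episode': {'r': 2.0, 'l': 2}}, {}]
```

Only the first sub-environment finished, so only its info dict carries an "episode" entry, while the second keeps accumulating.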

@jkterry1
Collaborator

jkterry1 commented Aug 5, 2021

Could you please add tests before I merge?

@vwxyzjn
Contributor Author

vwxyzjn commented Aug 5, 2021

@jkterry1 Could you approve workflows?

@vwxyzjn
Contributor Author

vwxyzjn commented Aug 5, 2021

Hey @jkterry1, can you approve the workflows again?

self.episode_length = 0
return observation
observations = super(RecordEpisodeStatistics, self).reset(**kwargs)
self.episode_returns = np.zeros(self.num_envs, dtype=np.float32)
Contributor

self.num_envs is not defined here if env is a VectorEnv instance.

Contributor Author

I don’t follow. VectorEnv has a num_envs attribute, right?

Contributor

Oh right, the wrapper inherits the attributes from env. Sorry, my bad!
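
A toy sketch of why `self.num_envs` resolves on the wrapper, assuming the wrapper forwards unknown attribute lookups to the wrapped env through `__getattr__`. `ToyVectorEnv` and `ToyWrapper` are hypothetical names used only for illustration; they are not gym classes.

```python
class ToyVectorEnv:
    """Hypothetical stand-in for a vectorized environment."""

    def __init__(self, num_envs):
        self.num_envs = num_envs


class ToyWrapper:
    """Hypothetical wrapper that forwards unknown attributes to the wrapped env."""

    def __init__(self, env):
        self.env = env

    def __getattr__(self, name):
        # Python calls this only when normal lookup on the wrapper fails.
        if name.startswith("_") or name == "env":
            raise AttributeError(name)
        return getattr(self.env, name)


wrapped = ToyWrapper(ToyVectorEnv(num_envs=4))
print(wrapped.num_envs)  # 4
```

The wrapper never defines `num_envs` itself; the lookup falls through to the wrapped object, which is the behaviour the thread above relies on.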

gym/wrappers/record_episode_statistics.py (outdated)
Comment on lines 27 to 29
@pytest.mark.parametrize("env_id", ["CartPole-v0"])
def test_record_episode_statistics_with_vectorenv(env_id):
    envs = gym.vector.make(env_id, asynchronous=False)
Contributor

With the corresponding imports

from gym.vector import AsyncVectorEnv, SyncVectorEnv
from gym.vector.tests.utils import make_env
Suggested change
-@pytest.mark.parametrize("env_id", ["CartPole-v0"])
-def test_record_episode_statistics_with_vectorenv(env_id):
-    envs = gym.vector.make(env_id, asynchronous=False)
+@pytest.mark.parametrize("klass", [SyncVectorEnv, AsyncVectorEnv])
+@pytest.mark.parametrize("num_envs", [1, 4])
+def test_record_episode_statistics_with_vectorenv(klass, num_envs):
+    env_fns = [make_env("CartPole-v0", i) for i in range(num_envs)]
+    envs = klass(env_fns)

Contributor Author

@vwxyzjn vwxyzjn Aug 5, 2021

Unfortunately it’s gonna fail with AsyncVectorEnv because envs.env.envs[0].spec.max_episode_steps is inaccessible. Maybe I should just hardcode 201 instead of envs.env.envs[0].spec.max_episode_steps? Do we really need the test case with AsyncVectorEnv?

Contributor

Oh sorry I didn't see that you were looping over that later. Then you can ignore this (maybe keeping the parametrization for num_envs?).

Contributor Author

Done. Would you mind allowing the GitHub Actions workflow runs? I have a weird setup that makes it difficult to run test cases locally.

Contributor Author

@vwxyzjn vwxyzjn left a comment

Thank you @tristandeleu so much for the review. I just have one comment regarding the test cases.

@jkterry1 jkterry1 merged commit 1397e70 into openai:master Aug 5, 2021
zlig pushed a commit to zlig/gym that referenced this pull request Sep 6, 2021
* Make RecordEpisodeStatistics work with VectorEnv

* fix test cases

* fix lint

* add test cases

* fix linting

* fix tests

* fix test cases...

* Update gym/wrappers/record_episode_statistics.py

Co-authored-by: Tristan Deleu <[email protected]>

* fix test cases

* fix test cases again

Co-authored-by: Tristan Deleu <[email protected]>