Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to gymnasium #72

Closed
wants to merge 1 commit into from
Closed

Migrate to gymnasium #72

wants to merge 1 commit into from

Conversation

Rocamonde
Copy link
Member

No description provided.

@Rocamonde
Copy link
Member Author

@ernestum could you take an initial look at this? There's still some errors to fix but seems like most of the stuff is updated now.

@Rocamonde
Copy link
Member Author

Rocamonde commented Jul 1, 2023

Also @AdamGleave introducing some API changes here, since now that gym has generic types a lot of our weird workarounds are no longer necessary. In particular, at the ABC top-level, I am keeping inheritance of observation_space and action_space type declaration instead of creating our custom private variable etc. Users have to create these replacements on implementation. I am also migrating the old random number generation to the new interface and adding proper generic typing to numpy where appropriate.

Also, since Discrete is an int64, the state and action values now are also int64.

@Rocamonde
Copy link
Member Author

Rocamonde commented Jul 1, 2023

Also we seem to be getting a CI error:

Either git or ssh (required by git to clone through SSH) is not installed in the image. 
Falling back to CircleCI's native git client but the behavior may be different from official git. 
If this is an issue, please use an image that has official git and ssh installed.

Don't think this is related to my changes?

Copy link
Collaborator

@ernestum ernestum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far. Thanks for the fast progress on this one!

src/seals/base_envs.py Show resolved Hide resolved
src/seals/base_envs.py Show resolved Hide resolved
src/seals/base_envs.py Show resolved Hide resolved
src/seals/base_envs.py Show resolved Hide resolved
src/seals/classic_control.py Show resolved Hide resolved
src/seals/util.py Show resolved Hide resolved
@ernestum
Copy link
Collaborator

ernestum commented Jul 3, 2023

I don't think the CI error should be due to your changes.

Copy link

@RedTachyon RedTachyon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments, but it wasn't necessarily 100% comprehensive

One thing to consider, which I don't have a good answer for right now (would need to spend some more time getting used to the library) -- when we hit the end of a fixed-horizon env, should it be a terminated or truncated situation?

The first-order approximation (in "regular" RL) is that if you're hitting a time limit, it's truncation; if you're reaching a terminal state, it's termination.

The second-order approximation is that if, in regular RL training, you'd use the value estimation in the final step to get the full discounted reward estimate, then that's truncation.

Fixed-horizon envs might have a weird interaction with this. In the end, the most important thing is probably that it's consistent between here and imitation. This might also slightly push it towards using truncated, since there are many environments (like Half-Cheetah, for example) which will use that semantic automatically

src/seals/base_envs.py Show resolved Hide resolved
src/seals/base_envs.py Show resolved Hide resolved
src/seals/base_envs.py Show resolved Hide resolved
src/seals/classic_control.py Show resolved Hide resolved
src/seals/testing/envs.py Show resolved Hide resolved
src/seals/util.py Show resolved Hide resolved
src/seals/util.py Show resolved Hide resolved
src/seals/util.py Show resolved Hide resolved
tests/test_base_env.py Show resolved Hide resolved
tests/test_wrappers.py Show resolved Hide resolved
@AdamGleave
Copy link
Member

@Rocamonde what's the current status on this? Planning on having someone take a look at the Gymnasium PR in SB3 next week.

@Rocamonde
Copy link
Member Author

@Rocamonde what's the current status on this? Planning on having someone take a look at the Gymnasium PR in SB3 next week.

Currently quite swamped with other things and have not been able to look at this. I expect that I will have a bit more time in the future.

tests/test_envs.py Outdated Show resolved Hide resolved
@ernestum
Copy link
Collaborator

Closing this in favor of #73

@ernestum ernestum closed this Aug 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants