Migrate to gymnasium #72
Conversation
@ernestum could you take an initial look at this? There are still some errors to fix, but it seems like most of the stuff is updated now.
Also, @AdamGleave, I'm introducing some API changes here, since now that gym has generic types a lot of our weird workarounds are no longer necessary. In particular, at the ABC top level, I am keeping the inherited observation_space and action_space type declarations instead of creating our own custom private variables; users have to set these attributes in their implementations. I am also migrating the old random number generation to the new interface and adding proper numpy generic typing where appropriate. Also, since Discrete is an int64, the state and action values are now also int64.
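To make the intended pattern concrete, here is a minimal sketch of what an implementation could look like under these changes (the TabularEnv name and its dynamics are hypothetical, for illustration only, not code from this PR): it keeps the inherited, generically typed observation_space and action_space attributes, seeds the new np_random Generator via super().reset(seed=...), and works with int64 states and actions since Discrete samples are int64.

```python
from typing import Optional

import gymnasium as gym
import numpy as np


class TabularEnv(gym.Env[np.int64, np.int64]):
    """Illustrative (hypothetical) env with int64 observations and actions."""

    def __init__(self, n_states: int, n_actions: int):
        # Keep the observation_space / action_space attributes inherited from
        # gym.Env (typed through the generic parameters above) rather than
        # shadowing them with custom private variables.
        self.observation_space = gym.spaces.Discrete(n_states)
        self.action_space = gym.spaces.Discrete(n_actions)
        self._n_states = n_states
        self._state = np.int64(0)

    def reset(self, *, seed: Optional[int] = None, options=None):
        # New RNG interface: super().reset(seed=...) seeds self.np_random
        # (a np.random.Generator), replacing the old seeding helpers.
        super().reset(seed=seed)
        self._state = np.int64(self.np_random.integers(self._n_states))
        return self._state, {}

    def step(self, action: np.int64):
        # Discrete spaces sample int64, so states and actions are int64 too.
        self._state = np.int64(self.np_random.integers(self._n_states))
        reward, terminated, truncated = 0.0, False, False
        return self._state, reward, terminated, truncated, {}
```

The key point is that subclasses simply assign the inherited space attributes and draw all randomness from self.np_random, rather than maintaining parallel private state.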
Also, we seem to be getting a CI error:
I don't think this is related to my changes?
Looks good so far. Thanks for the fast progress on this one!
I don't think the CI error should be due to your changes.
I left a few comments, but it wasn't necessarily 100% comprehensive.
One thing to consider, which I don't have a good answer for right now (would need to spend some more time getting used to the library) -- when we hit the end of a fixed-horizon env, should it be a terminated or a truncated situation?
The first-order approximation (in "regular" RL) is that if you're hitting a time limit, it's truncation; if you're reaching a terminal state, it's termination.
The second-order approximation is that if, in regular RL training, you'd use the value estimate at the final step to complete the discounted return estimate, then that's truncation.
Fixed-horizon envs might have a weird interaction with this. In the end, the most important thing is probably that it's consistent between here and imitation. This might also slightly push it towards using truncated, since there are many environments (like Half-Cheetah, for example) which will apply that semantic automatically.
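As a rough illustration of the distinction being discussed (a minimal sketch; the FixedHorizonEnv name, its 100-step horizon, and its dynamics are assumptions for illustration, not code from this PR), a fixed-horizon env under the Gymnasium step API would report the end of the horizon through the truncated flag rather than terminated:

```python
import gymnasium as gym
import numpy as np


class FixedHorizonEnv(gym.Env):
    """Toy env whose episodes always end after exactly `horizon` steps."""

    def __init__(self, horizon: int = 100):
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        self.action_space = gym.spaces.Discrete(2)
        self.horizon = horizon
        self._t = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._t = 0
        return np.zeros(1, dtype=np.float32), {}

    def step(self, action):
        self._t += 1
        obs = np.zeros(1, dtype=np.float32)
        reward = 1.0
        # No terminal states here: hitting the horizon is a time limit, so it
        # is reported as truncation (as the TimeLimit wrapper does for envs
        # like Half-Cheetah), and a value estimate at the final state would be
        # used to bootstrap the return.
        terminated = False
        truncated = self._t >= self.horizon
        return obs, reward, terminated, truncated, {}
```

Whether seals ultimately treats the fixed horizon as truncation or termination is exactly the open question above; the sketch only shows where that choice surfaces in the step API.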
@Rocamonde what's the current status on this? Planning on having someone take a look at the Gymnasium PR in SB3 next week.
I'm currently quite swamped with other things and haven't been able to look at this. I expect that I will have a bit more time in the future.
Closing this in favor of #73.