Is your feature request related to a problem? Please describe
Hi, I'm currently developing a library of environments for meta-RL research. To avoid reinventing the wheel, I wanted to use the Jumanji interface (I like it better than gymnax, and Jumanji is more actively maintained). However, I've found that with the current interface this is extremely difficult or impossible.
In meta-RL we need to be able to adaptively change the environment parameters or the problem generator parameters, and we need to do it from outside the environment: if sampling happens inside the environment, we lose the ability to implement training curricula other than the ones hardcoded by default. Therefore, it must be possible to pass these parameters when resetting the environment. Gymnax does something similar.
Describe the solution you'd like
It seems to me that it would be enough to change the reset interface to:
where EnvOptions is an arbitrary jit-compatible dataclass. The step method can be left as it is, because these options can be stored in State and there is no need to pass them explicitly any further. The only important thing is being able to change them on reset. Gymnasium actually does this too (though for different reasons, I guess...)
For now I plan to add this argument in a subclass for my environments, but this will break compatibility with, for example, Jumanji wrappers (and in general).
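To make the proposal concrete, here is a minimal sketch of how an options-aware reset could look. It is a toy environment, not Jumanji's actual Environment class: ToyEnv, the goal field, default_options, and the use of NamedTuple as the jit-compatible container are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional, Tuple

import jax
import jax.numpy as jnp


class EnvOptions(NamedTuple):
    # Hypothetical task parameter. A NamedTuple is a pytree,
    # so it can be passed through jit and vmap.
    goal: jax.Array


class State(NamedTuple):
    key: jax.Array
    position: jax.Array
    options: EnvOptions  # stored in the state, so step() needs no extra argument


class ToyEnv:
    """Toy 1-D environment (illustrative only, not Jumanji's API)."""

    def default_options(self) -> EnvOptions:
        return EnvOptions(goal=jnp.array(1.0))

    def reset(
        self, key: jax.Array, options: Optional[EnvOptions] = None
    ) -> Tuple[State, jax.Array]:
        # Fall back to defaults so existing callers of reset(key) keep working.
        if options is None:
            options = self.default_options()
        state = State(key=key, position=jnp.array(0.0), options=options)
        return state, state.position

    def step(self, state: State, action: jax.Array):
        position = state.position + action
        # The reward depends on the task parameters carried in the state.
        reward = -jnp.abs(position - state.options.goal)
        return state._replace(position=position), position, reward


env = ToyEnv()
reset_fn = jax.jit(env.reset)
step_fn = jax.jit(env.step)

key = jax.random.PRNGKey(0)
state, obs = reset_fn(key, EnvOptions(goal=jnp.array(3.0)))
state, obs, reward = step_fn(state, jnp.array(1.0))

# A curriculum can be driven from outside by batching the options with vmap.
goals = jnp.array([0.0, 1.0, 2.0])
keys = jax.random.split(key, 3)
batched_state, _ = jax.vmap(env.reset)(keys, EnvOptions(goal=goals))
```

Because the options are ordinary traced inputs, a new task does not trigger recompilation as long as shapes and dtypes stay the same.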
Describe alternatives you've considered
We cannot use the common meta-RL interface env.set_task(task_params), because once the step and reset methods are jitted it will have no effect. We also cannot pass the parameters at initialization, because the base Environment class is not jit-compatible and should be created once, outside the jitted region.
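The set_task problem can be reproduced with a small sketch. MutableTaskEnv below is a hypothetical class written only to show the behaviour: a plain Python attribute read inside a jitted method is baked in as a constant at trace time, so mutating it later is not seen by the compiled function.

```python
import jax
import jax.numpy as jnp


class MutableTaskEnv:
    """Hypothetical env using the set_task pattern (illustrative only)."""

    def __init__(self):
        self.goal = 1.0  # plain Python attribute, not a traced value

    def set_task(self, goal):
        self.goal = goal

    def reset(self, key):
        # self.goal is read while tracing and embedded as a constant.
        return jnp.asarray(self.goal)


env = MutableTaskEnv()
reset_fn = jax.jit(env.reset)
key = jax.random.PRNGKey(0)

first = reset_fn(key)   # traces and compiles with goal = 1.0
env.set_task(5.0)       # mutates the Python object only
second = reset_fn(key)  # reuses the cached computation, goal is still 1.0
```

Passing the task parameters as an explicit argument to reset avoids this, since arguments are traced rather than captured.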
Misc
Check for duplicate requests.
Howuhh changed the title from "Jumanji is not suitable to meta-learning, but adding options parameter to reset method could fix this" to "Jumanji is not suitable to meta-learning, but adding options parameter to reset method can fix this" on Oct 20, 2023
Hey @Howuhh really sorry for only getting back to you literally a year later. Really great work on xland minigrid, it's an awesome benchmark!
I'd be open to this although as you can probably tell (seeing as we took a year to reply) time is unfortunately a bit limited. Is this the solution you've settled on - passing some optional pytree to reset?
Importantly, would this still benefit you? I assume you're not using Jumanji wrappers etc. Or is it useful to be able to perform meta-learning on some of these envs?