-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Normalized gym env #125
Normalized gym env #125
Conversation
rllab/envs/normalized_gym_env.py
Outdated
else: | ||
raise NotImplementedError | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add bounds()
, flatten_n_gym_space()
, unflatten_gym_space()
, and unflatten_n_gym_space()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we can move these functions into a new helper class. I think that will be beneficial when porting the other rllab.Envs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree these would be useful in something like rllab.env.utils
rllab/envs/normalized_gym_env.py
Outdated
raise NotImplementedError | ||
|
||
|
||
class NormalizedGymEnv(gym.Env, Serializable): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider inheriting from Wrapper
because this will take care of the render, close, seed, etc. functions.
rllab/envs/normalized_gym_env.py
Outdated
def step(self, action): | ||
if isinstance(self._env.action_space, gym.spaces.Box): | ||
# rescale the action | ||
lb, ub = self._env.action_space.bounds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the bounds can sometimes be (-np.inf, np.inf)
so you shouldn't normalize in this case.
rllab/envs/normalized_gym_env.py
Outdated
|
||
def _apply_normalize_obs(self, obs): | ||
self._update_obs_estimate(obs) | ||
return (flatten_gym_space(obs, self._env.observation_space) - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For completeness, the returned obs
should be "unflattened" again to be consistent with how the env behaved originally. You should probably support unflattening for Discrete
, Box
and Tuple
spaces (given that your flatten function handles these cases).
Just in case this results in a significant computing overhead (benchmarking would help clarify), I would suggest you let the user set a constructor parameter, e.g. flatten_obs
(False
by default), to manually deactivate this costly unflattening.
rllab/envs/normalized_gym_env.py
Outdated
lb, ub = self._env.action_space.bounds | ||
scaled_action = lb + (action + 1.) * 0.5 * (ub - lb) | ||
scaled_action = np.clip(scaled_action, lb, ub) | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Discrete
is also a common action space and should be handled here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Discrete does not need to be scaled.
rllab/envs/normalized_gym_env.py
Outdated
else: | ||
raise NotImplementedError | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree these would be useful in something like rllab.env.utils
tests/test_normalized_gym.py
Outdated
@@ -0,0 +1,30 @@ | |||
import gym | |||
from rllab.envs.normalized_gym_env import NormalizedGymEnv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8: import grouping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree about the utils. I asked @jonashen to add that into his pr since he is doing the refactoring of gym.Env so let me reopen this after his pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all envs will be gym.Envs as of #118, shouldn't this just replace normalized?
Yes, the normalized class will be deleted when pr #129 is done. |
67b9ee5
to
954d496
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make a Github issue to rename/replaced normalized with this once the gym.Env change has posted.
Please reopen this PR against https://github.com/rlworkgroup/garage |
This is basically a rewrite of normalized_env and it conforms to the interface of gym.Env.
See issue #64