[RLlib] Attention Net prep PR #2: Smaller cleanups. #12449
@@ -133,7 +140,7 @@ def build(self, view_requirements: Dict[str, ViewRequirement]) -> \
 continue
 # OBS are already shifted by -1 (the initial obs starts one ts
 # before all other data columns).
-shift = view_req.shift - \
+shift = view_req.data_rel_pos - \
Renamed this because, in upcoming PRs, it will support not just a single shift (int), but also:
- a list of ints (include not just one timestep in this view, but several)
- a range string, e.g. "-50:-1" (will be used by attention nets and Atari framestacking).
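
The three forms listed above could be normalized into a plain list of timestep offsets. The following is only a minimal sketch: `expand_rel_pos` is a hypothetical helper, not part of RLlib, and it assumes that a range string such as "-50:-1" includes both end points.

```python
from typing import List, Union


def expand_rel_pos(spec: Union[int, List[int], str]) -> List[int]:
    """Normalize a relative-position spec into a list of timestep offsets.

    Hypothetical helper for illustration only (not an RLlib function).
    Assumption: a range string "a:b" is inclusive on both ends.
    """
    if isinstance(spec, int):
        # Single shift, e.g. -1 means "one timestep back".
        return [spec]
    if isinstance(spec, str):
        # Range string, e.g. "-50:-1".
        start_str, end_str = spec.split(":")
        start, end = int(start_str), int(end_str)
        if start > end:
            raise ValueError(f"Invalid range {spec!r}: start must be <= end")
        return list(range(start, end + 1))
    # Already a list of ints: copy it so the caller's list is not mutated.
    return list(spec)


single = expand_rel_pos(-1)
several = expand_rel_pos([-3, -1])
window = expand_rel_pos("-3:-1")
```

Under the inclusive-range assumption, "-50:-1" expands to 50 offsets, which matches the idea of a fixed-length history window.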
@@ -52,17 +52,19 @@ def __init__(self, shift_before: int = 0):
 # each time a (non-initial!) observation is added.
 self.count = 0

-def add_init_obs(self, episode_id: EpisodeID, agent_id: AgentID,
-env_id: EnvID, init_obs: TensorType,
+def add_init_obs(self, episode_id: EpisodeID, agent_index: int,
- agent_id vs. agent_idx was a bug (the parameter is now agent_index: int).
- Added timestep.
@@ -29,7 +29,7 @@ class ViewRequirement:
 def __init__(self,
 data_col: Optional[str] = None,
 space: gym.Space = None,
-shift: Union[int, List[int]] = 0,
+data_rel_pos: Union[int, List[int]] = 0,
Why not keep it as shift? It seems intuitive.
I liked shift, too. The problem is that there will also be an abs_pos soon (see the attention net PRs), so I wanted to distinguish between these two concepts.
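
To illustrate why the two names might need to be kept apart, here is a small sketch. The thread does not define abs_pos, so this assumes it means a fixed index into the episode's data, independent of the current timestep, while a relative position is an offset from the current timestep. Both functions are hypothetical and not RLlib code.

```python
from typing import List


def view_by_rel_pos(column: List[float], t: int, rel_pos: int) -> float:
    """Return the entry `rel_pos` steps away from the current timestep `t`."""
    return column[t + rel_pos]


def view_by_abs_pos(column: List[float], abs_pos: int) -> float:
    """Return the entry at a fixed episode index (assumed meaning of abs_pos)."""
    return column[abs_pos]


rewards = [0.0, 1.0, 2.0, 3.0, 4.0]

# A relative view moves along with the current timestep ...
prev_at_t3 = view_by_rel_pos(rewards, t=3, rel_pos=-1)
prev_at_t4 = view_by_rel_pos(rewards, t=4, rel_pos=-1)

# ... whereas an absolute view always returns the same entry.
first = view_by_abs_pos(rewards, abs_pos=0)
```

A single parameter called shift would not make it obvious which of these two lookups is meant, which is the naming concern raised above.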
 whether to create those new envs in remote processes instead of
 in the current process. This adds overheads, but can make sense
 if your envs are expensive to step/reset (e.g., for StarCraft).
+Use this cautiously, overheads are significant!
👍
The current attention net trajectory view PR (#11729) is too large (>1000 lines added), so I'm moving smaller preparatory and cleanup changes into 3 pre-PRs. This is the second of these. Please review it only after #12447 has been merged.
Checks
- I've run scripts/format.sh to lint the changes in this PR.