
[RLlib] Attention Net prep PR #2: Smaller cleanups. #12449

Merged
merged 11 commits into from
Dec 1, 2020

Conversation

sven1977
Contributor

@sven1977 sven1977 commented Nov 26, 2020

The current attention net trajectory view PR (#11729) is too large (>1000 lines added).
Therefore, I'm moving smaller preparatory and cleanup changes into three pre-PRs. This is the second of those; please review it only once the first one (#12447) has been merged.

Why are these changes needed?

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@sven1977 sven1977 changed the title [RLlib] Attention Net prep PR #2: Smaller cleanups. [ONHOLD RLlib] Attention Net prep PR #2: Smaller cleanups. Nov 26, 2020
@sven1977 sven1977 mentioned this pull request Nov 26, 2020
@sven1977 sven1977 added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Nov 27, 2020
@sven1977 sven1977 removed the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Nov 27, 2020
@sven1977 sven1977 changed the title [ONHOLD RLlib] Attention Net prep PR #2: Smaller cleanups. [RLlib] Attention Net prep PR #2: Smaller cleanups. Nov 28, 2020
@sven1977 sven1977 requested a review from ericl November 28, 2020 10:42
@@ -133,7 +140,7 @@ def build(self, view_requirements: Dict[str, ViewRequirement]) -> \
continue
# OBS are already shifted by -1 (the initial obs starts one ts
# before all other data columns).
-        shift = view_req.shift - \
+        shift = view_req.data_rel_pos - \
Contributor Author

Renamed this because it will support (in the upcoming PRs) not just a single shift (int), but also:

  • a list of ints (include several timesteps in this view, not just one)
  • a range string, e.g. "-50:-1" (to be used by attention nets and Atari framestacking).
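To illustrate how such a flexible shift spec could be normalized, here is a hedged sketch. The helper name `parse_view_shift` and its exact semantics (an inclusive range string) are assumptions for illustration, not RLlib's actual implementation:

```python
def parse_view_shift(shift):
    """Normalize a view shift spec into a list of ints.

    Accepts an int, a list of ints, or a range string like "-50:-1"
    (assumed inclusive on both ends), as described in the comment above.
    NOTE: hypothetical helper, not the real RLlib API.
    """
    if isinstance(shift, int):
        return [shift]
    if isinstance(shift, str):
        # "-50:-1" -> start=-50, stop=-1; make the stop inclusive.
        start, stop = map(int, shift.split(":"))
        return list(range(start, stop + 1))
    # Already a list of ints.
    return list(shift)
```

With this convention, `"-50:-1"` expands to the 50 timesteps an attention net would attend over, and Atari framestacking could use e.g. `"-3:0"`.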

@@ -52,17 +52,19 @@ def __init__(self, shift_before: int = 0):
# each time a (non-initial!) observation is added.
self.count = 0

-    def add_init_obs(self, episode_id: EpisodeID, agent_id: AgentID,
-                     env_id: EnvID, init_obs: TensorType,
+    def add_init_obs(self, episode_id: EpisodeID, agent_index: int,
Contributor Author

  • Using agent_id here instead of the agent's index was a bug.
  • Added a timestep argument.

@sven1977 sven1977 added tests-ok The tagger certifies test failures are unrelated and assumes personal liability. and removed tests-ok The tagger certifies test failures are unrelated and assumes personal liability. labels Nov 28, 2020
@@ -29,7 +29,7 @@ class ViewRequirement:
def __init__(self,
data_col: Optional[str] = None,
space: gym.Space = None,
-                 shift: Union[int, List[int]] = 0,
+                 data_rel_pos: Union[int, List[int]] = 0,
Contributor

Why not keep it as shift? That seems more intuitive.

Contributor Author

I liked shift, too. The problem is that there will also be an abs_pos soon (see the attention net PRs), so I wanted to distinguish between these two concepts.

@ericl ericl added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Nov 30, 2020
whether to create those new envs in remote processes instead of
in the current process. This adds overheads, but can make sense
if your envs are expensive to step/reset (e.g., for StarCraft).
Use this cautiously; the overheads are significant!
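For context, a hedged sketch of how this docstring's option is set in a trainer config dict. `remote_worker_envs` and `num_envs_per_worker` are real RLlib config keys of that era; the specific values here are illustrative:

```python
# Illustrative RLlib trainer config fragment (values are assumptions).
config = {
    # Number of sub-envs each rollout worker vectorizes over.
    "num_envs_per_worker": 4,
    # Step/reset each sub-env in its own remote process, per the
    # docstring above. Only worthwhile when individual env steps are
    # expensive (e.g., StarCraft); otherwise the IPC overhead dominates.
    "remote_worker_envs": True,
}
```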
Contributor
👍

@sven1977 sven1977 merged commit 3ad9365 into ray-project:master Dec 1, 2020
@sven1977 sven1977 deleted the attention_nets_prep_2 branch March 27, 2021 11:39