Remove superfluous ShuffleSchedulerExtension.barriers
#7389
Unit Test Results
See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.
18 files ±0, 18 suites ±0, 8h 23m 27s ⏱️ +23m 3s
For more details on these failures, see this check.
Results for commit 6fede6c. ± Comparison against base commit 047b082.
♻️ This comment has been updated with latest results.
    if not self._is_barrier_key(key):
        return
    shuffle_id = self.id_from_key(key)
    if shuffle_id not in self.worker_for:
        return
Note: I ran a couple of tests and think we do not suffer any significant performance impact here. However, I would like to ask for caution when it comes to this sort of refactoring in a transition hook. String comparisons and replacements are not cheap, relatively speaking. Transitions are among the hottest for loops we have, and it typically pays off to be a bit cautious here. Everything past these guards is allowed to be slow since we only execute it for barriers, but the guards themselves are evaluated for every task in a potentially very large graph.
This change introduces an overhead of 100-200 ns compared to the earlier version, so I think this is fine. However, this can quickly spiral out of control if the logic in these methods becomes more complex, or if we open the transition to other allowed finished states, etc.
Out of curiosity, could you share the snippet you used to benchmark this change? I ran a very simple test that showed no meaningful difference for the latest commit. (However, 9c2a9c4 had a ~200 ns overhead when I checked runtimes.)
Note: We could inline _is_barrier_key to shave off the function call overhead.
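For illustration, a minimal sketch of what inlining the check could look like. The prefix constant, the class name, and both guard methods are hypothetical stand-ins assumed for this example, not the actual extension code:

```python
BARRIER_PREFIX = "shuffle-barrier-"  # hypothetical prefix, assumed for illustration


class GuardSketch:
    """Hypothetical stand-in for the scheduler extension; not the real class."""

    def _is_barrier_key(self, key: str) -> bool:
        return key.startswith(BARRIER_PREFIX)

    def guard_with_call(self, key: str) -> bool:
        # Delegates to a helper, so every transition pays for an extra method call.
        if not self._is_barrier_key(key):
            return False
        return True

    def guard_inlined(self, key: str) -> bool:
        # Same check written in place, avoiding the extra method call.
        if not key.startswith(BARRIER_PREFIX):
            return False
        return True
```

Both guards return the same result for any key; the inlined variant only trades a little readability for one fewer call per transition.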
I ran something like this:

    adict = {f"foo-{x}": True for x in range(10)}
    a_long_key = "shuffle-barrier-123566886512341"
    finish = "forgotten"  # not in the original snippet; assumed so both functions run

    def foo_main():
        # Earlier version: state check plus a lookup in a small collection.
        if finish != "forgotten":
            return
        if a_long_key in adict.values():
            return

    def is_barrier():
        return a_long_key.startswith("shuffle-barrier")

    def foo():
        # New version: prefix check, then derive the id from the key.
        if finish != "forgotten":
            return
        if not is_barrier():
            return
        assert a_long_key.startswith("shuffle-barrier")
        key = a_long_key.replace("shuffle-barrier", "")
        if key in adict:
            return
        return
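A rough sketch of how the cases discussed in this thread could be timed separately with the standard-library `timeit` module. The `guard` function, the prefix, and the `known_ids` set are assumptions made for this example (a simplified stand-in for the hook's guards), and the measured numbers include the overhead of the wrapping lambda:

```python
import timeit

BARRIER_PREFIX = "shuffle-barrier-"  # assumed prefix, for illustration only
known_ids = {f"foo-{x}" for x in range(10)}  # stands in for the tracked shuffle ids


def guard(key: str, finish: str) -> bool:
    """Return True only if the slow path past the guards should run."""
    if finish != "forgotten":
        return False
    if not key.startswith(BARRIER_PREFIX):
        return False
    return key[len(BARRIER_PREFIX):] in known_ids


def ns_per_call(key: str, finish: str, number: int = 200_000) -> float:
    # Take the best of several repeats and convert to nanoseconds per call.
    best = min(timeit.repeat(lambda: guard(key, finish), number=number, repeat=5))
    return best / number * 1e9


cases = {
    "not forgotten": ("some-task-123", "memory"),
    "forgotten, not a barrier": ("some-task-123", "forgotten"),
    "forgotten barrier": ("shuffle-barrier-123566886512341", "forgotten"),
}

if __name__ == "__main__":
    for name, (key, finish) in cases.items():
        print(f"{name}: {ns_per_call(key, finish):.0f} ns")
```

Timing each case on its own makes it easier to see which branch a given change slows down, since the common case (a forgotten key that is not a barrier) exits at a different guard than the barrier case.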
Note: We could inline _is_barrier_key to shave off the function call overhead.
Don't worry about it. I merely wanted to ensure some awareness.
Thanks for sharing, the difference in our results makes sense. I had focused on the common case where the key is forgotten but it's not a shuffle barrier, which is coincidentally faster in your example.
I merely wanted to ensure some awareness.
Appreciated!
No related failures on CI
The use of the ShuffleSchedulerExtension.barriers dictionary can be replaced by the use of existing state-tracking collections and class-level key-generation methods.
pre-commit run --all-files
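The idea in the description above can be sketched roughly as follows. This is a simplified illustration, not the actual extension: the class name, the prefix value, and `is_tracked_barrier` are assumptions, while `worker_for`, `id_from_key`, and the prefix check mirror the names that appear in the diff excerpt:

```python
BARRIER_PREFIX = "shuffle-barrier-"  # assumed prefix, for illustration only


class SchedulerExtensionSketch:
    """Hypothetical, simplified stand-in for the scheduler extension."""

    def __init__(self) -> None:
        # Existing state: maps a shuffle id to its worker assignment.
        self.worker_for = {}

    @classmethod
    def barrier_key(cls, shuffle_id: str) -> str:
        # Derive the barrier key from the id instead of storing it in a dict.
        return BARRIER_PREFIX + shuffle_id

    @classmethod
    def id_from_key(cls, key: str) -> str:
        # Inverse of barrier_key; only valid for barrier keys.
        assert key.startswith(BARRIER_PREFIX)
        return key[len(BARRIER_PREFIX):]

    def is_tracked_barrier(self, key: str) -> bool:
        # Cheap prefix guard first, then a lookup in existing state.
        if not key.startswith(BARRIER_PREFIX):
            return False
        return self.id_from_key(key) in self.worker_for
```

Because the key and the id can each be derived from the other, a separate mapping from barrier keys to shuffle ids carries no extra information in this sketch.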