Support receiving stimulus_id's in Scheduler.reschedule #6307
Unit Test Results
16 files ±0, 16 suites ±0, 6h 59m 45s ⏱️ -45m 51s
For more details on these failures, see this check.
Results for commit f2f3a39. ± Comparison against base commit 8411c2d.
distributed/scheduler.py
@@ -6697,7 +6697,7 @@ async def get_story(self, keys=()):

     transition_story = story

-    def reschedule(self, key=None, worker=None):
+    def reschedule(self, key=None, worker=None, stimulus_id=None):
Not counting tests, I can see that this method is invoked only
- from the worker, through RPC
- directly, from stealing.py
In both cases a stimulus_id is available. So you could change this to mandatory.
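A minimal sketch of what "mandatory" could look like, assuming a keyword-only parameter with no default. `SketchScheduler` is a hypothetical stand-in written for this comment, not the real `Scheduler` class, and the method body is illustrative only:

```python
class SketchScheduler:
    """Hypothetical stand-in for illustration; not distributed's Scheduler."""

    def __init__(self):
        self.calls = []

    def reschedule(self, key=None, worker=None, *, stimulus_id):
        # stimulus_id has no default, so every caller must pass it by keyword.
        self.calls.append((key, worker, stimulus_id))


scheduler = SketchScheduler()
scheduler.reschedule("x", worker="tcp://w1", stimulus_id="steal-1")

# A caller that forgets stimulus_id fails loudly instead of passing None along.
try:
    scheduler.reschedule("y")
except TypeError as exc:
    missing_error = str(exc)
```

With this shape, a call site that omits the argument raises `TypeError` at the call, which is easier to catch in tests than a `None` propagating through the transition log.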
So the Scheduler.reschedule method is never successful when invoked from the worker, and yet no tests break. What's the impact of having it work again? Could it be that the scheduler-side transition was unneeded to begin with? Or (at the other extreme) would this fix a potential deadlock? I think understanding this is way more important than the fix itself (which is straightforward).
With your latest changes, the two calls from stealing.py won't work anymore - and yet all tests still pass.
Related: #6332
Superseded by #6339
Follow up on #6340
In the log output, e.g. https://github.com/dask/distributed/runs/6348164203?check_suite_focus=true for d7549b0
the following errors are occurring but are silently ignored:
worker_state_machine.RescheduleMsg always sends stimulus_ids, so this should be handled.
pre-commit run --all-files
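The failure mode described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual distributed code: `SketchRescheduleMsg`, `old_handler`, `new_handler`, and `dispatch` are hypothetical names, and it is assumed that message fields reach the handler as keyword arguments and that the resulting error is caught and only recorded.

```python
from dataclasses import asdict, dataclass


@dataclass
class SketchRescheduleMsg:
    """Hypothetical message shape; only the stimulus_id field matters here."""

    key: str
    stimulus_id: str


def old_handler(key=None, worker=None):
    # Signature before the change: no stimulus_id parameter.
    return ("rescheduled", key, None)


def new_handler(key=None, worker=None, stimulus_id=None):
    # Signature after the change: stimulus_id is accepted.
    return ("rescheduled", key, stimulus_id)


def dispatch(handler, msg):
    # Assumption: fields are passed as keyword arguments, and a failure is
    # caught and recorded rather than raised, so it only shows up in logs.
    try:
        return handler(**asdict(msg)), None
    except TypeError as exc:
        return None, str(exc)


msg = SketchRescheduleMsg(key="x", stimulus_id="worker-reschedule-1")
old_result, old_error = dispatch(old_handler, msg)
new_result, new_error = dispatch(new_handler, msg)
```

In this sketch the old signature never reschedules anything, because the unexpected `stimulus_id` keyword makes every call fail before the handler body runs, while the new signature handles the same message.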