scheduler: Fix accounting when task ends up in multiple groups #2031
Conversation
LGTM
Codecov Report
@@            Coverage Diff             @@
##           master    #2031      +/-   ##
==========================================
- Coverage   53.74%   53.66%   -0.09%
==========================================
  Files         109      109
  Lines       19194    19193       -1
==========================================
- Hits        10316    10300      -16
- Misses       7631     7640       +9
- Partials     1247     1253       +6
Continue to review full report at Codecov.
Tasks are split between task groups based on common specs. This allows nodes to only be ranked once per group, not once per task.

This logic doesn't work correctly because maps are marshalled in a random order. Currently, the same task can end up in multiple groups (say, if it's updated multiple times, and the marshalling ends up being different). To make sure we don't try to schedule the same task twice within the same batch, use a map for unassignedTasks instead of a linked list.

Note this doesn't fix the brokenness of task spec deduplication based on marshalling the protobuf. This is a fix for the symptom that can be backported, and I'm going to replace the marshalling stuff in a different PR.

Signed-off-by: Aaron Lehmann <[email protected]>
Force-pushed from 281c31e to 83d3652
I've changed this to make
LGTM
LGTM
cc @aluzzardi @dongluochen
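The remaining brokenness mentioned above, that spec deduplication keys on the marshalled protobuf, stems from the fact that Go randomizes map iteration order, so any key derived by walking a map-valued field directly can differ between marshallings of the same spec. A hedged sketch of the usual remedy (not the follow-up PR's actual code; `specKey` and the `labels` field are illustrative assumptions) is to impose an explicit order before hashing:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// specKey derives a deterministic dedup key from a map-valued spec
// field by sorting the keys first. Hashing in raw map iteration order
// would produce different keys for identical specs, because Go map
// iteration order is deliberately randomized.
func specKey(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys) // fixed order -> stable key across runs
	h := sha256.New()
	for _, k := range keys {
		fmt.Fprintf(h, "%s=%s;", k, labels[k])
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	labels := map[string]string{"env": "prod", "tier": "web", "zone": "a"}
	// The key is identical regardless of map iteration order.
	fmt.Println(specKey(labels) == specKey(labels))
}
```

Until dedup keys are computed this way (or marshalling is made canonical), the map-based unassignedTasks queue is what guards against the same task being scheduled twice.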