Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scheduler: Fix accounting when task ends up in multiple groups #2031

Merged

Conversation

aaronlehmann
Copy link
Collaborator

@aaronlehmann aaronlehmann commented Mar 13, 2017

Tasks are split between task groups based on common specs. This allows
nodes to only be ranked once per group, not once per task.

This logic doesn't work correctly because maps are marshalled in a
random order. Currently, the same task can end up in a multiple groups
(say, if it's updated multiple times, and the marshalling ends up being
different). To make sure we don't try to schedule the same task twice
within the same batch, use a map for unassignedTasks instead of a linked
list.

Note this doesn't fix the brokenness of task spec deduplication based on
marshalling the protobuf. This is a fix for the symptom that can be
backported, and I'm going to replace the marshalling stuff in a
different PR.

cc @aluzzardi @dongluochen

@dongluochen
Copy link
Contributor

LGTM

@codecov
Copy link

codecov bot commented Mar 13, 2017

Codecov Report

Merging #2031 into master will decrease coverage by 0.08%.
The diff coverage is 66.66%.

@@            Coverage Diff             @@
##           master    #2031      +/-   ##
==========================================
- Coverage   53.74%   53.66%   -0.09%     
==========================================
  Files         109      109              
  Lines       19194    19193       -1     
==========================================
- Hits        10316    10300      -16     
- Misses       7631     7640       +9     
- Partials     1247     1253       +6

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2b1b24b...83d3652. Read the comment docs.

Tasks are split between task groups based on common specs. This allows
nodes to only be ranked once per group, not once per task.

This logic doesn't work correctly because maps are marshalled in a
random order. Currently, the same task can end up in a multiple groups
(say, if it's updated multiple times, and the marshalling ends up being
different). To make sure we don't try to schedule the same task twice
within the same batch, use a map for unassignedTasks instead of a linked
list.

Note this doesn't fix the brokenness of task spec deduplication based on
marshalling the protobuf. This is a fix for the symptom that can be
backported, and I'm going to replace the marshalling stuff in a
different PR.

Signed-off-by: Aaron Lehmann <[email protected]>
@aaronlehmann aaronlehmann force-pushed the scheduler-resource-accounting branch from 281c31e to 83d3652 Compare March 13, 2017 22:42
@aaronlehmann
Copy link
Collaborator Author

I've changed this to make unassignedTasks a map instead. I think this is a better fix. It covers a case that wasn't being handled previously (noSuitableNode could overwrite an entry in schedulingDecisions).

@dongluochen
Copy link
Contributor

LGTM

Copy link
Contributor

@nishanttotla nishanttotla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants