Enable Multiple Preemptions within Cohort in a single Scheduling Cycle #2641
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
Force-pushed from ea4fc2f to 6174900
Just some nits and clarifications
/approve
/hold
pkg/scheduler/scheduler.go (Outdated)
// NetUsage returns how much capacity this entry will require from the ClusterQueue/Cohort.
// When a workload is preempting, it subtracts the preempted resources from the resources
// required, as the remaining quota is all we need from the CQ/Cohort.
func (e *entry) NetUsage() resources.FlavorResourceQuantitiesFlat {
Suggested change:
- func (e *entry) NetUsage() resources.FlavorResourceQuantitiesFlat {
+ func (e *entry) netUsage() resources.FlavorResourceQuantitiesFlat {
Done.
pkg/scheduler/scheduler.go (Outdated)
usage := maps.Clone(e.assignment.Usage)
for target := range e.preemptionTargets {
    for fr, v := range e.preemptionTargets[target].WorkloadInfo.FlavorResourceUsage() {
        if _, hasFlavor := usage[fr]; !hasFlavor {
Suggested change:
- if _, hasFlavor := usage[fr]; !hasFlavor {
+ if _, uses := usage[fr]; !uses {
It cannot be hasFlavor, because fr is the flavor and resource pair.
Done.
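Putting the two quoted hunks together, here is a standalone sketch of the helper under discussion. The stand-in types, the int64 quantities, and the clamp-to-zero step are assumptions for illustration, not necessarily what the PR merged:

```go
package main

import (
	"fmt"
	"maps"
)

// FlavorResource stands in for Kueue's flavor/resource key type.
type FlavorResource struct{ Flavor, Resource string }

// netUsage starts from the entry's assigned usage and subtracts what its
// preemption targets would free, since that quota becomes available within
// the same cycle; only flavor/resource pairs the entry requests are touched.
func netUsage(assigned map[FlavorResource]int64, targets []map[FlavorResource]int64) map[FlavorResource]int64 {
	usage := maps.Clone(assigned)
	for _, targetUsage := range targets {
		for fr, v := range targetUsage {
			if _, uses := usage[fr]; !uses {
				continue // the entry doesn't request this flavor/resource
			}
			usage[fr] -= v
			if usage[fr] < 0 {
				usage[fr] = 0 // assumed: never request negative quota from the cohort
			}
		}
	}
	return usage
}

func main() {
	cpu := FlavorResource{Flavor: "default", Resource: "cpu"}
	got := netUsage(
		map[FlavorResource]int64{cpu: 4000},     // entry requests 4000m
		[]map[FlavorResource]int64{{cpu: 1500}}, // one target frees 1500m
	)
	fmt.Println(got[cpu]) // 2500: only this much is needed from the CQ/Cohort
}
```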
pkg/scheduler/scheduler_test.go (Outdated)
// we disable this test for MultiplePreemption
// logic, as the new logic considers this
// unschedulable, while the old logic considers
// it skipped.
So what changes is that eng-gamma would not be in wantLeft? That's fine, but I wonder if we care about duplicating the test case for it. Maybe not worth it; instead, we can update this test when we remove the Legacy mode.
Exactly, but rather in wantInadmissibleLeft. I added a duplicate test case, with that minor change, to cover the new logic. As a follow-up PR, we could update the legacy logic to treat this case as inadmissible as well.
pkg/scheduler/scheduler_test.go (Outdated)
*utiltesting.MakeWorkload("a1", "eng-alpha").
    Priority(0).
    Queue("other").
    Request(corev1.ResourceCPU, "1500m").
Suggested change:
- Request(corev1.ResourceCPU, "1500m").
+ Request(corev1.ResourceCPU, "1.5").
Changed to 2, and removed the comment, as the test case will now fail in the old code after #2646.
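For reference, "1500m" and "1.5" parse to the same quantity in the Kubernetes resource API, so the suggestion above is purely cosmetic:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	a := resource.MustParse("1500m") // milli-CPU spelling
	b := resource.MustParse("1.5")   // decimal spelling
	fmt.Println(a.Cmp(b) == 0)       // true: equal quantities
	fmt.Println(a.MilliValue())      // 1500
}
```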
pkg/scheduler/scheduler_test.go (Outdated)
Priority(0).
Queue("other").
Request(corev1.ResourceCPU, "1500m").
ReserveQuota(utiltesting.MakeAdmission("other-alpha").Assignment(corev1.ResourceCPU, "default", "1500m").Obj()).
You can also use SimpleReserveQuota for less verbosity.
Ack, for the future. I will not update existing test cases (if we do, we can do this all at once, when trying to reduce verbosity).
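For illustration, the shorthand would collapse the admission builder roughly as follows. The exact SimpleReserveQuota signature is an assumption here (check pkg/util/testing before relying on it), and `now` is a hypothetical timestamp variable:

```go
// Verbose form, as used in the test case above:
ReserveQuota(utiltesting.MakeAdmission("other-alpha").
    Assignment(corev1.ResourceCPU, "default", "1500m").
    Obj()).

// Assumed shorthand: builds an equivalent single-assignment admission.
SimpleReserveQuota("other-alpha", "default", now).
```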
pkg/scheduler/scheduler_test.go (Outdated)
// multiple workloads are able to issue preemptions on workloads within
// their own CQs in a single scheduling cycle.
//
// Note: this test case passes for legacy logic when
Interesting... how does it overcome this check: (mode == flavorassigner.Preempt && cycleCohortsSkipPreemption.Has(cq.Cohort.Name))?
This line: when usage is equal to nominal, we set our cycleCohortsUsage to 0. We are checking that cycleCohortsUsage is > 0, rather than doing a membership check - a subtle change in https://github.com/kubernetes-sigs/kueue/pull/2622/files#r1684180659 which I'm reverting in #2646.
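A minimal sketch of that subtle difference, with the per-cohort usage simplified to a single int64 (the real map tracks flavor/resource quantities):

```go
package main

import "fmt"

func main() {
	// After the first workload preempts exactly down to nominal quota,
	// the cohort's accumulated cycle usage is recorded as 0.
	cycleCohortsUsage := map[string]int64{"other": 0}

	// Membership check: the cohort is present, so a later Preempt
	// workload in the same cycle would be skipped.
	_, present := cycleCohortsUsage["other"]
	fmt.Println("skip by membership:", present) // true

	// Value check (the change discussed above): usage is not > 0, so the
	// later workload is allowed to attempt its own preemptions.
	fmt.Println("skip by value:", cycleCohortsUsage["other"] > 0) // false
}
```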
pkg/scheduler/scheduler_test.go (Outdated)
Cohort("other"). | ||
Preemption(kueue.ClusterQueuePreemption{ | ||
WithinClusterQueue: kueue.PreemptionPolicyLowerPriority, | ||
BorrowWithinCohort: &kueue.BorrowWithinCohort{ |
I would remove this test case, as it feels a bit artificial, and I would like to keep the flexibility of changing the algorithm without having to worry about breaking this case.
pkg/scheduler/scheduler_test.go (Outdated)
@@ -2762,7 +3349,7 @@ func TestResourcesToReserve(t *testing.T) {
        },
    }
    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
It doesn't depend on the gate, so undo this to avoid increasing the tests' running time. I know it might be negligible, but we might forget about the possibility of optimizing this one if we ever need to optimize.
Force-pushed from 6174900 to ef57f78
Force-pushed from ef57f78 to 44375c1
// While requiring the same shared FlavorResource (Default, cpu),
// multiple workloads are able to issue preemptions on workloads within
// their own CQs in a single scheduling cycle.
multiplePreemptions: MultiplePremptions,
can't this be left for both?
Nope, the previous logic would trigger the skip preemption logic, as it counted overlapping FlavorResource.
we added a test case to cover this for the previous logic in #2646
oh, true, true, this is what we are trying to fix lol
second commit b0c29f0 lgtm
/lgtm
/approve
LGTM label has been added. Git tree hash: 2fb9e195d49be81b6cb1e1d6c3bae301f679e08d
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, gabesaba

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
/hold cancel
Enable Multiple Preemptions within Cohort in a single Scheduling Cycle (kubernetes-sigs#2641)
* Multiple Preemptions
* Improve logging when skipping workload
What type of PR is this?
/kind feature
What this PR does / why we need it:
When scheduling a Preempt workload, if a conflicting workload (same cohort, overlapping resource flavors) was processed before, we would skip this preemption, as our calculations were invalidated. In this PR, we introduce less conservative calculations to allow multiple preemptions within a cohort in one cycle, as long as the preemption targets do not overlap and the workload still fits.

Additionally, we allow a Preempt mode workload to proceed, even if a Fit mode workload was previously processed, as long as the workload still fits.

Finally, we improve logging in the old logic, to differentiate "No Longer Fits" from "Preemptions Invalidated".
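As a rough illustration of the less conservative bookkeeping described above (hypothetical names, and quantities simplified to a single number; the real scheduler tracks per-flavor/resource usage against cohort snapshots):

```go
package main

import "fmt"

// workload models an entry in one scheduling cycle.
type workload struct {
	name    string
	net     int64    // net usage after subtracting its preemption targets
	targets []string // workloads this entry would preempt
}

func main() {
	cohortFree := int64(3)       // capacity available in the cohort
	claimed := map[string]bool{} // preemption targets already claimed this cycle

	for _, w := range []workload{
		{"w1", 2, []string{"victim-a"}},
		{"w2", 2, []string{"victim-a"}}, // overlapping target: skipped
		{"w3", 1, []string{"victim-b"}},
	} {
		overlap := false
		for _, t := range w.targets {
			if claimed[t] {
				overlap = true
			}
		}
		// Admit only if targets don't overlap with earlier preemptions
		// and the net usage still fits in the cohort.
		if overlap || w.net > cohortFree {
			fmt.Println(w.name, "skipped this cycle")
			continue
		}
		for _, t := range w.targets {
			claimed[t] = true
		}
		cohortFree -= w.net
		fmt.Println(w.name, "admitted, preempting", w.targets)
	}
}
```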
Which issue(s) this PR fixes:
Fixes #2596
Contributes to #1867
Special notes for your reviewer:
See initial round of comments here
Does this PR introduce a user-facing change?