
Enable Multiple Preemptions within Cohort in a single Scheduling Cycle #2641

Merged: 2 commits from gabesaba:multiple_preemptions into kubernetes-sigs:main on Jul 19, 2024

Conversation

@gabesaba (Contributor) commented Jul 18, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

When scheduling a workload in Preempt mode, if a conflicting workload (same cohort, overlapping resource flavors) had already been processed in the cycle, we skipped the preemption because our capacity calculations had been invalidated.

In this PR, we introduce less conservative calculations that allow multiple preemptions within a cohort in a single cycle, as long as the preemption targets do not overlap and the workload still fits.

Additionally, we allow a Preempt-mode workload to proceed even if a Fit-mode workload was processed earlier, as long as the workload still fits.

Finally, we improve logging in the old logic to differentiate "No Longer Fits" from "Preemptions Invalidated".
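To illustrate the non-overlap condition, here is a minimal, self-contained sketch. The names (claimed, targetKeys) and the keying scheme are illustrative only, not the actual Kueue implementation:

```go
package main

import "fmt"

// overlaps reports whether any of this entry's chosen preemption targets was
// already claimed by an earlier entry in the same scheduling cycle.
func overlaps(claimed map[string]bool, targetKeys []string) bool {
	for _, k := range targetKeys {
		if claimed[k] {
			return true
		}
	}
	return false
}

func main() {
	claimed := map[string]bool{"eng-alpha/wl-1": true}        // target taken earlier this cycle
	fmt.Println(overlaps(claimed, []string{"eng-beta/wl-2"}))  // false: disjoint targets, preemption may proceed
	fmt.Println(overlaps(claimed, []string{"eng-alpha/wl-1"})) // true: overlap, this preemption is skipped
}
```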

Which issue(s) this PR fixes:

Fixes #2596
Contributes to #1867

Special notes for your reviewer:

See initial round of comments here

Does this PR introduce a user-facing change?

Introduce the MultiplePreemptions flag, which allows more than one
preemption to occur in the same scheduling cycle, even with overlapping
FlavorResources

@k8s-ci-robot added labels: release-note, kind/feature, cncf-cla: yes (Jul 18, 2024)
@k8s-ci-robot added label: size/XL (Jul 18, 2024)
netlify bot commented Jul 18, 2024: Deploy Preview for kubernetes-sigs-kueue canceled. Latest commit: b0c29f0.

@gabesaba force-pushed the multiple_preemptions branch 3 times, most recently from ea4fc2f to 6174900, on Jul 18, 2024 16:27
@alculquicondor (Contributor) left a comment

Just some nits and clarifications

/approve
/hold

// NetUsage returns how much capacity this entry will require from the ClusterQueue/Cohort.
// When a workload is preempting, it subtracts the preempted resources from the resources
// required, as the remaining quota is all we need from the CQ/Cohort.
func (e *entry) NetUsage() resources.FlavorResourceQuantitiesFlat {
@alculquicondor (Contributor):

Suggested change:
-func (e *entry) NetUsage() resources.FlavorResourceQuantitiesFlat {
+func (e *entry) netUsage() resources.FlavorResourceQuantitiesFlat {

@gabesaba (Author):

Done.

usage := maps.Clone(e.assignment.Usage)
for target := range e.preemptionTargets {
for fr, v := range e.preemptionTargets[target].WorkloadInfo.FlavorResourceUsage() {
if _, hasFlavor := usage[fr]; !hasFlavor {
@alculquicondor (Contributor):

Suggested change:
-if _, hasFlavor := usage[fr]; !hasFlavor {
+if _, uses := usage[fr]; !uses {

It cannot be hasFlavor, because the key is the flavor and the resource together.

@gabesaba (Author):

Done.
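Piecing together the two fragments quoted above, here is a self-contained sketch of the net-usage computation as a toy. The FlavorResource type is a stand-in for Kueue's real key type, and the clamp at zero is an assumption, not quoted from the PR:

```go
package main

import (
	"fmt"
	"maps"
)

// FlavorResource stands in for Kueue's (flavor, resource) key type;
// an illustrative simplification, not the real definition.
type FlavorResource struct{ Flavor, Resource string }

// netUsage starts from the entry's requested usage and discounts what its
// preemption targets free up, since the remaining quota is all the entry
// still needs from the CQ/Cohort.
func netUsage(requested map[FlavorResource]int64, targets []map[FlavorResource]int64) map[FlavorResource]int64 {
	usage := maps.Clone(requested)
	for _, target := range targets {
		for fr, v := range target {
			if _, uses := usage[fr]; !uses {
				continue // entry doesn't request this flavor/resource; nothing to discount
			}
			usage[fr] = max(0, usage[fr]-v)
		}
	}
	return usage
}

func main() {
	cpu := FlavorResource{Flavor: "default", Resource: "cpu"}
	requested := map[FlavorResource]int64{cpu: 3000}                         // 3 CPUs, in millis
	freed := map[FlavorResource]int64{cpu: 2000}                             // a target frees 2 CPUs
	fmt.Println(netUsage(requested, []map[FlavorResource]int64{freed})[cpu]) // 1000
}
```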

pkg/scheduler/scheduler_test.go (thread resolved)
// we disable this test for MultiplePreemption
// logic, as the new logic considers this
// unschedulable, while the old logic considers
// it skipped.
@alculquicondor (Contributor):

So what changes is that eng-gamma would not be in wantLeft? That's fine, but I wonder whether we care about duplicating the test case for it.
Maybe it's not worth it; instead we can update this test when we remove the Legacy mode.

@gabesaba (Author):

Exactly, but rather it would be in wantInadmissibleLeft. I added a duplicate test case, with that minor change, to cover the new logic.

As a follow-up PR, we could update the legacy logic to treat this case as inadmissible as well.

*utiltesting.MakeWorkload("a1", "eng-alpha").
Priority(0).
Queue("other").
Request(corev1.ResourceCPU, "1500m").
@alculquicondor (Contributor):

Suggested change:
-Request(corev1.ResourceCPU, "1500m").
+Request(corev1.ResourceCPU, "1.5").

@gabesaba (Author):

Changed to 2 and removed the comment, as the test case will now fail in the old code after #2646.
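Side note on the two spellings in that suggestion: in the Kubernetes resource API, "1500m" and "1.5" parse to the same quantity, so the rename was purely cosmetic. A quick standalone check using standard apimachinery (not part of this PR's diff):

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	a := resource.MustParse("1500m") // milli notation
	b := resource.MustParse("1.5")   // decimal notation
	// Both represent 1.5 CPUs; Cmp returns 0 when the quantities are equal.
	fmt.Println(a.Cmp(b) == 0, a.MilliValue(), b.MilliValue()) // true 1500 1500
}
```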

Priority(0).
Queue("other").
Request(corev1.ResourceCPU, "1500m").
ReserveQuota(utiltesting.MakeAdmission("other-alpha").Assignment(corev1.ResourceCPU, "default", "1500m").Obj()).
@alculquicondor (Contributor):

You can also use SimpleReserveQuota for less verbosity.

@gabesaba (Author):

Ack, for the future.

I will not update existing test cases (if we do, we can do it all at once, when trying to reduce verbosity).

// multiple workloads are able to issue preemptions on workloads within
// their own CQs in a single scheduling cycle.
//
// Note: this test case passes for legacy logic when
@alculquicondor (Contributor):

Interesting... how does it overcome this check? (mode == flavorassigner.Preempt && cycleCohortsSkipPreemption.Has(cq.Cohort.Name))?

@gabesaba (Author) commented Jul 19, 2024:

This line: when usage is equal to nominal, we set our cycleCohortsUsage to 0. We check that cycleCohortsUsage is > 0, rather than doing a membership check (a subtle change in https://github.com/kubernetes-sigs/kueue/pull/2622/files#r1684180659, which I'm reverting in #2646).
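A minimal sketch of the distinction being described, with illustrative names (this is not the actual Kueue code):

```go
package main

import "fmt"

type mode int

const (
	fit mode = iota
	preempt
)

// shouldSkipLegacy mirrors the membership-style check quoted above: any
// earlier activity in the cohort this cycle causes a Preempt-mode skip.
func shouldSkipLegacy(m mode, cycleCohorts map[string]bool, cohort string) bool {
	return m == preempt && cycleCohorts[cohort]
}

// shouldSkipByUsage sketches the variant gabesaba describes: per-cohort usage
// recorded this cycle is reset to 0 once usage equals nominal quota, so only
// a strictly positive value causes a skip.
func shouldSkipByUsage(m mode, cycleCohortsUsage map[string]int64, cohort string) bool {
	return m == preempt && cycleCohortsUsage[cohort] > 0
}

func main() {
	fmt.Println(shouldSkipLegacy(preempt, map[string]bool{"other": true}, "other")) // true: cohort was touched
	fmt.Println(shouldSkipByUsage(preempt, map[string]int64{"other": 0}, "other"))  // false: usage was reset to 0
}
```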

Cohort("other").
Preemption(kueue.ClusterQueuePreemption{
WithinClusterQueue: kueue.PreemptionPolicyLowerPriority,
BorrowWithinCohort: &kueue.BorrowWithinCohort{
@alculquicondor (Contributor):

I would remove this test case, as it feels a bit artificial, and I would like to keep the flexibility of changing the algorithm without having to worry about breaking this case.

@@ -2762,7 +3349,7 @@ func TestResourcesToReserve(t *testing.T) {
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
@alculquicondor (Contributor):

It doesn't depend on the gate, so undo this to avoid increasing the tests' running time. I know it might be negligible, but we might forget about the possibility of optimizing this one if we ever need to.

@k8s-ci-robot added labels: do-not-merge/hold, approved (Jul 18, 2024)
// While requiring the same shared FlavorResource (Default, cpu),
// multiple workloads are able to issue preemptions on workloads within
// their own CQs in a single scheduling cycle.
multiplePreemptions: MultiplePremptions,
@alculquicondor (Contributor):

can't this be left for both?

@gabesaba (Author):

Nope, the previous logic would trigger the skip-preemption path, as it counted overlapping FlavorResources.

@gabesaba (Author):

we added a test case to cover this for the previous logic in #2646

@alculquicondor (Contributor):

oh, true, true, this is what we are trying to fix lol

@mimowo (Contributor) commented Jul 19, 2024

second commit b0c29f0 lgtm

@alculquicondor (Contributor) left a comment

/lgtm
/approve

@k8s-ci-robot added the lgtm label (Jul 19, 2024)
@k8s-ci-robot:

LGTM label has been added.

Git tree hash: 2fb9e195d49be81b6cb1e1d6c3bae301f679e08d

@k8s-ci-robot:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, gabesaba

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@alculquicondor (Contributor):

/hold cancel

@k8s-ci-robot removed the do-not-merge/hold label (Jul 19, 2024)
@k8s-ci-robot merged commit 06d49d7 into kubernetes-sigs:main on Jul 19, 2024 (16 checks passed)
@k8s-ci-robot added this to the v0.8 milestone (Jul 19, 2024)
@gabesaba deleted the multiple_preemptions branch on Jul 19, 2024 at 14:52
kannon92 pushed a commit to openshift-kannon92/kubernetes-sigs-kueue referencing this pull request on Nov 19, 2024 (kubernetes-sigs#2641):

* Multiple Preemptions
* Improve logging when skipping workload
Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
kind/feature: Categorizes issue or PR as related to a new feature.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
release-note: Denotes a PR that will be considered when it comes time to generate release notes.
size/XL: Denotes a PR that changes 500-999 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Can we preempt in more than one CQ per cohort in a cycle? (#2596)
4 participants