Fix preemption within cohort when there is no borrowingLimit #1561

Merged

mimowo
Contributor

@mimowo mimowo commented Jan 9, 2024

What type of PR is this?

/kind bug

What this PR does / why we need it:

Fixes the handling of preemption within a cohort when a ClusterQueue has no borrowingLimit. In that case,
during preemption, the resources available through borrowing were calculated as if borrowingLimit=0, instead of unlimited.
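
To illustrate the distinction, here is a minimal Go sketch, assuming a simplified single-resource model. The `quota` type and both helper functions are hypothetical and are not taken from the Kueue code base; the only point is that an unset limit (a nil pointer) and a limit of 0 must lead to different caps on usage.

```go
package main

import "fmt"

// quota is a hypothetical, simplified stand-in for one resource of a ClusterQueue.
type quota struct {
	Nominal        int64
	BorrowingLimit *int64 // nil means no borrowingLimit was set
}

// maxUsageNilAsZero reproduces the behaviour described above:
// a missing borrowingLimit is treated as borrowingLimit=0.
func maxUsageNilAsZero(q quota) int64 {
	var limit int64
	if q.BorrowingLimit != nil {
		limit = *q.BorrowingLimit
	}
	return q.Nominal + limit
}

// maxUsageNilAsUnlimited treats a missing borrowingLimit as unlimited,
// so only the total capacity of the cohort bounds the usage.
func maxUsageNilAsUnlimited(q quota, cohortTotal int64) int64 {
	if q.BorrowingLimit == nil {
		return cohortTotal
	}
	limit := q.Nominal + *q.BorrowingLimit
	if limit > cohortTotal {
		return cohortTotal
	}
	return limit
}

func main() {
	q := quota{Nominal: 4} // no borrowingLimit
	fmt.Println(maxUsageNilAsZero(q))          // 4
	fmt.Println(maxUsageNilAsUnlimited(q, 12)) // 12
}
```

With an explicit limit set, both helpers agree; they differ only when the limit is absent.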

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fix handling of preemption within a cohort when there is no borrowingLimit. In that case,
during preemption, the permitted resources to borrow were calculated as if borrowingLimit=0, instead of unlimited.

As a consequence, when using `reclaimWithinCohort`, it was possible that a workload, scheduled to a ClusterQueue with no borrowingLimit, would preempt more workloads than needed, even though it could fit by borrowing.
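
The effect can be sketched with a toy model in Go. Everything here is an assumption made for illustration: the `workload` and `state` types, the candidate list and its ordering, and the greedy "remove until it fits, then restore what was not needed" loop are a simplification and not Kueue's scheduler code. Queue A has a nominal quota of 4 and no borrowingLimit, the cohort has 8 units in total, and an incoming workload in A requests 3 units.

```go
package main

import "fmt"

// workload is a hypothetical running workload using a single resource.
type workload struct {
	Name, Queue string
	Size        int64
}

// state is a toy usage snapshot of one cohort.
type state struct {
	Usage       map[string]int64
	CohortTotal int64
}

func newState() state {
	return state{Usage: map[string]int64{"A": 2, "B": 6}, CohortTotal: 8}
}

// candidates is an assumed preemption order: workloads of the borrowing queue B first.
var candidates = []workload{{"b1", "B", 3}, {"b2", "B", 3}, {"a1", "A", 2}}

func (s state) cohortUsage() int64 {
	var sum int64
	for _, u := range s.Usage {
		sum += u
	}
	return sum
}

// fits checks the cohort capacity and the queue's own cap. limit is the
// borrowingLimit (nil if unset); nilAsZero reproduces the bug described above.
func fits(s state, queue string, nominal, request int64, limit *int64, nilAsZero bool) bool {
	if s.cohortUsage()+request > s.CohortTotal {
		return false
	}
	if limit == nil && !nilAsZero {
		return true // unlimited borrowing: only the cohort bounds usage
	}
	var l int64
	if limit != nil {
		l = *limit
	}
	return s.Usage[queue]+request <= nominal+l
}

// preempt removes candidates in order until the request fits, then restores,
// newest first, every earlier candidate whose removal turns out to be unnecessary.
func preempt(s state, queue string, nominal, request int64, limit *int64, nilAsZero bool) []string {
	if fits(s, queue, nominal, request, limit, nilAsZero) {
		return nil
	}
	var removed []workload
	found := false
	for _, c := range candidates {
		s.Usage[c.Queue] -= c.Size
		removed = append(removed, c)
		if fits(s, queue, nominal, request, limit, nilAsZero) {
			found = true
			break
		}
	}
	if !found {
		return nil // the request cannot fit even after removing every candidate
	}
	targets := []string{removed[len(removed)-1].Name}
	for i := len(removed) - 2; i >= 0; i-- {
		c := removed[i]
		s.Usage[c.Queue] += c.Size
		if !fits(s, queue, nominal, request, limit, nilAsZero) {
			s.Usage[c.Queue] -= c.Size // still needed as a preemption target
			targets = append(targets, c.Name)
		}
	}
	return targets
}

func main() {
	fmt.Println(preempt(newState(), "A", 4, 3, nil, false)) // [b1]
	fmt.Println(preempt(newState(), "A", 4, 3, nil, true))  // [a1 b1]
}
```

In this model, treating the missing limit as unlimited preempts only `b1`, after which the request fits by borrowing one unit. Treating it as 0 forces usage in A under its nominal quota, so `a1` is preempted as well.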

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. labels Jan 9, 2024
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 9, 2024

netlify bot commented Jan 9, 2024

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit a550f53
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/659d8aa02357770008a83d18

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 9, 2024
@mimowo mimowo force-pushed the borrowing-while-preemption-fix branch from 9785bbe to 9776fa4 Compare January 9, 2024 08:15
@mimowo
Contributor Author

mimowo commented Jan 9, 2024

/hold
To investigate a potential bug

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 9, 2024
@mimowo mimowo force-pushed the borrowing-while-preemption-fix branch from 9776fa4 to 24bdf54 Compare January 9, 2024 12:03
@mimowo
Contributor Author

mimowo commented Jan 9, 2024

/hold cancel
Should be fixed

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 9, 2024
@mimowo mimowo force-pushed the borrowing-while-preemption-fix branch from 24bdf54 to 0fab512 Compare January 9, 2024 12:07
@mimowo
Contributor Author

mimowo commented Jan 9, 2024

@yaroslava-serdiuk
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 9, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 27ef62c50e345742b36da1faeb527364b7e42818

Contributor Author

@mimowo mimowo left a comment

/assign @alculquicondor

@alculquicondor
Contributor

/release-note-edit

NONE

Because this is a fix for an unreleased feature.

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Jan 9, 2024
@alculquicondor
Contributor

My bad, this is about an existing feature

/release-note-edit

Fix handling of preemption within a cohort when there is no borrowingLimit. In that case
during preemption the available resources coming from borrowing were calculated as if borrowingLimit=0.

As a consequence, when using `borrowWithinCohort`, it was possible that a workload submitted to a ClusterQueue with no borrowingLimit would not be scheduled, even though preemption while borrowing could let it schedule.

Also, when using `reclaimWithinCohort`, it was possible that a workload, scheduled to a ClusterQueue with no borrowingLimit, would preempt more workloads than needed, even though it could fit by borrowing.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jan 9, 2024
Contributor

@alculquicondor alculquicondor left a comment

We'll need to make a separate cherry-pick for 0.5

pkg/scheduler/preemption/preemption_test.go (review thread outdated, resolved)
pkg/scheduler/preemption/preemption.go (review thread resolved)
@mimowo mimowo force-pushed the borrowing-while-preemption-fix branch from 0fab512 to fce9ac3 Compare January 9, 2024 17:29
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 9, 2024
@mimowo
Contributor Author

mimowo commented Jan 9, 2024

> My bad, this is about an existing feature

Indeed, this is a bug that affects both the existing feature and the new one. I can clean up the release note to only mention the existing feature.

@mimowo mimowo force-pushed the borrowing-while-preemption-fix branch from b936787 to a550f53 Compare January 9, 2024 18:04
@alculquicondor
Contributor

/release-note-edit

Fix handling of preemption within a cohort when there is no borrowingLimit. In that case,
during preemption, the permitted resources to borrow were calculated as if borrowingLimit=0, instead of unlimited.

As a consequence, when using `reclaimWithinCohort`, it was possible that a workload, scheduled to a ClusterQueue with no borrowingLimit, would preempt more workloads than needed, even though it could fit by borrowing.

Contributor

@alculquicondor alculquicondor left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 9, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 20e4cbe3c067b7a7df474769606211a577266ea9

@alculquicondor
Contributor

Don't forget to create a separate PR for 0.5

@alculquicondor
Contributor

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, mimowo


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 9, 2024
@k8s-ci-robot k8s-ci-robot merged commit a7ad78d into kubernetes-sigs:main Jan 9, 2024
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.6 milestone Jan 9, 2024
k8s-ci-robot pushed a commit that referenced this pull request Jan 10, 2024
…en no borrowingLimit (#1564)

* Fix the borrowing while preemption when no borrowingLimit

* improve readability
@mimowo mimowo deleted the borrowing-while-preemption-fix branch February 10, 2024 11:48
kannon92 pushed a commit to openshift-kannon92/kubernetes-sigs-kueue that referenced this pull request Nov 19, 2024
…tes-sigs#1561)

* Fix the borrowing while preemption when no borrowingLimit

* improve readability
Labels
approved (Indicates a PR has been approved by an approver from all required OWNERS files.)
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
kind/bug (Categorizes issue or PR as related to a bug.)
lgtm ("Looks good to me", indicates that a PR is ready to be merged.)
release-note (Denotes a PR that will be considered when it comes time to generate release notes.)
size/M (Denotes a PR that changes 30-99 lines, ignoring generated files.)
4 participants