fix: requeue pods rejected by Extenders properly #122022
Please note that we're already in Test Freeze for the Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Thu Nov 23 09:57:18 UTC 2023.
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/cc @alculquicondor
I think it's worth cherry-picking this PR into older versions.
/hold
To involve other approvers.
23b9cf5 to 468e2da
/lgtm
LGTM label has been added. Git tree hash: 33b5a3ba2a251a8b9a749a9185978a2e9ae1d34b
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: alculquicondor, sanposhiho. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Well, it's better to cherry-pick this into all supported versions, because this event-based requeueing has existed for a long time.
sgtm
/unhold
Got approval, though it won't be merged yet anyway.
…122022-upstream-release-1.28 Automated cherry pick of #122022: fix: requeue pods rejected by Extenders properly
…122022-upstream-release-1.26 Automated cherry pick of #122022: fix: requeue pods rejected by Extenders properly
…122022-upstream-release-1.27 Automated cherry pick of #122022: fix: requeue pods rejected by Extenders properly
I do not understand this PR. This was merged in 1.26, 1.27, 1.28 and 1.30, but not in 1.29? Why? cc @alculquicondor
It looks like this PR was approved during the 1.29 code freeze; we cherry-picked it into 1.26, 1.27, and 1.28, and then it was merged into 1.30.
…022-upstream-release-1.29 Automated cherry pick of #122022: fix: requeue pods rejected by Extenders properly
What type of PR is this?
/kind bug
What this PR does / why we need it:
Extenders don't support any kind of requeueing feature like EnqueueExtensions in the scheduling framework.
When an Extender filters out some Nodes, we don't set any unschedulable plugins at all.
This means Extenders are completely ignored during the requeueing process.
So, what's happening is:
The latter case is serious because it could leave Pods stuck in the unschedulable pod pool for up to 5 minutes in the worst case.
This PR makes the scheduling queue aware of extenders' failures.
After this PR, when Extenders reject some Nodes and the Pod ends up being unschedulable, the Pod will be requeued from the unschedulable pod pool to activeQ/backoffQ by any kind of cluster event.
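The intended behavior can be sketched with a toy model of the queue. This is only an illustration, not the code in this PR: the `queue` and `pod` types, the `extenderName` marker, the `worthRequeuing` helper, and the event names are all hypothetical, and the real scheduling queue also distinguishes activeQ from backoffQ, which is omitted here. The idea is that a Pod rejected by plugins is requeued only by events those plugins registered interest in, whereas a Pod rejected by an Extender is requeued by any event, because an Extender has no way to declare which events matter to it.

```go
package main

import "fmt"

// extenderName is a hypothetical marker recorded when an Extender rejects Nodes.
const extenderName = "Extender"

// pod is a simplified stand-in for a Pod in the unschedulable pod pool.
type pod struct {
	name string
	// rejectedBy holds the names of the plugins (or the Extender marker)
	// that rejected Nodes for this Pod.
	rejectedBy map[string]bool
}

// queue is a toy model of the scheduling queue (assumption: no backoffQ).
type queue struct {
	// interests maps a plugin name to the cluster events that may make
	// Pods rejected by that plugin schedulable again.
	interests     map[string]map[string]bool
	unschedulable []*pod
	active        []*pod
}

// worthRequeuing reports whether the event could make the Pod schedulable.
func (q *queue) worthRequeuing(p *pod, event string) bool {
	// Extenders cannot register interest in events, so be conservative
	// and requeue on any event.
	if p.rejectedBy[extenderName] {
		return true
	}
	for plugin := range p.rejectedBy {
		if q.interests[plugin][event] {
			return true
		}
	}
	return false
}

// onEvent moves every Pod worth retrying from the unschedulable pool to active.
func (q *queue) onEvent(event string) {
	var remaining []*pod
	for _, p := range q.unschedulable {
		if q.worthRequeuing(p, event) {
			q.active = append(q.active, p)
		} else {
			remaining = append(remaining, p)
		}
	}
	q.unschedulable = remaining
}

func main() {
	q := &queue{
		// Illustrative interest mapping; the event names are made up.
		interests: map[string]map[string]bool{
			"NodeAffinity": {"NodeLabelChange": true},
		},
		unschedulable: []*pod{
			{name: "rejected-by-plugin", rejectedBy: map[string]bool{"NodeAffinity": true}},
			{name: "rejected-by-extender", rejectedBy: map[string]bool{extenderName: true}},
		},
	}

	// An event no plugin cares about still requeues the Extender-rejected Pod.
	q.onEvent("PodDelete")
	for _, p := range q.active {
		fmt.Println("requeued:", p.name)
	}
}
```

Requeueing on every event is deliberately conservative: it may retry a Pod more often than necessary, but it avoids leaving the Pod in the unschedulable pod pool until the periodic flush.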
Which issue(s) this PR fixes:
Fixes #122019
Special notes for your reviewer:
Probably, we have to cherry-pick this PR into past versions?
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: