KEP: add non-preempting option to PriorityClasses #901
Conversation
> the scheduler will preempt lower priority pods to schedule this pod,
> as is current behavior.
>
> If NonPreempting is true,
Do we support updating this?
The nice thing about this KEP is that it only affects the scheduling of new pods and not the "evictability" of running pods. Updating this value seems simple to support.
Hmm... if we add a new field in the PodTemplate, it's hard to update already-created pods :) And if we hold the PriorityClass in the scheduler, how about the kubelet?
I'll look more into this.
Looks like we're going with denormalizing the field into PodSpec... so from my understanding, that will present some upgrade challenges.
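For concreteness, here is a minimal sketch (simplified shapes based on this thread, not the real API types; field names are assumptions taken from the diff text) of how the flag might sit on PriorityClass and be denormalized into PodSpec:

```go
// Illustrative sketch only: simplified shapes based on this thread, not
// the real API types. Field names here are assumptions from the diff text.
package v1

// PriorityClass with the proposed flag. Existing fields are elided.
type PriorityClass struct {
	// Value is the integer priority given to pods of this class.
	Value int32
	// NonPreempting, if true, means pending pods of this class will not
	// trigger preemption of lower-priority pods; they simply wait in the
	// scheduling queue. They remain evictable by higher-priority preemptors.
	NonPreempting bool
}

// PodSpec showing the denormalized copy of the flag. Because pod specs are
// largely immutable after creation, this copy is what makes "updating" the
// behavior of already-created pods hard, as noted in the thread above.
type PodSpec struct {
	Priority      *int32
	NonPreempting *bool // resolved from the PriorityClass at admission time
}
```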
If I declare a priority class "foo" and say it is not pre-emptible, is a system-node-critical or system-cluster-critical priority based workload able to pre-empt it? It seems like we would need to always enable those priorities to force pre-emption to maintain a functional cluster.

So I had to read the KEP carefully to get this, but I don't think that's how it works. This new setting controls whether a pod will trigger eviction of lower-priority pods, not whether the pod can itself be evicted. If that's the behavior we decide to implement, we'll need to be extra careful to document it clearly.

@misterikkit -- I reread, and I think this is right. The system-* priorities can pre-empt other workloads, but the "data-science" priority would not pre-empt anything else. If that's the case, that makes sense.

I'll think about rephrasing the description so that's more apparent.
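A minimal sketch of the semantics clarified in this exchange: when deciding whether a pending pod may evict a running victim, only the *preemptor's* flag is consulted, never the victim's. The names here (podInfo, canPreempt) are illustrative, not the scheduler's actual code:

```go
package main

import "fmt"

type podInfo struct {
	name          string
	priority      int32
	nonPreempting bool
}

// canPreempt returns true if preemptor is allowed to evict victim.
// The victim's own nonPreempting flag is irrelevant to this decision.
func canPreempt(preemptor, victim podInfo) bool {
	if preemptor.nonPreempting {
		return false // a non-preempting pod never triggers eviction
	}
	return preemptor.priority > victim.priority
}

func main() {
	nodeCritical := podInfo{"system-node-critical-pod", 2000001000, false}
	dataScience := podInfo{"data-science-pod", 1000, true}
	batch := podInfo{"batch-pod", 100, false}

	fmt.Println(canPreempt(nodeCritical, dataScience)) // true: system-* still preempts it
	fmt.Println(canPreempt(dataScience, batch))        // false: it never evicts anything
}
```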
> ## Graduation Criteria
>
> * Users are reporting that this resolves their workload priority use-cases
So, will this feature be feature-gated? Or, since for the most part it's backward compatible (assuming the preempting field defaults to true), will we keep it without a feature gate?
I don't think we'd need a feature gate. The default behavior is the same as existing behavior.
New fields added to stable APIs get feature-gated for a release, to ensure an HA upgrade can succeed without data loss.
Oh, okay.
@denkensk it would help if you could handle some of the technical questions, as the current owner of kubernetes/kubernetes#67671
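To make the feature-gating requirement above concrete, here is a hedged sketch of the usual pattern: while the gate is off, the apiserver clears the new field on writes, so a mixed-version HA control plane never persists data an older apiserver would silently drop. The gate variable and helper name are hypothetical, not the actual gate or strategy code:

```go
package main

import "fmt"

// Stand-in for a real feature-gate check; the gate name is hypothetical.
var nonPreemptingGateEnabled = false

type podSpec struct {
	NonPreempting *bool
}

// dropDisabledFields mimics the pattern used in apiserver registry
// strategies: clear gated fields before persisting an object.
func dropDisabledFields(spec *podSpec) {
	if !nonPreemptingGateEnabled {
		spec.NonPreempting = nil
	}
}

func main() {
	f := false
	spec := podSpec{NonPreempting: &f}
	dropDisabledFields(&spec)
	fmt.Println(spec.NonPreempting == nil) // true while the gate is off
}
```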
Could you please address the remaining items to get this merged?
Will do.
Thanks, @vllry! It looks good. It would be great if you could also add a user story. You can use the info in my comment to write about usage of this feature for batch and/or other workloads that want faster scheduling without interrupting running pods.
To clarify my comment, you may want to move part of the current "motivation" section to a "user story" section.

Thanks, @bsalamat, that makes sense. Still working on the details about feature gating and pod spec upgrades... but it's starting to look a lot more complete.

I've addressed all the outstanding comments, except for compatibility and changes w/r/t the PodSpec. I'll do more reading this weekend on the implications and how this is normally handled, but suggestions would be appreciated.
/lgtm
This change is backward compatible since the default value of the "preempting" field is true.
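A minimal sketch of that backward-compatibility argument: if the field is unset, defaulting fills in preempting=true, so every existing PriorityClass keeps today's behavior. The struct and helper names are illustrative, not the real defaulting code:

```go
package main

import "fmt"

// priorityClass is a pared-down stand-in for the real API type.
type priorityClass struct {
	Preempting *bool
}

// setDefaults fills in preempting=true when unset, matching the
// pre-existing scheduler behavior for classes created before this change.
func setDefaults(pc *priorityClass) {
	if pc.Preempting == nil {
		t := true
		pc.Preempting = &t
	}
}

func main() {
	pc := priorityClass{}
	setDefaults(&pc)
	fmt.Println(*pc.Preempting) // true
}
```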
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: bsalamat, vllry.
ref: kubernetes/kubernetes#67671