-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-3960: Introducing Sleep Action for PreStop Hook #3961
Conversation
AxeZhan
commented
Apr 22, 2023
- One-line PR description: This KEP proposes the addition of a new sleep action for the PreStop lifecycle hook in Kubernetes, allowing containers to pause for a specified duration before termination.
- Issue link: Introducing Sleep Action for PreStop Hook #3960
- Other comments: issue in k/k
/assign @SergeyKanzhelev @mrunalp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👋 Giving this a PRR review as a shadow this time around, have a few questions to clarify.
|
||
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? | ||
|
||
The feature can be disabled in Alpha and Beta versions by restarting kube-apiserver with the feature-gate off. In terms of Stable versions, users can choose to opt-out by not setting the sleep field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If users have set the sleep field and then the feature is disabled, it will just be ignored and the old behavior (i.e. no sleep) would apply?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then feature disabled
IIUC, you mean enabled first then setup the sleep field, then restart/disabled it? if that was the case, we want it be ignored and old behavior apply
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If users have set the sleep field and then the feature is disabled, it will just be ignored and the old behavior (i.e. no sleep) would apply?
Yes, in that case, the prestop hook will not take effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For most API changes, the feature gate controls the admission of new uses to the system, and not the actual implementation logic.
In other words, if you enable the feature, use the feature, then disable the feature - it keeps working, bt no NEW uses of the feature work. This is not universally true, we actually do not have a consistent rule here.
|
||
###### How can a rollout or rollback fail? Can it impact already running workloads? | ||
|
||
The change is opt-in, it doesn't impact already running workloads. But problems with the updated validation logic may cause crashes in the apiserver. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would the user determine this is the cause of crashes in the apiserver? Will there be any tests that help prevent this from making it into the release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is misleading (crash), let me update later. If a pod with sleepAction was created and the featue is disabled. And this pod is recreated/updated by a user, the pod's yaml won't pass the validation.
In this case, an error will occur to point out the wrong field, instead of the "crash in the apiserver"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a hard rule that previously accepted objects must not later fail validation. When it comes to actual API review, we will ensure that :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a hard rule that previously accepted objects must not later fail validation. When it comes to actual API review, we will ensure that :)
Then I think, we can safely say that this feature will not impact already running workloads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few more comments about PRR on top of previously added by @jeremyrickard
|
||
### Version Skew Strategy | ||
|
||
N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disagree - it actually matters. It's definitely possible [even more, it will happen for sure at least for a moment] that the FG will be enabled only in one component and not the other.
Ideally, I would like to see matrix of:
- kubeapiserver enabled/disable
- kubelet enabled/disable
with description of what exactly we expect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, but I'm not sure what will happen in the scenario only the kube-apiserver enable this feature
, will the creating request pass the validation and successsfully processed or will it be rejected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wojtek-t this goes to the discussion about alpha/beta that I started and have not followed up on :)
Should the kubelet even have the gate? We are not consistent.
IOW:
-
The feature gate only exists to prevent use of the field in the API. Once accepted on a pod, the feature is on for that pod. Kubelet treats the field like it is GA.
-
The feature gate prevents new use of the field in the API and nullifies the effect on existing uses. If gate is disabled, Kubelet will see the field and ignore it.
What do you think makes more sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The semantic that I believe works best is close to (2) [although I didn't fully understand "and nullifies ..." part]
I believe what we should target is:
If FG is disabled:
(a) any attempt to set the field for an object doesn't succeed - the field is silently dropped - the "dropFields" strategy: https://github.com/kubernetes/kubernetes/blob/master/pkg/registry/core/pod/strategy.go#LL89C10-L89C31
(b) if the field was set once the FG was enabled, it stays to be set
[the above is what kube-apiserver is doing]
(c) if the FG is disabled, even if Kubelet (or in general any other component) observes the field it ignores its existence
[for (c) there are exceptions - the nice example appeared in Sidecar KEP and computing resources in scheduler, but I personally treat them as exceptions - by default I consider the above the desired semantic]
@thockin - do you have any concerns about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this specific field, that seems somewhat acceptable because it only matters exactly once (when pod is deleted), although it is pretty surprising that the API says it is on but its really not.
But imagine a field which is used in real-time on a long-lived resource like Service or Deployment. The object was admitted, the feature was used, then the gate was disabled, the API stil says the feature is on, but suddenly it stopped working. Worse - it could stop working on some components and not others (e.g. disabled on some kube-proxy and noth others).
That makes testing MUCH more complicated.
Now imagine an enum sort of field or a loosening of validation, where a new value was allowed by the gate, and then the gate is disabled, and the value is no longer allowable. Do we fall back on some next-closest value without updating the API? That seems terrible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the real-time example - I would actually say that the "stops working" (or "partially stops working") is still what I want. If I'm disabling a FG, I'm doing it on purpose, so I actually want this feature to really stop working, as this is presumably causing some troubles.
I definitely agree that the current support for it is poor, but I still think it's what I want.
And eventually the FeatureGate KEP will address the issue [you register a hook for disabling FG that is clearing this field, or doing whatever else is better in a given case]
Re enum example - this is harder one. The only ones that I've seen where effectively "yet another type of optional behavior", so disabling meant "you no longer have this, so you don't get anything". Which is kind of the same case as the above.
If it won't be an "optional behavior", but rather "some behavior is required" that would become super tricky and that may justify an exception. But I guess it depends on the specific case.
So I guess the summary of my answer is: if I'm requesting disabling a FG, I actually expect the feature to become disabled. And I acknowledge the fact that if someone was using it, they may got affected/broken, but I'm disabling FG for a reason.
@deads2k @johnbelamaric - FYI - this is an interesting discussion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpbetz also :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, in this kep, should the kubelet have the gate?
if yes:
- only apiserver enable the FG
- validation will pass, but kubelet will ignore this field when pod is terminating?
- only kubelet enable the FG
- new pod will fail the validation, but the existing pods will exec the sleepAction.
if no:
- FG only controls the validation, once a container with sleepAction is set, it will always sleep before terminating regardless of whether FG still holds
Am I understanding this correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THis is correct.
And what I'm saying is that we should go with the first option.
I know it has drawbacks, but I think the gains from it outweigh those drawbacks. It seems that Tim is far from being convinced on the generic case, but he seems to be ok for this particular case, so let's go for it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK
I remain LGTM on the design, not sure if this will make the deadline or not, but don't block it on me. IMO the KEP is substantially well understood and should proceeed to implementation, even if we argue a bit more about testing. |
/approve |
I had two more comments, but given they are quite minor I don't want to block my approval given deadline. /approve PRR |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: AxeZhan, mrunalp, SergeyKanzhelev, thockin, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/unhold let's address as a follow up, only 30 minnutes left |
/lgtm |