[BUG] CloneSet validating webhook rejects pod with restartPolicy onFailure even though it is supported
#794
Comments
@wgitg Actually, workloads for both stateless and stateful long-running-service, e.g. Deployment, CloneSet, StatefulSet, are all limited that It is to avoid continuously creating pods when pods turn into completed state, no matter expected or unexpected. |
@FillZpp I think you meant to say that To circle back to the first point here, any insights on why webhook configuration |
@wgitg You can read this doc. |
@FillZpp the thing is that the cloneset's validating webhook rejects the request when we specified the webhook configuration |
You may need to understand that:
|
Well, we are talking about the first case here, where we expect apiserver to ignore the error from the webhook and allow the request even though the webhook returns errors, as specified in the doc. Is it possible that the error returned here from the validating webhook for CloneSet is not ignored by apiserver and is used to reject the request? |
But the question is, if apiserver should allow the request no matter what response the webhook returns for |
@FillZpp Good question. But isn't that what the doc says: The doc does not specify which errors are ignored, or whether any error should be ignored. Clarifying this will help us know whether anything needs to change in how the CloneSet validating webhook returns errors, or whether nothing needs to be done. Do you agree? Maybe we can get more insight from the apiserver code? Are you familiar with the apiserver code that implements this behavior? |
It refers to an error in the HTTPS call itself, not to a rejection in the response. You have to know that when a webhook rejects a request, the HTTPS response code is still 200, and the rejection message is in the response body, such as: {
"allowed": false,
"status": {
"code": 400,
"message": "spec.template.spec.restartPolicy: Unsupported value ......"
}
} |
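The distinction made in this comment can be sketched as a small model. This is a simplified illustration and not the actual kube-apiserver code: the function name `admission_decision`, its parameters, and the example error string are hypothetical. It assumes that failurePolicy only governs what happens when the call to the webhook itself fails, while an explicit `"allowed": false` in a successfully returned response is always honored.

```python
from typing import Optional


def admission_decision(failure_policy: str,
                       call_error: Optional[str],
                       review_response: Optional[dict]) -> bool:
    """Return True if the request is admitted, False if it is rejected.

    Hypothetical, simplified model of how a validating webhook result
    is interpreted; not the real kube-apiserver implementation.
    """
    if call_error is not None:
        # The HTTPS call itself failed (timeout, bad certificate, ...).
        # Only in this case does failurePolicy matter.
        return failure_policy == "Ignore"
    # The call succeeded (HTTP 200); the verdict is in the response body.
    return bool(review_response and review_response.get("allowed", False))


# Rejection body in the shape quoted above (message left elided as in the thread).
rejection = {
    "allowed": False,
    "status": {
        "code": 400,
        "message": "spec.template.spec.restartPolicy: Unsupported value ......",
    },
}

# Webhook answered with a rejection: failurePolicy does not help.
print(admission_decision("Ignore", None, rejection))               # False
# Webhook could not be called at all: Ignore lets the request through.
print(admission_decision("Ignore", "TLS handshake failed", None))  # True
# Same call failure with failurePolicy Fail: the request is rejected.
print(admission_decision("Fail", "TLS handshake failed", None))    # False
```

Under this model, breaking the webhook's certificate (a call error) and setting failurePolicy to Ignore admits the request, whereas a reachable webhook that answers `"allowed": false` still rejects it.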
OK, following your logic, when failurePolicy is Ignore, shouldn't the HTTPS response be like this?
|
No, webhook itself does not care of the |
Understood. But back to the original ask: why is the request rejected by apiserver (basically no pod can be created) when the failurePolicy of the CloneSet validating webhook is Ignore? What do you suggest at this point? |
I don't quite understand your question. As I said, even if the failurePolicy is Ignore, kube-apiserver gets the 200 response from the webhook and uses the status code |
Thanks for the summary. |
I'm not sure whether we have to provide a new feature-gate that allows users to create CloneSet with unlimited restartPolicy. In your case, is that some of the containers in Pod will exit while other containers always remain running? |
@FillZpp Yes. The use case is that not all of the applications are long-running services; some colocated containers in the same CloneSet pod will do some work and exit. Per-container restartPolicy might be expensive to support (?), but the feature gate seems like a great idea to enable the flexibility. Can you enable the CloneSet to do that? |
@wgitg If a Pod with
|
@wgitg If you will use the InPlace Update feature, please note the comment before. Could you accept that? Besides, can those colocated containers be defined as init containers? Then you don't have to set the onFailure restartPolicy. |
@veophi @FillZpp Thanks for the above clarification! It is a surprise that Yes, we do want to leverage the InPlace Update feature for all containers in the same pod. In fact, part of the reason we did not use init containers is that in-place update only works for app containers, and we did not want changes to the init containers of the colocated short-running app to cause the entire pod to be recreated, thus affecting the main applications in the same pod. However, I am still pondering why the in-place update flow is mixed with the pod restart policy like this: in-place update is about restarting containers to deploy a new version, so it should happen irrespective of how a container exited before, shouldn't it? |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Given the complications discussed, we think the alternative proposal can help us truly achieve the in-place update. We may close this issue. |
closed by #901 |
What happened:
We are deploying a workload using CloneSet with restartPolicy set to onFailure, because one of the colocated containers is not a long-running service and needs to exit after its job is done.
We configured the CloneSet validating webhook (vcloneset.kb.io) with failurePolicy as Ignore. However, the pod request was rejected with the error:
Then we disabled the webhook completely (using hacks such as giving the webhook a wrong cert in the caBundle field; controller-manager happily proceeds, but the webhook does not take effect), and we saw our workloads deployed successfully with the expected behavior (the colocated container in the pod is not restarted if it exits successfully). I noticed that something in the code base makes the webhook fail the request whenever the restartPolicy is not Always.
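As an illustration of the kind of rule described here, the following is a minimal sketch of a check that rejects any pod template whose restartPolicy is not Always. It is not the Kruise implementation: the helper name `validate_restart_policy` is hypothetical, and the exact wording of the message is an assumption modeled on the `spec.template.spec.restartPolicy: Unsupported value` prefix quoted in the comments.

```python
def validate_restart_policy(cloneset: dict) -> dict:
    """Build an admission verdict for a CloneSet-like object (hypothetical helper)."""
    pod_spec = cloneset.get("spec", {}).get("template", {}).get("spec", {})
    # Kubernetes defaults restartPolicy to Always when it is unset.
    policy = pod_spec.get("restartPolicy", "Always")
    if policy != "Always":
        # The check ran and the webhook answered normally, so this is a
        # rejection carried in the response body, not a call failure.
        return {
            "allowed": False,
            "status": {
                "code": 400,
                # Message wording is illustrative only.
                "message": f"spec.template.spec.restartPolicy: Unsupported value: {policy!r}",
            },
        }
    return {"allowed": True}


on_failure = {"spec": {"template": {"spec": {"restartPolicy": "OnFailure"}}}}
print(validate_restart_policy(on_failure)["allowed"])  # False
print(validate_restart_policy({})["allowed"])          # True
```

Because the verdict is produced by a webhook that responded successfully, it is unaffected by failurePolicy, which is consistent with the request only going through once the webhook was made unreachable.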
What you expected to happen:
We expect that
How to reproduce it (as minimally and precisely as possible):
Just create the CloneSet with a pod restartPolicy of onFailure and the webhook failurePolicy set to Ignore, and you will see the pod request rejected with the aforementioned error.
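For reference, a reproduction manifest might look like the sketch below, built as a Python dict and printed as JSON. The object name, labels, container names, images, and command are all hypothetical placeholders, not taken from the reporter's workload. Note that the Kubernetes API spells the value `OnFailure`.

```python
import json


def build_cloneset(restart_policy: str = "OnFailure") -> dict:
    """Return a minimal CloneSet manifest (all names and images are placeholders)."""
    labels = {"app": "demo"}
    return {
        "apiVersion": "apps.kruise.io/v1alpha1",
        "kind": "CloneSet",
        "metadata": {"name": "demo-cloneset"},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # The field the validating webhook objects to.
                    "restartPolicy": restart_policy,
                    "containers": [
                        # Long-running main application.
                        {"name": "main", "image": "nginx"},
                        # Colocated container that does its job and exits.
                        {
                            "name": "one-shot",
                            "image": "busybox",
                            "command": ["sh", "-c", "echo done"],
                        },
                    ],
                },
            },
        },
    }


manifest = build_cloneset()
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and submitted with `kubectl apply -f`, since kubectl accepts JSON as well as YAML.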
Environment
We are using Kruise version: 0.9.0, but the same issue exists for other versions.
If the problem is fixed, we'd appreciate the change being available in versions >= 0.9.0.