Improve patch helper #8706
This issue is currently awaiting triage. If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
Not sure if my suggestion is feasible. I see this issue for now mostly as a place where we can collect feedback about the patch helper and then, over time, figure out if we want to change something.
Eventually we could make use of server-side apply instead of doing our own patching logic? The reason the patching logic isn't using optimistic locking is ultimately to avoid deadlocks or missing data; if we haven't seen problems in reality with the current logic and the above is mostly an optimization, we could start revisiting this by conditionally using a new patcher that uses server-side apply instead. That said, I think having better support in controller-runtime first would be a prerequisite.
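For context, a minimal sketch of what an SSA-based patcher could look like using controller-runtime's client.Apply patch type. The function and the field-manager name are assumptions for illustration, not existing CAPI code:

```go
package patcher

import (
	"context"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// applySpec is a hypothetical SSA-based patcher: instead of computing a
// two-way diff against a locally cached base, it sends the desired state
// and lets the API server merge it based on field ownership.
func applySpec(ctx context.Context, c client.Client, desired *clusterv1.Machine) error {
	// Apply patches require apiVersion/kind to be set on the object.
	desired.GetObjectKind().SetGroupVersionKind(clusterv1.GroupVersion.WithKind("Machine"))
	return c.Patch(ctx, desired, client.Apply,
		client.FieldOwner("capi-machine-controller"), // assumed field-manager name
		client.ForceOwnership)
}
```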
It is an option, although I wonder how to use SSA in a performant way. Executing one SSA request per reconcile seems not very efficient compared to what we do today (nothing, if nothing changes). But maybe we can also use some sort of diff logic and only run SSA when something changes; that's probably fine for fields that are only owned by us. I'm not sure just using SSA helps against the issues mentioned above (does it change anything regarding status conditions?).
How would we deadlock with optimistic locking? And what do you mean by missing data (rather write incomplete/outdated data vs. no data if we would requeue)?
The problem is that the status is not reliable. An example: KCP is going through a rollout. At T0 the object has .status.replicas == 3; by T1 the actual count is 4. If patchHelper calculates the diff based on T0 instead of T1, .status.replicas won't be patched to 4 and will stay on 3.

We had code in the topology controller which was starting MD upgrades once KCP was rolled out. To check whether KCP was rolled out, we were checking .spec.replicas, .status.replicas, .status.readyReplicas and .status.updatedReplicas. Once we figured out what was going on, we implemented a workaround to also check .status.unavailableReplicas (#7914). But this is just one example of how the status can be unreliable, and the "fix" is also very brittle.

What it comes down to for me is that our status is basically not trustworthy, and I think there is no way to guard against that. .status.observedGeneration only guarantees that the status was calculated against the latest generation (i.e. spec); status updates don't trigger a new generation. Thus we have no guarantee that the patch to write the status was calculated correctly.

That being said, overall I think it's okay-ish to keep it as is for now. The edge cases are rare and we didn't get many reports of concrete problems like the one described above.
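To illustrate, a hedged sketch of that rollout check; the helper name is hypothetical, while the status fields are KubeadmControlPlane's:

```go
package topology

import (
	controlplanev1 "sigs.k8s.io/cluster-api/controlplane/kubeadm/api/v1beta1"
)

// kcpRolledOut is a hypothetical helper mirroring the check described above.
// If a stale patch left .status.replicas behind, the replica comparisons can
// report "rolled out" too early; also checking .status.unavailableReplicas
// is the workaround from #7914.
func kcpRolledOut(kcp *controlplanev1.KubeadmControlPlane) bool {
	if kcp.Spec.Replicas == nil {
		return false
	}
	desired := *kcp.Spec.Replicas
	return kcp.Status.Replicas == desired &&
		kcp.Status.ReadyReplicas == desired &&
		kcp.Status.UpdatedReplicas == desired &&
		kcp.Status.UnavailableReplicas == 0
}
```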
FYI, there is also another issue about optimistic locking in the patch helper: #8412. Optimistic locking seems like something we can investigate at this point; it would also be great to combine this investigation with the work we are doing on stress tests, so we can battle-test the idea and have a good signal that we are not introducing issues with the change. I think we should wait a little bit before SSA; we are still learning how to deal with it...
Thinking more about the historical reasons: we initially used to use Update calls, which enforce optimistic locking via resourceVersion. In the patch case the above scenario shouldn't happen; that said, the immediate concern with locking based on resource version is the sequential calls we make to patch metadata+spec, then status, then conditions. By enabling optimistic locks everywhere, we'd need to handle the case where the call for metadata+spec goes through, the resource version changes, and we'd have to re-apply the diff on status against the latest resource version. The conditions code today does exactly that (see cluster-api/util/patch/patch.go, lines 231 to 234 at 1f69d07).
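A sketch of that retry-and-reapply pattern — not the actual util/patch code; the Machine type and the shape of the diff are just for illustration:

```go
package patcher

import (
	"context"

	"k8s.io/client-go/util/retry"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/cluster-api/util/conditions"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// patchConditions re-applies a computed conditions diff on the latest
// resource version, retrying whenever another writer bumped resourceVersion
// in between (the same idea as the referenced patch.go lines).
func patchConditions(ctx context.Context, c client.Client, obj *clusterv1.Machine, diff clusterv1.Conditions) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		latest := &clusterv1.Machine{}
		if err := c.Get(ctx, client.ObjectKeyFromObject(obj), latest); err != nil {
			return err
		}
		base := latest.DeepCopy()
		for i := range diff {
			conditions.Set(latest, &diff[i]) // re-apply the desired condition changes
		}
		// MergeFromWithOptimisticLock embeds resourceVersion in the patch, so
		// a concurrent write surfaces as a Conflict and triggers a retry.
		return c.Status().Patch(ctx, latest,
			client.MergeFromWithOptions(base, client.MergeFromWithOptimisticLock{}))
	})
}
```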
In the generic case, which the patch helper is trying to satisfy, if the metadata+spec patch goes through, retrieving the object again and (forcibly) re-applying the status diff on the current object might run into the same issues we have today; hence why I was suggesting SSA, which in general would be a safer option, although it comes with its own set of limitations. The other point to understand: if the current patching behavior isn't causing real-world problems and has been working well enough (at least for now), is this something that needs to be optimized?
For now my goal was mainly to get a discussion started and to collect limitations/issues of the current patch helper (and make them visible) somewhere that is not my local todo list :). I don't have any immediate plans (or bandwidth) to optimize it anyway. Happy to wait for now and see how the SSA story evolves.
We hit bugs in the scenario described by @sbueringer when using the CAPI patchHelper in our controllers, because optimistic locking is not used when writing Spec & Status; it is only used for status conditions. I think CAPI controllers themselves could hit the same bugs too. We could write another enhanced PatchHelper based on the CAPI PatchHelper, but it would be great if the CAPI patchHelper could expose optimistic locking as an option for the caller to specify.
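For illustration only, such an option might look like this at the call site; patch.WithOptimisticLock does not exist in CAPI today and is purely hypothetical:

```go
// Hypothetical API: patch.WithOptimisticLock is not part of CAPI today.
helper, err := patch.NewHelper(obj, r.Client)
if err != nil {
	return err
}
if err := helper.Patch(ctx, obj, patch.WithOptimisticLock{}); err != nil {
	// With the option set, a Conflict here would mean another writer
	// changed the object since it was read; the caller requeues instead
	// of overwriting.
	return err
}
```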
Are there test cases or examples you can provide to understand the issue a bit more? |
This is the case in CAPI controllers too, as PatchHelper patches CR.Status.Conditions -> CR.Spec & CR.Metadata -> CR.Status in sequence.
It's similar in the case of my controller: it depends on an annotation and a status field during reconciliation, but the patch helper can calculate a patch based on a stale object and patch the CR with the stale annotation. The sketch below shows where the stale base comes from.
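To make the sequence concrete, here is the typical patch-helper pattern as a minimal sketch; the reconciler type is illustrative:

```go
package controllers

import (
	"context"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/cluster-api/util/patch"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

type MachineReconciler struct {
	Client client.Client
}

func (r *MachineReconciler) Reconcile(ctx context.Context, req ctrl.Request) (_ ctrl.Result, reterr error) {
	m := &clusterv1.Machine{}
	if err := r.Client.Get(ctx, req.NamespacedName, m); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	// NewHelper snapshots m here; that snapshot is the base for the diff.
	// If the cached read above was stale, the diff base is stale too.
	helper, err := patch.NewHelper(m, r.Client)
	if err != nil {
		return ctrl.Result{}, err
	}
	defer func() {
		// Patches conditions, metadata+spec, and status in sequence,
		// without optimistic locking (no resourceVersion in the patch).
		if err := helper.Patch(ctx, m); err != nil && reterr == nil {
			reterr = err
		}
	}()
	// ... mutate m.Spec / m.Status based on the (possibly stale) object ...
	return ctrl.Result{}, nil
}
```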
In this case I'd expect the entire reconcile loop to be kicked off again. In reality, though, we've seen that having locking in place can cause data loss in most cases, especially when Cluster API (and provider) controllers perform non-re-entrant operations like creating an external resource and need to store identifiers that would otherwise be lost. In cases like these, reconciling again would, for example, result in duplicated resources.
/triage needs-information |
We hit another problem caused by the CAPI patchHelper not setting resourceVersion. When two ClusterResourceSets are created for a Cluster at the same time, CAPI starts reconciling both, and both reconciles use the patchHelper to patch ClusterResourceSetBinding.Spec.Bindings, which is a slice. The two patch calls can then overwrite each other's data. In most cases this will not cause issues, but in our case the kapp-controller yaml manifest in one of the ClusterResourceSets cannot be applied more than once, because the kapp-controller pod sets the caBundle for the APIService CR in this yaml only when the pod starts.
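One hedged way to guard that kind of slice update without SSA, sketched here with an optimistic-lock merge patch; the helper is hypothetical, while the addonsv1 types are CAPI's:

```go
package controllers

import (
	"context"

	addonsv1 "sigs.k8s.io/cluster-api/exp/addons/api/v1beta1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// appendBinding is a hypothetical helper: with the optimistic lock the merge
// patch carries resourceVersion, so when two reconciles race, the second one
// fails with a Conflict and requeues instead of silently dropping the first
// reconcile's slice entry.
func appendBinding(ctx context.Context, c client.Client, b *addonsv1.ClusterResourceSetBinding, rb *addonsv1.ResourceSetBinding) error {
	base := b.DeepCopy()
	b.Spec.Bindings = append(b.Spec.Bindings, rb)
	return c.Patch(ctx, b,
		client.MergeFromWithOptions(base, client.MergeFromWithOptimisticLock{}))
}
```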
This seems like something that can be solved by using SSA, and potentially by improving the implementation of ClusterResourceSet and its bindings.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Today the patch helper has the following behaviors, which can lead to surprising behavior / CR state:
- the diff is calculated against a snapshot of the object taken when the helper is created, which can be stale
- metadata+spec, status, and conditions are patched in separate, sequential calls
- patches for metadata+spec and status are sent without optimistic locking (no resourceVersion), so concurrent writes can be overwritten
Overall this seems to work well enough today, and eventually, after multiple reconciles, CRs should end up in the correct state again.
Nonetheless, I wonder if we should change / re-design the patch helper.
One idea would be to:
Detailed Description
.
Anything else you would like to add?
No response
Label(s) to be applied
/kind feature