A single federated cluster can stop propagation of a type for all clusters if it does not have the specified resource version. #1241
Comments
@dgorst Thanks for your feedback. |
@dgorst Prepare clusters:
Operation Steps:
Result:
You expected the CRD to be propagated to cluster2 and cluster3 to be ignored, right? |
Yes exactly @RainbowMango 👍 It feels like the blast radius from a single (tbf misconfigured) cluster, should not impact propagation to the good clusters. So in your example, yes I don't expect a v1 CRD in cluster1 to be propagated to cluster3, but I would expect it to continue to be propagated to cluster2. I mention a CR of the type of the CRD as that would also stop propagating at the point the 1.15 cluster is joined. But it's the same issue I guess (the CRD doesn't get propagated because it can't list v1/crds, so it also can't list that type either) |
@dgorst The following check keeps failing. I agree with you that the propagation process should ignore |
Thanks @RainbowMango for recreating and confirming 👍 Happy to have a stab at resolving this if that'll help? (caveat: I'm new to the kubefed codebase so may need to reach on slack with some questions though!) |
I've tried a workaround locally, but the community has discussed a better solution. @hectorj2f @jimmidyson @irfanurrehman |
@RainbowMango thanks for tracking this. IMO the solution proposed by pmorie as per the link you mentioned is completely legit and can be implemented. As far as I understand @font might not be available to complete it. |
Given that the implementation is somewhat complicated (API change, controller adoption, testing, etc.), I'd like to set up an umbrella issue, split this into several tasks, and work through them iteratively. @dgorst you are welcome to pick up any of the items you are interested in. What do you think? @irfanurrehman, if it's OK with you, could you help review the PRs that follow? |
Awesome suggestion @RainbowMango. I can certainly review them. |
Thanks for taking care of this @RainbowMango. It sounds good to me too. Share the action items to see if we can help somehow. |
Just sent a draft issue #1252. I have started some work locally, so I'll take the first task. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle stale |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle stale |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-contributor-experience at kubernetes/community. |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-contributor-experience at kubernetes/community. |
Rotten issues close after 30d of inactivity. Send feedback to sig-contributor-experience at kubernetes/community. |
@fejta-bot: Closing this issue. In response to this:
A single federated cluster can stop propagation of a type for all clusters if it does not have a particular resource version.
And a question: are there any good strategies for handling cluster estates that could have multiple versions of a resource in circulation (e.g. v1beta1 and v1 CRDs)?
Editing the target type version in the federated type config to v1beta1 (the lowest common denominator) appears to work around this ok (tbc), but it's still worrying that a single cluster could stop all federation from working. This doesn't seem like it should be the expected behaviour.
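To make the "lowest common denominator" idea concrete, here is a minimal sketch of picking the newest CRD API version that every joined cluster serves. It is not kubefed code: `commonVersion`, `versionsNewestFirst` and the cluster names are hypothetical, and the served-version lists are hard-coded rather than discovered from the clusters.

```go
package main

import "fmt"

// versionsNewestFirst lists the CRD API versions discussed in this issue,
// newest first. Hypothetical ordering table, for illustration only.
var versionsNewestFirst = []string{"v1", "v1beta1"}

// commonVersion returns the newest version that every cluster serves.
// served maps a cluster name to the API versions that cluster serves.
// The second return value is false when no version is shared by all clusters.
func commonVersion(served map[string][]string) (string, bool) {
	if len(served) == 0 {
		return "", false
	}
	for _, candidate := range versionsNewestFirst {
		everywhere := true
		for _, versions := range served {
			if !contains(versions, candidate) {
				everywhere = false
				break
			}
		}
		if everywhere {
			return candidate, true
		}
	}
	return "", false
}

// contains reports whether want appears in list.
func contains(list []string, want string) bool {
	for _, item := range list {
		if item == want {
			return true
		}
	}
	return false
}

func main() {
	// Assumed estate: two 1.16 clusters (serve both versions) and one 1.15
	// cluster (serves only v1beta1), matching the scenario in this issue.
	served := map[string][]string{
		"cluster1": {"v1beta1", "v1"},
		"cluster2": {"v1beta1", "v1"},
		"cluster3": {"v1beta1"},
	}
	version, ok := commonVersion(served)
	fmt.Println(version, ok)
}
```

The result would then be the value to set as the target type version in the federated type config. This only helps while at least one version is shared by all clusters.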
What happened:
Ran a federation control plane at kube version 1.16
Enabled federation of CRDs (v1)
Joined another 1.16 cluster - confirmed CRDs and CRs of that type are being propagated ok
Joined a 1.15 cluster - CRDs+CRs not propagated to the 1.15 cluster (CRDs at version v1beta1). All propagation of CRDs and CRs of the same type stopped working for the 1.16 cluster as well.
Logs for the controller manager show messages like:
What you expected to happen:
I expected v1 CRDs not to propagate to the 1.15 cluster, however I did not expect the propagation of all CRDs to all clusters to stop working.
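The expected behaviour can be sketched as a propagation loop that records a failure per cluster and carries on, rather than stopping at the first cluster that does not serve the target version. This is a simplified model and not how the kubefed sync controller is implemented: `memberCluster`, `syncTo` and `propagate` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// memberCluster is a hypothetical stand-in for a joined cluster.
type memberCluster struct {
	name   string
	served map[string]bool // API versions of the target type this cluster serves
}

var errVersionNotServed = errors.New("target version not served")

// syncTo pretends to propagate a resource of the given version to one cluster.
// It fails when the cluster does not serve that version.
func syncTo(c memberCluster, version string) error {
	if !c.served[version] {
		return fmt.Errorf("cluster %s: %w: %s", c.name, errVersionNotServed, version)
	}
	return nil
}

// propagate attempts every cluster and records failures per cluster instead
// of returning on the first error, so one bad cluster cannot block the rest.
func propagate(clusters []memberCluster, version string) (synced []string, failed map[string]error) {
	failed = make(map[string]error)
	for _, c := range clusters {
		if err := syncTo(c, version); err != nil {
			failed[c.name] = err
			continue
		}
		synced = append(synced, c.name)
	}
	return synced, failed
}

func main() {
	clusters := []memberCluster{
		{name: "cluster2", served: map[string]bool{"v1": true, "v1beta1": true}}, // 1.16
		{name: "cluster3", served: map[string]bool{"v1beta1": true}},             // 1.15
	}
	synced, failed := propagate(clusters, "v1")
	fmt.Println("synced:", synced)
	for name, err := range failed {
		fmt.Println("skipped:", name, "-", err)
	}
}
```

With this shape, the 1.15 cluster shows up as a reported failure while the 1.16 cluster keeps receiving updates.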
How to reproduce it (as minimally and precisely as possible):
Run a federation control plane at kube version 1.16+
Enable federation of v1 CRDs
Create a Federated CRD, and a CR of that type with placement that will match all clusters
Join another 1.16 cluster - confirm the CRD and CR are being propagated ok
Join a 1.15 cluster - expect the CRD and CR not to be propagated
Create a new federated CRD, or a CR of the original type - these should still be propagated to the 1.16 cluster but I have observed they are not.
Anything else we need to know?:
Environment:
/kind bug