prune should respect dependsOn in a reverse order #301
Comments
Succinctly, I think the logic here is: this specifically needs to be done in the finalizer of the Kustomization that is being depended on.
@stealthybox we could update the Ready condition to make the deadlocks visible to users, as the Ready condition is shown in flux/kubectl get output.
Since this Finalizer behavior depends on the lifecycle of other objects, it could cause a state where an administrator's or parent team's repository becomes undesirably unreconcilable, simply because of another team's repo. This would be improved by the fluxcd/flux2#582 proposal, however, which would provide an RBAC solution for revoking access to dependsOn!

We should consider what the workflow or options could be for a team to forcefully indicate that their Kustomization should be deleted regardless of the consequences for dependent Kustomizations. Perhaps we don't want to support that.

Option B: when a Kustomization contains a CRD, make the Finalizer block until there are no more dependent CRs in the entire cluster. This seems reasonable but might be hard to query for across all Namespaces; maybe there's a performant mechanism? A simple boolean could then opt out of this, specifically for CRDs. This doesn't cover other types of dependencies.

Technically, it's possible (but complicated) for both of these behaviors to be implemented.
Regarding general Kubernetes objects: it is very difficult to delete these resources atomically, because deleting them in the wrong order can cause issues. fluxcd/helm-controller#270 explores some strategies to manage these needs using annotations, Finalizers, and Admission Control. I suspect that this is a limitation of Kubernetes API design, since dependencies are not required information. For Kustomizations and HelmReleases, we have dependsOn. Perhaps there is an already existing community component that can enforce ordered deletes between arbitrary resources in a generic, non-Flux-specific way.
Client-side CLI removal using the user's privileged credentials could help users reconcile undesired removal situations.
This caused problems for me as well. I had to write custom logic in a wrapper script to handle deletion correctly (which kinda contradicts the GitOps deployment principle).
You can do this via Git: delete the dependants first and commit the changes, and Flux will finalise them accordingly. Then delete the top Kustomization and commit that.
@stefanprodan My case is: we do blue/green Kubernetes cluster upgrades. Instead of upgrading the cluster in place, we bring up a second cluster with the same apps deployed, switch traffic to it, then delete the original cluster, ensuring safe upgrades with zero downtime and the ability to instantly roll back or abort the upgrade in case of problems. The GitOps controller (Flux) ensures that the new cluster has the same apps (both clusters monitor the same Git repository), so deleting via Git commits is not possible.

What actually causes the problem: we install the Istio operator controller with one Kustomization, then install the IstioOperator resource with another Kustomization that depends on the first one. Creation works fine, but deletion ignores the order and removes the operator controller before the Istio control plane/ingress is deleted, which causes the deletion of Istio to hang indefinitely. And we cannot just ignore Istio and delete the cluster anyway, because Istio creates a Service of type LoadBalancer, which creates an actual load balancer that will be orphaned if it is not deleted from the cluster correctly.

Current workaround: patch the entrypoint Kustomization in the old cluster to exclude the IstioOperator resource Kustomization, wait until the Istio operator controller actually deletes Istio, then delete the entrypoint Kustomization.
Are there any plans to get this worked on?
Has anyone developed any workarounds for this behavior? We're running into the same issue when trying to clean up resources deployed within a vcluster. vcluster creates a secret in the cluster that is used in subsequent HelmReleases to deploy things into the vcluster, similar to the following:

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: example-release
  namespace: example-namespace
spec:
  interval: 5m0s
  dependsOn:
    - name: vcluster
  targetNamespace: default
  storageNamespace: default
  kubeConfig:
    secretRef:
      name: vc-vcluster
      key: config
  chart:
    spec:
      ...

Notice the kubeConfig secret reference (vc-vcluster), which is the secret created by vcluster. It is worth noting that in our case, first removing all HelmReleases except for the vcluster release, and removing that one last, does clean up correctly; the problem is that this ordering has to be performed manually.
@sgerace We made a workaround, but it's not really suitable for the general case. In our case, Kustomizations are installed from a HelmRelease (for reasons unrelated to this issue), and we made a custom pre-delete Helm hook to handle deletion in the proper order.
@artem-nefedov, that's interesting. I'm wondering if you'd be willing to share any of the details around the pre-delete Helm hook that you've developed? That's the direction we've started investigating, so any information you can provide would help us come up with a suitable workaround for our situation.
@sgerace We have multiple Kustomizations installed from a single Helm chart via a HelmRelease. The chart has a pre-delete hook Job (meaning it runs before any actual object is deleted by Helm). The logic inside the Job is pretty simple: it deletes the Kustomizations that should be deleted first, based on a label selector, then sleeps for some time and exits. After that, the remaining Kustomizations are deleted by Helm itself.

Unrelated to the topic, but also worth mentioning: we also have post-install and post-upgrade hooks that wait for successful reconciliation of the Kustomizations, because helm-controller can't perform health checks on custom resources (even ones that are part of the Flux suite). This can be circumvented if you instead define healthChecks in the Kustomization that creates the HelmRelease.
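For readers looking for a starting point, here is a minimal sketch of what such a pre-delete hook Job could look like, based on the description above rather than the actual chart. The image, label selector, namespace, service account, and sleep duration are all assumptions, and the service account would need RBAC permission to delete Kustomizations:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: delete-dependent-kustomizations
  annotations:
    "helm.sh/hook": pre-delete                 # run before Helm deletes any chart objects
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  backoffLimit: 0
  template:
    spec:
      serviceAccountName: kustomization-cleaner   # hypothetical SA with delete rights on Kustomizations
      restartPolicy: Never
      containers:
        - name: cleanup
          image: bitnami/kubectl:latest            # any image containing kubectl works
          command:
            - /bin/sh
            - -c
            - |
              # Delete the Kustomizations that must go first, selected by a (hypothetical) label,
              # then give their finalizers time to run before Helm removes the rest of the chart.
              kubectl delete kustomizations.kustomize.toolkit.fluxcd.io \
                -n flux-system -l cleanup-order=first --wait=true
              sleep 60
```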
I wonder if a temporary workaround would be to support a manual depends-on annotation for child resources, so the reconciler could just do a quick topological sort before deleting. Even an (opt-in?) topological sort at the level of Kustomizations could be enough to fix this bug, at least assuming the garbage collector could accumulate all of the resources across Kustomizations before pruning.
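To make the suggestion concrete, here is a purely hypothetical sketch of such an annotation. Flux has no such annotation today; the key example.com/depends-on and the reference format are invented only for illustration:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: usage-config
  namespace: apps
  annotations:
    # Hypothetical annotation: declares that this object depends on the referenced object,
    # so the garbage collector would prune this object before the object it depends on
    # (i.e. delete in reverse dependency order after a topological sort).
    example.com/depends-on: "apps/Deployment/operator-controller"
data:
  setting: "value"
```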
When two Kustomizations exist and one dependsOn the other, pruning should happen in the reverse order.

Use case:
- Kustomization operator contains a CRD and the operator for this CRD.
- Kustomization usage contains a CR of that CRD, on which the operator places a finalizer.

When a commit deletes both usage.kustomization.yaml and operator.kustomization.yaml, garbage collection should respect dependsOn in reverse order:
- First, usage.kustomization.yaml should be deleted (with e.g. a namespace which, when pruned, deletes all custom resources; since operator.kustomization.yaml is not deleted at this point in time, the finalizers can still be processed).
- Then operator.kustomization.yaml can be deleted, as no depending Kustomization exists anymore.

Currently, operator.kustomization.yaml and usage.kustomization.yaml are deleted at the same time, which may result in unfinishable finalizers on resources from usage.kustomization.yaml because the operator may already be gone.