Unable to cleanly delete an AKS managed cluster by capz #1395

Closed
LochanRn opened this issue May 27, 2021 · 4 comments · Fixed by #1397
Labels
area/managedclusters Issues related to managed AKS clusters created through the CAPZ ManagedCluster Type kind/bug Categorizes issue or PR as related to a bug.

Comments

@LochanRn
Member

/kind bug

What steps did you take and what happened:
Created an AKS Cluster using clusterctl.

The cluster was successfully created and was running fine.

I then tried to delete the cluster with the command

kubectl delete cluster

The Kubernetes service in Azure and all the resources it had created were deleted successfully, but the AzureManagedControlPlane, MachinePool, AzureManagedCluster and Cluster objects got stuck and were never deleted.

What did you expect to happen:
All the cluster objects should be cleaned up successfully when the cluster is deleted.

Anything else you would like to add:
Please find below the logs of the capz pod and the capi controller manager:
capz-del.log
capi-cm-del.log

Environment:

  • cluster-api-provider-azure version: v0.4.15
  • Kubernetes version: (use kubectl version): 1.20.1
  • OS (e.g. from /etc/os-release):
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label May 27, 2021
@CecileRobertMichon CecileRobertMichon added the area/managedclusters Issues related to managed AKS clusters created through the CAPZ ManagedCluster Type label May 27, 2021
@alexeldeib
Contributor

alexeldeib commented May 27, 2021

There’s a circular dependency with deletion and finalizers.

If you delete only the CAPI cluster object, it will try to delete all the descendant machines/machinepools before deleting the control plane object. With AKS/AzureManagedControlPlane, the last MachinePool will fail to be deleted because an AKS cluster requires at least one agent pool.

The workaround is to manually delete the AzureManagedControlPlane. This will delete the entire AKS cluster, allowing the lingering MachinePool to hit a 404 on deletion, and the finalizer will be removed.

The probable fix is to check the cluster deletion timestamp. If it’s set, then remove the finalizers from all corresponding machine pools and don’t bother trying to delete them. Let AKS clean everything up on cluster deletion.

@alexeldeib
Contributor

We're also not deleting unmanaged vnets in a user-managed resource group since #1009.

@alexeldeib
Contributor

/assign

@alexeldeib
Contributor

alexeldeib commented May 28, 2021

There's also a circular watch dependency (AMCP -> Cluster -> AMCP) around being initialized and ready. That one resolves itself, though; it's just slow.

fixes incoming :)

4 participants