CAPA cluster stuck in deleting #1765

Closed
alex-dabija opened this issue Dec 8, 2022 · 3 comments
Labels
area/kaas (Mission: Cloud Native Platform - Self-driving Kubernetes as a Service), kind/bug, provider/cluster-api-aws (Cluster API based running on AWS), team/phoenix (Team Phoenix), topic/capi

Comments

alex-dabija commented Dec 8, 2022

Issue

CAPA cluster stuck in deleting:

❯ kubectl tree --context gs-golem -n org-giantswarm cluster alextest26
NAMESPACE       NAME                                                         READY  REASON   AGE
org-giantswarm  Cluster/alextest26                                           False  Deleted  2d23h
org-giantswarm  └─BackgroundScanReport/524730e1-b443-4d58-b389-cc51067a27f5  -               25h

The only finalizer left on the Cluster CR is the one owned by aws-network-topology-operator (network-topology.finalizers.giantswarm.io):

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  annotations:
    cluster.giantswarm.io/description: test
    meta.helm.sh/release-name: alextest26
    meta.helm.sh/release-namespace: org-giantswarm
    network-topology.giantswarm.io/mode: GiantSwarmManaged
    network-topology.giantswarm.io/prefix-list: pl-0a8a617c87d81e774
    network-topology.giantswarm.io/transit-gateway: tgw-034e681b2d0288423
  creationTimestamp: "2022-12-05T11:42:56Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2022-12-05T12:36:09Z"
  finalizers:
  - network-topology.finalizers.giantswarm.io
  generation: 4
  labels:
    app: cluster-aws
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: 0.19.0
    application.giantswarm.io/team: hydra
    cluster-apps-operator.giantswarm.io/watching: ""
    cluster.x-k8s.io/cluster-name: alextest26
    cluster.x-k8s.io/watch-filter: capi
    giantswarm.io/cluster: alextest26
    giantswarm.io/organization: giantswarm
    helm.sh/chart: cluster-aws-0.19.0
    release.giantswarm.io/version: 20.0.0-alpha1
  name: alextest26
  namespace: org-giantswarm
  resourceVersion: "107345451"
  uid: 524730e1-b443-4d58-b389-cc51067a27f5

Logs from aws-network-topology-operator:

❯ kubectl -n giantswarm logs aws-network-topology-operator-5f44b6fd78-qxgxs | grep -i alextest26
1.6704951185634408e+09  INFO    Reconciling     {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest26","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest26", "reconcileID": "6f8785e6-b282-4b47-ab41-965a0e649b30"}
1.6704951185635045e+09  INFO    Reconciling delete      {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest26","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest26", "reconcileID": "6f8785e6-b282-4b47-ab41-965a0e649b30"}
1.670495118563537e+09   ERROR   transitgateway-registrar        Failed to get AWSCluster for Cluster    {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest26","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest26", "reconcileID": "6f8785e6-b282-4b47-ab41-965a0e649b30", "error": "AWSCluster.infrastructure.cluster.x-k8s.io \"alextest26\" not found"}
1.670495118563574e+09   INFO    Done reconciling        {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest26","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest26", "reconcileID": "6f8785e6-b282-4b47-ab41-965a0e649b30"}
1.6704951185635982e+09  ERROR   Reconciler error        {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest26","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest26", "reconcileID": "6f8785e6-b282-4b47-ab41-965a0e649b30", "error": "AWSCluster.infrastructure.cluster.x-k8s.io \"alextest26\" not found"}
1.6704961185647144e+09  INFO    Reconciling     {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest26","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest26", "reconcileID": "0fc1d11c-beaf-4e51-804e-593a1d290aae"}
1.6704961185647752e+09  INFO    Reconciling delete      {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest26","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest26", "reconcileID": "0fc1d11c-beaf-4e51-804e-593a1d290aae"}
1.6704961185647979e+09  ERROR   transitgateway-registrar        Failed to get AWSCluster for Cluster    {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest26","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest26", "reconcileID": "0fc1d11c-beaf-4e51-804e-593a1d290aae", "error": "AWSCluster.infrastructure.cluster.x-k8s.io \"alextest26\" not found"}
1.670496118564834e+09   INFO    Done reconciling        {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest26","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest26", "reconcileID": "0fc1d11c-beaf-4e51-804e-593a1d290aae"}
1.6704961185648572e+09  ERROR   Reconciler error        {"controller": "cluster", "controllerGroup": "cluster.x-k8s.io", "controllerKind": "Cluster", "cluster": {"name":"alextest26","namespace":"org-giantswarm"}, "namespace": "org-giantswarm", "name": "alextest26", "reconcileID": "0fc1d11c-beaf-4e51-804e-593a1d290aae", "error": "AWSCluster.infrastructure.cluster.x-k8s.io \"alextest26\" not found"}

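Based on the logs, this is probably a race condition: the AWSCluster CR was already deleted when the operator ran its delete reconciliation. The operator does not handle a missing AWSCluster, so every attempt ends in the same not-found error and the finalizer on the Cluster CR is never removed.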

alex-dabija added the area/kaas, team/hydra, topic/capi, provider/cluster-api-aws, and kind/bug labels on Dec 8, 2022
alex-dabija (Author) commented

I checked another cluster, and there the AWSCluster CR does have a finalizer for the network topology operator. In the case of alextest26, however, we somehow ended up with a Cluster CR that still has the finalizer set and no AWSCluster CR at all.

AndiDog commented Jan 4, 2023

Likely same as #1827

alex-dabija added the team/phoenix label and removed the team/hydra label on Jun 29, 2023
fiunchinho (Member) commented

As @AndiDog mentioned, I'd close this one because it looks like a duplicate of #1827.
