Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic on Cluster deletion #27

Closed
clementblaise opened this issue Sep 27, 2022 · 4 comments
Closed

Panic on Cluster deletion #27

clementblaise opened this issue Sep 27, 2022 · 4 comments
Labels
bug Something isn't working stale

Comments

@clementblaise
Copy link
Contributor

clementblaise commented Sep 27, 2022

What happened?

The cluster was successfully created, but when the resource is deleted the controller had a panic :

E0927 12:40:05.990900       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 496 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1e6e180, 0x3448730)
	/home/runner/work/provider-argocd/provider-argocd/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x95
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/home/runner/work/provider-argocd/provider-argocd/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x86
panic(0x1e6e180, 0x3448730)
	/opt/hostedtoolcache/go/1.16.11/x64/src/runtime/panic.go:965 +0x1b9
github.com/crossplane-contrib/provider-argocd/pkg/controller/cluster.(*external).Observe(0xc000ade1c0, 0x24029c0, 0xc0008681e0, 0x242de10, 0xc0001666c0, 0x2402b80, 0xc000ade1c0, 0x0, 0x0)
	/home/runner/work/provider-argocd/provider-argocd/pkg/controller/cluster/controller.go:111 +0x7b4
github.com/crossplane/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile(0xc000294e60, 0x24029f8, 0xc000c180c0, 0x0, 0x0, 0xc000bbf68c, 0x4, 0xc000c18000, 0x0, 0x0, ...)
	/home/runner/work/provider-argocd/provider-argocd/vendor/github.com/crossplane/crossplane-runtime/pkg/reconciler/managed/reconciler.go:577 +0x1278
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000294f00, 0x2402950, 0xc000a36e40, 0x1ef91e0, 0xc000bec620)
	/home/runner/work/provider-argocd/provider-argocd/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298 +0x30d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000294f00, 0x2402950, 0xc000a36e40, 0xc0009c1e00)
	/home/runner/work/provider-argocd/provider-argocd/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2(0x2402950, 0xc000a36e40)
	/home/runner/work/provider-argocd/provider-argocd/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216 +0x4a
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/home/runner/work/provider-argocd/provider-argocd/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x37
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0009c1f50)
	/home/runner/work/provider-argocd/provider-argocd/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000c97f50, 0x23bf140, 0xc000c18060, 0xc000a36e01, 0xc0000b9020)
	/home/runner/work/provider-argocd/provider-argocd/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0009c1f50, 0x3b9aca00, 0x0, 0x9e1f01, 0xc0000b9020)
	/home/runner/work/provider-argocd/provider-argocd/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x2402950, 0xc000a36e40, 0xc000b68030, 0x3b9aca00, 0x0, 0x1)
	/home/runner/work/provider-argocd/provider-argocd/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0xa6
k8s.io/apimachinery/pkg/util/wait.UntilWithContext(0x2402950, 0xc000a36e40, 0xc000b68030, 0x3b9aca00)
	/home/runner/work/provider-argocd/provider-argocd/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99 +0x57
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/home/runner/work/provider-argocd/provider-argocd/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:213 +0x40d
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x1c88934]

The following code block get the cluster, but when a cluster has been deleted the Argocd API return a permission denied error, which is not checked.

observedCluster, err := e.client.Get(ctx, &clusterQuery)
if cluster.IsErrorClusterNotFound(err) ||
(meta.WasDeleted(mg) && meta.GetExternalName(cr) != observedCluster.Name) {
// ArgoCD Cluster resource ignores the name field. This detects the deletion of the default cluster resource.
return managed.ExternalObservation{}, nil
}

Using the cli I confirmed the behaviour

argocd  cluster get deleted-cluster
WARN[0000] Failed to invoke grpc call. Use flag --grpc-web in grpc calls. To avoid this warning message, use flag --grpc-web.
FATA[0000] rpc error: code = PermissionDenied desc = permission denied

How can we reproduce it?

Create a Cluster then delete the resource

What environment did it happen in?

Argocd 2.4.11
Crossplane version: 1.9.0
Crossplane Provider argocd version: 0.1.0

@clementblaise clementblaise added the bug Something isn't working label Sep 27, 2022
@clementblaise clementblaise changed the title Panic on Cluster Deletion Panic on Cluster deletion when missing permission Sep 27, 2022
@clementblaise clementblaise changed the title Panic on Cluster deletion when missing permission Panic on Cluster deletion Sep 27, 2022
@janwillies
Copy link
Collaborator

janwillies commented Sep 29, 2022

Did that change recently? It used to be code = NotFound desc = cluster: https://github.com/crossplane-contrib/provider-argocd/blob/main/pkg/clients/cluster/client.go#L15

EDIT: I remember we had some challenges with non-existing resources from the argocd api before: argoproj/argo-cd#5951

@clementblaise
Copy link
Contributor Author

It would be surprising if this was changed on purpose, might be a regression along the way.

The issue cannot be fully resolve in the provider as long as Argo doesn't return the appropriate code, I opened an issue argoproj/argo-cd#10830.

We could check before IsErrorClusterNotFound if the API returns PermissionDenied, it would return an empty ExternalObservation with the error and avoid the panic. What do you think @janwillies ?

@janwillies
Copy link
Collaborator

We could check before IsErrorClusterNotFound if the API returns PermissionDenied, it would return an empty ExternalObservation with the error and avoid the panic

Reading the discussion on the upstream issue probably the best workaround would be to check additionally for the new error (code = PermissionDenied desc = permission denied)

@github-actions
Copy link

Crossplane does not currently have enough maintainers to address every issue and pull request. This issue has been automatically marked as stale because it has had no activity in the last 90 days. It will be closed in 7 days if no further activity occurs. Leaving a comment starting with /fresh will mark this issue as not stale.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working stale
Projects
None yet
Development

No branches or pull requests

2 participants