
kubeadmControlPlane controller crashed if infrastructure template is not found #2751

Closed
MartinForReal opened this issue Mar 23, 2020 · 7 comments
Labels
area/control-plane: Issues or PRs related to control-plane lifecycle management
kind/bug: Categorizes issue or PR as related to a bug.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone
v0.3.3

Comments

@MartinForReal
Contributor

MartinForReal commented Mar 23, 2020

What steps did you take and what happened:
The controller crashed when the infrastructure template object was not found.

What did you expect to happen:
The controller should not crash when the infrastructure template object is not found.

Anything else you would like to add:
The logic in updateStatus should handle the case where the infrastructure template object is not found:

func (r *KubeadmControlPlaneReconciler) updateStatus(ctx context.Context, kcp *controlplanev1.KubeadmControlPlane, cluster *clusterv1.Cluster) error {
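A minimal sketch of the guard being asked for (the stub `fetchTemplate` and its types below are hypothetical stand-ins, not the actual cluster-api code): check the lookup error and return it instead of dereferencing a nil result, so the reconciler survives a missing template.

```go
package main

import (
	"errors"
	"fmt"
)

// errTemplateNotFound stands in for a Kubernetes NotFound API error.
var errTemplateNotFound = errors.New("infrastructure template not found")

// fetchTemplate is a stub for the client lookup that can fail.
func fetchTemplate(exists bool) (*string, error) {
	if !exists {
		return nil, errTemplateNotFound
	}
	tpl := "my-template"
	return &tpl, nil
}

// updateStatus mirrors the safe pattern: inspect the error before
// touching the returned pointer.
func updateStatus(exists bool) (string, error) {
	tpl, err := fetchTemplate(exists)
	if err != nil {
		// Return the error instead of crashing; the caller can
		// requeue or surface a failure condition.
		return "", fmt.Errorf("updateStatus: %w", err)
	}
	return *tpl, nil
}

func main() {
	if _, err := updateStatus(false); err != nil {
		fmt.Println("handled:", err)
	}
	out, _ := updateStatus(true)
	fmt.Println("status:", out)
}
```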

Environment:

  • Cluster-api version: 0.3.2
  • Minikube/KIND version:
  • Kubernetes version (use kubectl version): 1.17.4
  • OS (e.g. from /etc/os-release): coreos

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Mar 23, 2020
@MartinForReal
Contributor Author

I think we need some extra logic around

// Since we are building up the LabelSelector above, this should not fail

to check whether the controller exits due to the API error generated here:

if err := r.reconcileExternalReference(ctx, cluster, kcp.Spec.InfrastructureTemplate); err != nil {
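Roughly, that guard could distinguish a NotFound error and stop early. In the real controller this would use `apierrors.IsNotFound` from `k8s.io/apimachinery`; the sketch below substitutes a sentinel error and stub functions so it stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes NotFound API error.
var errNotFound = errors.New("not found")

// reconcileExternalReference is a stub that fails when the
// referenced template does not exist.
func reconcileExternalReference(templateExists bool) error {
	if !templateExists {
		return errNotFound
	}
	return nil
}

// reconcile shows the guard: a missing template is reported
// (and could be requeued) rather than left to cause a panic later.
func reconcile(templateExists bool) error {
	if err := reconcileExternalReference(templateExists); err != nil {
		if errors.Is(err, errNotFound) {
			// In the controller this branch would record a condition
			// and requeue instead of proceeding with a nil object.
			return fmt.Errorf("infrastructure template missing: %w", err)
		}
		return err
	}
	return nil
}

func main() {
	fmt.Println(reconcile(false))
	fmt.Println(reconcile(true))
}
```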

@vincepri
Member

@sedefsavas do you want to take a look at this one?
/milestone v0.3.3
/priority important-soon

@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Mar 23, 2020
@k8s-ci-robot k8s-ci-robot added this to the v0.3.3 milestone Mar 23, 2020
@sadysnaat
Contributor

I have also observed the kubeadmControlPlane controller crashing:

E0323 15:13:08.749266       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 340 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x16ac540, 0x2703060)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x82
panic(0x16ac540, 0x2703060)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2
sigs.k8s.io/cluster-api/controlplane/kubeadm/internal.(*Management).GetWorkloadCluster(0xc0003056e0, 0x1b085e0, 0xc0000d2060, 0xc00064c810, 0xb, 0xc00064c7f0, 0xb, 0xc000638000, 0x1, 0x1, ...)
	/workspace/controlplane/kubeadm/internal/cluster.go:73 +0xb6
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).updateStatus(0xc0006bb020, 0x1b085e0, 0xc0000d2060, 0xc00025e280, 0xc000718780, 0x100000000000000, 0x1ad9ce0)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:333 +0x488
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).Reconcile.func1(0xc00046dca8, 0xc00046dc98, 0xc0006bb020, 0x1b085e0, 0xc0000d2060, 0xc00025e280, 0xc000718780, 0x1b185e0, 0xc0003af7a0, 0xc0002f4870)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:175 +0xea
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).Reconcile(0xc0006bb020, 0xc000765ed0, 0xb, 0xc000765eb0, 0xe, 0xc000474c00, 0x0, 0x1ac7e20, 0xc0006991e0)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:194 +0x707
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0001e40c0, 0x170d900, 0xc0000c8060, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256 +0x162
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0001e40c0, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232 +0xcb
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0001e40c0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000573af0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000573af0, 0x3b9aca00, 0x0, 0x1, 0xc0006be0c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc000573af0, 0x3b9aca00, 0xc0006be0c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:193 +0x328
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1e0 pc=0x151fea6]

goroutine 340 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x105
panic(0x16ac540, 0x2703060)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2
sigs.k8s.io/cluster-api/controlplane/kubeadm/internal.(*Management).GetWorkloadCluster(0xc0003056e0, 0x1b085e0, 0xc0000d2060, 0xc00064c810, 0xb, 0xc00064c7f0, 0xb, 0xc000638000, 0x1, 0x1, ...)
	/workspace/controlplane/kubeadm/internal/cluster.go:73 +0xb6
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).updateStatus(0xc0006bb020, 0x1b085e0, 0xc0000d2060, 0xc00025e280, 0xc000718780, 0x100000000000000, 0x1ad9ce0)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:333 +0x488
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).Reconcile.func1(0xc00046dca8, 0xc00046dc98, 0xc0006bb020, 0x1b085e0, 0xc0000d2060, 0xc00025e280, 0xc000718780, 0x1b185e0, 0xc0003af7a0, 0xc0002f4870)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:175 +0xea
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).Reconcile(0xc0006bb020, 0xc000765ed0, 0xb, 0xc000765eb0, 0xe, 0xc000474c00, 0x0, 0x1ac7e20, 0xc0006991e0)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:194 +0x707
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0001e40c0, 0x170d900, 0xc0000c8060, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256 +0x162
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0001e40c0, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232 +0xcb
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0001e40c0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000573af0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000573af0, 0x3b9aca00, 0x0, 0x1, 0xc0006be0c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc000573af0, 0x3b9aca00, 0xc0006be0c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:193 +0x328

I think this comes from the assignment in GetWorkloadCluster happening before the error is checked:

restConfig, err := remote.RESTConfig(ctx, m.Client, clusterKey)
restConfig.Timeout = 30 * time.Second // panics if restConfig is nil
if err != nil {
	return nil, err
}

@vincepri
Member

vincepri commented Mar 23, 2020

The latter should be fixed in v0.3.3, although it seems the original PR has been closed (#2732 (comment)), so we need a fix for that as well.

@vincepri vincepri added area/clusterctl Issues or PRs related to clusterctl area/control-plane Issues or PRs related to control-plane lifecycle management and removed area/clusterctl Issues or PRs related to clusterctl labels Mar 23, 2020
@vincepri
Member

@MartinForReal can you paste the crash log / panic that you were seeing?

@MartinForReal
Contributor Author

MartinForReal commented Mar 23, 2020

Sure

I0323 17:50:21.654749       1 listener.go:44] controller-runtime/metrics "msg"="metrics server is starting to listen"  "addr"="127.0.0.1:8080"
I0323 17:50:21.654967       1 main.go:139] setup "msg"="starting manager"
I0323 17:50:21.655144       1 leaderelection.go:242] attempting to acquire leader lease  capi-kubeadm-control-plane-system/kubeadm-control-plane-manager-leader-election-capi...
I0323 17:50:21.655216       1 internal.go:356] controller-runtime/manager "msg"="starting metrics server"  "path"="/metrics"
I0323 17:50:39.043164       1 leaderelection.go:252] successfully acquired lease capi-kubeadm-control-plane-system/kubeadm-control-plane-manager-leader-election-capi
I0323 17:50:39.045842       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="kubeadmcontrolplane" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"version":"","infrastructureTemplate":{},"kubeadmConfigSpec":{}},"status":{"initialized":false,"ready":false}}}
I0323 17:50:39.146317       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="kubeadmcontrolplane" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}},"status":{"bootstrapReady":false,"infrastructureReady":false}}}
I0323 17:50:39.246793       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="kubeadmcontrolplane" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0323 17:50:39.347112       1 controller.go:171] controller-runtime/controller "msg"="Starting Controller"  "controller"="kubeadmcontrolplane"
I0323 17:50:39.347143       1 controller.go:190] controller-runtime/controller "msg"="Starting workers"  "controller"="kubeadmcontrolplane" "worker count"=10
I0323 17:50:39.347221       1 kubeadm_control_plane_controller.go:120] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "kubeadmControlPlane"="capi-sample-control-plane" "namespace"="cluster-ops"
E0323 17:50:39.553630       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 296 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x16ac540, 0x2703060)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x82
panic(0x16ac540, 0x2703060)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2
sigs.k8s.io/cluster-api/controlplane/kubeadm/internal.(*Management).GetWorkloadCluster(0xc00021d300, 0x1b085e0, 0xc000058068, 0xc0005e4380, 0xb, 0xc0005e4360, 0xb, 0xc0000ea0c8, 0x1, 0x1, ...)
	/workspace/controlplane/kubeadm/internal/cluster.go:73 +0xb6
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).updateStatus(0xc00064e660, 0x1b085e0, 0xc000058068, 0xc00043f680, 0xc000460a80, 0x100000000000000, 0x1ad9ce0)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:333 +0x488
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).Reconcile.func1(0xc0005f3ca8, 0xc0005f3c98, 0xc00064e660, 0x1b085e0, 0xc000058068, 0xc00043f680, 0xc000460a80, 0x1b185e0, 0xc00060bf50, 0xc00059a140)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:175 +0xea
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).Reconcile(0xc00064e660, 0xc0005ee200, 0xb, 0xc00068c1a0, 0x19, 0xc0003c9c00, 0x0, 0x1ac7e20, 0xc00000c9c0)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:194 +0x707
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000182540, 0x170d900, 0xc000451080, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256 +0x162
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000182540, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232 +0xcb
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc000182540)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00002c6c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00002c6c0, 0x3b9aca00, 0x0, 0x1, 0xc0003109c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc00002c6c0, 0x3b9aca00, 0xc0003109c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:193 +0x328
I0323 17:50:39.553774       1 kubeadm_control_plane_controller.go:120] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "kubeadmControlPlane"="capi-sample-control-plane" "namespace"="cluster-ops"
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1e0 pc=0x151fea6]

goroutine 296 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x105
panic(0x16ac540, 0x2703060)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2
sigs.k8s.io/cluster-api/controlplane/kubeadm/internal.(*Management).GetWorkloadCluster(0xc00021d300, 0x1b085e0, 0xc000058068, 0xc0005e4380, 0xb, 0xc0005e4360, 0xb, 0xc0000ea0c8, 0x1, 0x1, ...)
	/workspace/controlplane/kubeadm/internal/cluster.go:73 +0xb6
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).updateStatus(0xc00064e660, 0x1b085e0, 0xc000058068, 0xc00043f680, 0xc000460a80, 0x100000000000000, 0x1ad9ce0)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:333 +0x488
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).Reconcile.func1(0xc0005f3ca8, 0xc0005f3c98, 0xc00064e660, 0x1b085e0, 0xc000058068, 0xc00043f680, 0xc000460a80, 0x1b185e0, 0xc00060bf50, 0xc00059a140)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:175 +0xea
sigs.k8s.io/cluster-api/controlplane/kubeadm/controllers.(*KubeadmControlPlaneReconciler).Reconcile(0xc00064e660, 0xc0005ee200, 0xb, 0xc00068c1a0, 0x19, 0xc0003c9c00, 0x0, 0x1ac7e20, 0xc00000c9c0)
	/workspace/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go:194 +0x707
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000182540, 0x170d900, 0xc000451080, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256 +0x162
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000182540, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232 +0xcb
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc000182540)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00002c6c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00002c6c0, 0x3b9aca00, 0x0, 0x1, 0xc0003109c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc00002c6c0, 0x3b9aca00, 0xc0003109c0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:193 +0x328

@MartinForReal
Contributor Author

Fixed in #2757.
