Panic while upgrading from 1.4.2 to 1.6.3 #345

jjsimps · 2023-10-13T21:04:07Z

Describe the bug (required)

I1013 20:26:21.848083       1 controller.go:118]  "msg"="Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference" "NebulaCluster"={"name":"sm-nebula","namespace":"starwatch-core"} "controller"="nebulacluster" "controllerGroup"="apps.nebula-graph.io" "controllerKind"="NebulaCluster" "name"="sm-nebula" "namespace"="starwatch-core" "reconcileID"="a485d21d-70c3-4726-b590-1eb5b7ce198a"
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x1556228]

goroutine 913 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119 +0x1fa
panic({0x1d5c2a0, 0x3461650})
	runtime/panic.go:884 +0x213
github.com/vesoft-inc/nebula-operator/apis/apps/v1alpha1.generateAgentContainer({_, _}, _)
	github.com/vesoft-inc/nebula-operator/[email protected]/apps/v1alpha1/nebulacluster_common.go:310 +0x1028
github.com/vesoft-inc/nebula-operator/apis/apps/v1alpha1.generateContainers({0x24aa478?, 0xc001355b60}, 0xc0007c2240)
	github.com/vesoft-inc/nebula-operator/[email protected]/apps/v1alpha1/nebulacluster_common.go:544 +0x1533
github.com/vesoft-inc/nebula-operator/apis/apps/v1alpha1.generateStatefulSet({0x24aa478, 0xc001355b60}, 0xc0007c2240)
	github.com/vesoft-inc/nebula-operator/[email protected]/apps/v1alpha1/nebulacluster_common.go:563 +0x19c
github.com/vesoft-inc/nebula-operator/apis/apps/v1alpha1.generateWorkload({0x24aa478, 0xc001355b60}, {{0xc0014497c0, 0x4}, {0xc0014497cc, 0x2}, {0xc0007e4ea0, 0xb}}, 0x1?)
	github.com/vesoft-inc/nebula-operator/[email protected]/apps/v1alpha1/nebulacluster_common.go:686 +0x4be
github.com/vesoft-inc/nebula-operator/apis/apps/v1alpha1.(*metadComponent).GenerateWorkload(0xc000da0930?, {{0xc0014497c0, 0x4}, {0xc0014497cc, 0x2}, {0xc0007e4ea0, 0xb}}, 0xc00036c5e0?)
	github.com/vesoft-inc/nebula-operator/[email protected]/apps/v1alpha1/nebulacluster_metad.go:369 +0x65
github.com/vesoft-inc/nebula-operator/pkg/controller/component.(*metadCluster).syncMetadWorkload(0xc000da0930, 0xc000f11400)
	github.com/vesoft-inc/nebula-operator/pkg/controller/component/metad_cluster.go:100 +0x41b
github.com/vesoft-inc/nebula-operator/pkg/controller/component.(*metadCluster).Reconcile(0x2494368?, 0xc000f11400)
	github.com/vesoft-inc/nebula-operator/pkg/controller/component/metad_cluster.go:65 +0x4b
github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster.(*defaultNebulaClusterControl).updateNebulaCluster(0xc0009b6a00, 0xc000f11400)
	github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster/nebula_cluster_control.go:112 +0x87
github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster.(*defaultNebulaClusterControl).UpdateNebulaCluster(0xc0009b6a00, 0xc000f11400)
	github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster/nebula_cluster_control.go:85 +0x86
github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster.(*ClusterReconciler).syncNebulaCluster(...)
	github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster/nebula_cluster_controller.go:192
github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster.(*ClusterReconciler).Reconcile(0xc000299c20, {0x24943d8, 0xc000167e00}, {{{0xc00036c5e0?, 0x0?}, {0xc00036c5d0?, 0x40de67?}}})
	github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster/nebula_cluster_controller.go:169 +0x5a2
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x24943d8?, {0x24943d8?, 0xc000167e00?}, {{{0xc00036c5e0?, 0x1c93b60?}, {0xc00036c5d0?, 0x40f926?}}})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0009b6aa0, {0x2494330, 0xc000baf8b0}, {0x1ddbce0?, 0xc000e1e160?})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323 +0x377
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0009b6aa0, {0x2494330, 0xc000baf8b0})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:231 +0x587

Your Environments (required)

I'm currently running a NebulaGraph v3.4 cluster deployed with the 1.4.2 operator. After I upgraded the operator to 1.6.3, the controller pod keeps crashing with this error. Please note that I DID apply the new CRDs.

Expected behavior

No crashes


jjsimps · 2023-10-13T21:31:58Z

Looking into the code, it seems to panic when it tries to access the agent's resource object, which isn't defined in previous versions of the schema. I had to upgrade nebula-cluster to 1.6.3 as well (where the agent portion of the cluster object is added).

MegaByte875 · 2023-10-15T03:01:58Z

    if nc.IsBREnabled() || nc.IsLogRotateEnabled() {

Did you enable BR or log rotate?

jjsimps · 2023-10-17T19:01:04Z

I think I have log rotate enabled, but I think the panic is related to accessing the agent's resource object, since that seems to be what's on line 310 of nebulacluster_common.go.

MegaByte875 · 2023-10-18T18:26:31Z

Yes, spec.Agent is a pointer type; the panic was caused by accessing a nil pointer.
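
A minimal sketch of this failure mode, assuming simplified stand-in types: `ClusterSpec`, `AgentSpec`, and both helper functions below are hypothetical and are not the operator's actual code. It only illustrates why an object created before the agent section existed can trigger the panic once log rotate is enabled, and how a nil check avoids it.

```go
package main

import "fmt"

// AgentSpec and ClusterSpec are hypothetical stand-ins for illustration,
// not the real nebula-operator API types.
type AgentSpec struct {
	Image string
}

type ClusterSpec struct {
	LogRotateEnabled bool
	// Agent is a pointer, so it is nil when the stored object
	// was created without an agent section.
	Agent *AgentSpec
}

// agentImageUnsafe dereferences Agent unconditionally and panics
// with a nil pointer dereference when Agent is nil.
func agentImageUnsafe(spec ClusterSpec) string {
	return spec.Agent.Image
}

// agentImageSafe checks for nil first and falls back to a default.
func agentImageSafe(spec ClusterSpec, fallback string) string {
	if spec.Agent == nil || spec.Agent.Image == "" {
		return fallback
	}
	return spec.Agent.Image
}

func main() {
	// An "old" object: log rotate is on, but no agent section is set.
	old := ClusterSpec{LogRotateEnabled: true}
	fmt.Println(agentImageSafe(old, "example/agent:latest"))
}
```

With the guard in place, an older object reconciles with a default instead of crashing the controller; whether the operator should default the field or reject the object is a design choice this sketch does not settle.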

MegaByte875 · 2023-11-15T04:47:52Z

#399

QingZ11 added the type/question label Oct 20, 2023

MegaByte875 closed this as completed Nov 15, 2023
