Panic while upgrading from 1.4.2 to 1.6.3 #345

Closed
jjsimps opened this issue Oct 13, 2023 · 5 comments
Labels
type/question Type: question about the product


@jjsimps
Contributor

jjsimps commented Oct 13, 2023

Describe the bug (required)

I1013 20:26:21.848083       1 controller.go:118]  "msg"="Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference" "NebulaCluster"={"name":"sm-nebula","namespace":"starwatch-core"} "controller"="nebulacluster" "controllerGroup"="apps.nebula-graph.io" "controllerKind"="NebulaCluster" "name"="sm-nebula" "namespace"="starwatch-core" "reconcileID"="a485d21d-70c3-4726-b590-1eb5b7ce198a"
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x1556228]

goroutine 913 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119 +0x1fa
panic({0x1d5c2a0, 0x3461650})
	runtime/panic.go:884 +0x213
github.com/vesoft-inc/nebula-operator/apis/apps/v1alpha1.generateAgentContainer({_, _}, _)
	github.com/vesoft-inc/nebula-operator/[email protected]/apps/v1alpha1/nebulacluster_common.go:310 +0x1028
github.com/vesoft-inc/nebula-operator/apis/apps/v1alpha1.generateContainers({0x24aa478?, 0xc001355b60}, 0xc0007c2240)
	github.com/vesoft-inc/nebula-operator/[email protected]/apps/v1alpha1/nebulacluster_common.go:544 +0x1533
github.com/vesoft-inc/nebula-operator/apis/apps/v1alpha1.generateStatefulSet({0x24aa478, 0xc001355b60}, 0xc0007c2240)
	github.com/vesoft-inc/nebula-operator/[email protected]/apps/v1alpha1/nebulacluster_common.go:563 +0x19c
github.com/vesoft-inc/nebula-operator/apis/apps/v1alpha1.generateWorkload({0x24aa478, 0xc001355b60}, {{0xc0014497c0, 0x4}, {0xc0014497cc, 0x2}, {0xc0007e4ea0, 0xb}}, 0x1?)
	github.com/vesoft-inc/nebula-operator/[email protected]/apps/v1alpha1/nebulacluster_common.go:686 +0x4be
github.com/vesoft-inc/nebula-operator/apis/apps/v1alpha1.(*metadComponent).GenerateWorkload(0xc000da0930?, {{0xc0014497c0, 0x4}, {0xc0014497cc, 0x2}, {0xc0007e4ea0, 0xb}}, 0xc00036c5e0?)
	github.com/vesoft-inc/nebula-operator/[email protected]/apps/v1alpha1/nebulacluster_metad.go:369 +0x65
github.com/vesoft-inc/nebula-operator/pkg/controller/component.(*metadCluster).syncMetadWorkload(0xc000da0930, 0xc000f11400)
	github.com/vesoft-inc/nebula-operator/pkg/controller/component/metad_cluster.go:100 +0x41b
github.com/vesoft-inc/nebula-operator/pkg/controller/component.(*metadCluster).Reconcile(0x2494368?, 0xc000f11400)
	github.com/vesoft-inc/nebula-operator/pkg/controller/component/metad_cluster.go:65 +0x4b
github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster.(*defaultNebulaClusterControl).updateNebulaCluster(0xc0009b6a00, 0xc000f11400)
	github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster/nebula_cluster_control.go:112 +0x87
github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster.(*defaultNebulaClusterControl).UpdateNebulaCluster(0xc0009b6a00, 0xc000f11400)
	github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster/nebula_cluster_control.go:85 +0x86
github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster.(*ClusterReconciler).syncNebulaCluster(...)
	github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster/nebula_cluster_controller.go:192
github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster.(*ClusterReconciler).Reconcile(0xc000299c20, {0x24943d8, 0xc000167e00}, {{{0xc00036c5e0?, 0x0?}, {0xc00036c5d0?, 0x40de67?}}})
	github.com/vesoft-inc/nebula-operator/pkg/controller/nebulacluster/nebula_cluster_controller.go:169 +0x5a2
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x24943d8?, {0x24943d8?, 0xc000167e00?}, {{{0xc00036c5e0?, 0x1c93b60?}, {0xc00036c5d0?, 0x40f926?}}})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0009b6aa0, {0x2494330, 0xc000baf8b0}, {0x1ddbce0?, 0xc000e1e160?})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323 +0x377
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0009b6aa0, {0x2494330, 0xc000baf8b0})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:231 +0x587

Your Environments (required)

I'm currently running a NebulaGraph v3.4 cluster that was deployed with the 1.4.2 operator. After I upgraded the operator to 1.6.3, the controller pod keeps crashing with this error. Please note that I DID apply the new CRDs.

Expected behavior

No crashes

@jjsimps
Contributor Author

jjsimps commented Oct 13, 2023

Looking into the code, it seems to panic when it tries to access the agent's resource object, which isn't defined in previous versions of the schema. I had to upgrade nebula-cluster to 1.6.3 as well (where the agent portion of the cluster object is added).

@MegaByte875
Contributor

    if nc.IsBREnabled() || nc.IsLogRotateEnabled() {

Did you enable BR or log rotate?

@jjsimps
Contributor Author

jjsimps commented Oct 17, 2023

I think I have log rotate enabled, but I believe the panic comes from accessing the agent's resource object (since that seems to be what's on line 310 of nebulacluster_common.go).

@MegaByte875
Contributor

Yes, spec.Agent is a pointer type; the panic was caused by accessing a nil pointer.
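
To illustrate the failure mode described above, here is a minimal Go sketch of guarding a pointer-typed spec field before dereferencing it. The types `ClusterSpec`, `AgentSpec`, and `Resources`, the helper `agentResources`, and the default values are all hypothetical stand-ins for illustration. They are not the operator's actual definitions or its actual fix.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the operator's types.
// The real NebulaCluster spec is defined differently.
type Resources struct {
	CPU    string
	Memory string
}

type AgentSpec struct {
	Image     string
	Resources *Resources
}

type ClusterSpec struct {
	// Agent is a pointer, so it is nil when the stored object
	// was created before the agent section existed.
	Agent *AgentSpec
}

// agentResources returns the agent's resources, falling back to
// defaults instead of dereferencing a nil Agent or Resources pointer.
func agentResources(spec ClusterSpec) Resources {
	// Assumed defaults, chosen only for this illustration.
	defaults := Resources{CPU: "100m", Memory: "128Mi"}
	if spec.Agent == nil || spec.Agent.Resources == nil {
		return defaults
	}
	return *spec.Agent.Resources
}

func main() {
	// Object created before the agent section was added: Agent is nil.
	old := ClusterSpec{}
	fmt.Println(agentResources(old))

	// Object that sets the agent section explicitly.
	upgraded := ClusterSpec{
		Agent: &AgentSpec{Resources: &Resources{CPU: "200m", Memory: "256Mi"}},
	}
	fmt.Println(agentResources(upgraded))
}
```

Without the nil check, reading `spec.Agent.Resources` on the first object would trigger the same kind of "invalid memory address or nil pointer dereference" panic shown in the trace.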

@QingZ11 QingZ11 added the type/question Type: question about the product label Oct 20, 2023
@MegaByte875
Contributor

#399
