
snapshotter panic in volumegroupsnapshot processing #1123

Open
jmccormick2001 opened this issue Jul 30, 2024 · 2 comments · May be fixed by #1152
Assignees
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@jmccormick2001

What happened:
I hit this panic in the 8.0.1 snapshotter while developing our CSI driver's volumegroupsnapshot logic. Even though my driver code is still problematic and incomplete, I would expect the snapshotter to report errors rather than panic.

I0730 20:20:28.055041 1 snapshot_controller.go:595] Removed protection finalizer from volume snapshot content snapcontent-13f1a3900670d351195c9d814444c5957f3cc787f79e36f934eef7f669bff83b-2024-07-30-7.52.43
I0730 20:20:28.055064 1 util.go:264] storeObjectUpdate updating content "snapcontent-13f1a3900670d351195c9d814444c5957f3cc787f79e36f934eef7f669bff83b-2024-07-30-7.52.43" with version 2385585
E0730 20:20:28.055275 1 groupsnapshot_helper.go:157] could not sync group snapshot content "groupsnapcontent-b09c477c-0e10-4900-888a-e5d676fbe236": failed to delete group snapshot "groupsnapcontent-b09c477c-0e10-4900-888a-e5d676fbe236", err: cannot delete group snapshot content groupsnapcontent-b09c477c-0e10-4900-888a-e5d676fbe236. No snapshots found in the group snapshot
I0730 20:20:28.055297 1 groupsnapshot_helper.go:79] Failed to sync group snapshot content "groupsnapcontent-b09c477c-0e10-4900-888a-e5d676fbe236", will retry again: failed to delete group snapshot "groupsnapcontent-b09c477c-0e10-4900-888a-e5d676fbe236", err: cannot delete group snapshot content groupsnapcontent-b09c477c-0e10-4900-888a-e5d676fbe236. No snapshots found in the group snapshot
I0730 20:20:28.055328 1 event.go:377] Event(v1.ObjectReference{Kind:"VolumeGroupSnapshotContent", Namespace:"", Name:"groupsnapcontent-b09c477c-0e10-4900-888a-e5d676fbe236", UID:"392530a9-7938-4e96-aecd-7ad7c7c93452", APIVersion:"groupsnapshot.storage.k8s.io/v1alpha1", ResourceVersion:"2304318", FieldPath:""}): type: 'Warning' reason: 'GroupSnapshotDeleteError' Failed to delete group snapshot
E0730 20:20:28.056504 1 groupsnapshot_helper.go:157] could not sync group snapshot content "groupsnapcontent-cf45513b-9cf3-4c36-9566-b2c15889decc": failed to delete group snapshot "groupsnapcontent-cf45513b-9cf3-4c36-9566-b2c15889decc", err: cannot delete group snapshot content groupsnapcontent-cf45513b-9cf3-4c36-9566-b2c15889decc. No snapshots found in the group snapshot
I0730 20:20:28.056521 1 groupsnapshot_helper.go:79] Failed to sync group snapshot content "groupsnapcontent-cf45513b-9cf3-4c36-9566-b2c15889decc", will retry again: failed to delete group snapshot "groupsnapcontent-cf45513b-9cf3-4c36-9566-b2c15889decc", err: cannot delete group snapshot content groupsnapcontent-cf45513b-9cf3-4c36-9566-b2c15889decc. No snapshots found in the group snapshot
I0730 20:20:28.056545 1 event.go:377] Event(v1.ObjectReference{Kind:"VolumeGroupSnapshotContent", Namespace:"", Name:"groupsnapcontent-cf45513b-9cf3-4c36-9566-b2c15889decc", UID:"494c793f-7d00-4b4c-acb4-b917716d5570", APIVersion:"groupsnapshot.storage.k8s.io/v1alpha1", ResourceVersion:"2314362", FieldPath:""}): type: 'Warning' reason: 'GroupSnapshotDeleteError' Failed to delete group snapshot
I0730 20:20:28.057393 1 snapshot_controller_base.go:206] enqueued "snapcontent-13f1a3900670d351195c9d814444c5957f3cc787f79e36f934eef7f669bff83b-2024-07-30-7.52.43" for sync
I0730 20:20:28.057414 1 snapshot_controller_base.go:247] syncContentByKey[snapcontent-13f1a3900670d351195c9d814444c5957f3cc787f79e36f934eef7f669bff83b-2024-07-30-7.52.43]
I0730 20:20:28.057428 1 snapshot_controller_base.go:367] content "snapcontent-13f1a3900670d351195c9d814444c5957f3cc787f79e36f934eef7f669bff83b-2024-07-30-7.52.43" deleted
E0730 20:20:28.059942 1 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 53 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x19f7de0, 0x2dbdf00})
/workspace/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:75 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc0004d20f0?})
/workspace/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:49 +0x6b
panic({0x19f7de0?, 0x2dbdf00?})
/go/pkg/csiprow.XXXXhhNCeB/go-1.22.3/src/runtime/panic.go:770 +0x132
github.com/kubernetes-csi/external-snapshotter/v8/pkg/sidecar-controller.(*csiSnapshotSideCarController).deleteCSIGroupSnapshotOperation(0xc000180a20, 0xc00040e000)
/workspace/pkg/sidecar-controller/groupsnapshot_helper.go:256 +0x4c8
github.com/kubernetes-csi/external-snapshotter/v8/pkg/sidecar-controller.(*csiSnapshotSideCarController).syncGroupSnapshotContent(0xc000180a20, 0xc00040e000)
/workspace/pkg/sidecar-controller/groupsnapshot_helper.go:184 +0x47d
github.com/kubernetes-csi/external-snapshotter/v8/pkg/sidecar-controller.(*csiSnapshotSideCarController).updateGroupSnapshotContentInInformerCache(0xc000180a20, 0xc00040e000)
/workspace/pkg/sidecar-controller/groupsnapshot_helper.go:150 +0x12d
github.com/kubernetes-csi/external-snapshotter/v8/pkg/sidecar-controller.(*csiSnapshotSideCarController).syncGroupSnapshotContentByKey(0xc000180a20, {0xc000059980, 0x35})
/workspace/pkg/sidecar-controller/groupsnapshot_helper.go:102 +0x707
github.com/kubernetes-csi/external-snapshotter/v8/pkg/sidecar-controller.(*csiSnapshotSideCarController).groupSnapshotContentWorker(0xc000180a20)
/workspace/pkg/sidecar-controller/groupsnapshot_helper.go:75 +0xf3
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:226 +0x33
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00080a340, {0x1f58d40, 0xc0003e5dd0}, 0x1, 0xc0000b77a0)
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:227 +0xaf
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00080a340, 0x0, 0x0, 0x1, 0xc0000b77a0)
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:204 +0x7f
k8s.io/apimachinery/pkg/util/wait.Until(...)
/workspace/vendor/k8s.io/apimachinery/pkg/util/wait/backoff.go:161
created by github.com/kubernetes-csi/external-snapshotter/v8/pkg/sidecar-controller.(*csiSnapshotSideCarController).Run in goroutine 76
/workspace/pkg/sidecar-controller/snapshot_controller_base.go:187 +0x3d0
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x17de408]

snapshotter sidecar panic

What you expected to happen:
The snapshotter should just surface errors (and let the controller retry) instead of panicking; a minimal sketch of the kind of guard I mean is below.
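To illustrate, here is a minimal sketch of the nil guard I'd expect, using stand-in types rather than the actual code at groupsnapshot_helper.go:256 (the real types and the eventual fix may look different, see #1152):

```go
package main

import (
	"errors"
	"fmt"
)

// Illustrative stand-ins for the real VolumeGroupSnapshotContent API types;
// the actual snapshotter structs differ. This only shows the guard pattern.
type groupSnapshotContentStatus struct {
	VolumeGroupSnapshotHandle *string
}

type groupSnapshotContent struct {
	Name   string
	Status *groupSnapshotContentStatus
}

// deleteGroupSnapshot returns an error instead of dereferencing a nil status,
// which is the behavior this report asks for in place of the panic.
func deleteGroupSnapshot(content *groupSnapshotContent) error {
	if content == nil {
		return errors.New("cannot delete group snapshot content: content is nil")
	}
	if content.Status == nil || content.Status.VolumeGroupSnapshotHandle == nil {
		return fmt.Errorf("cannot delete group snapshot content %q: no group snapshot handle recorded", content.Name)
	}
	// ... proceed with the CSI DeleteVolumeGroupSnapshot call using
	// *content.Status.VolumeGroupSnapshotHandle ...
	return nil
}

func main() {
	// A content object whose status was never populated, as in this report.
	err := deleteGroupSnapshot(&groupSnapshotContent{Name: "groupsnapcontent-b09c477c"})
	fmt.Println(err)
}
```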

How to reproduce it:
Any of the following should reproduce it:

  • construct a CreateVolumeGroupSnapshotResponse with invalid snapshot entries (a sketch of this case follows the list),
  • remove the snapshots on the storage device out of band, or
  • leave resources not cleaned up properly because of earlier errors.
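As a hedged sketch of the first case, assume a driver whose CreateVolumeGroupSnapshot handler returns a response like the one below; the csi types are from the CSI spec's group snapshot (alpha) API, while the values and the empty Snapshots list are just illustrative of an incomplete driver:

```go
package main

import (
	"fmt"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

func main() {
	// A buggy/incomplete response a driver might return from
	// CreateVolumeGroupSnapshot: a group snapshot ID is reported but the
	// per-volume Snapshots list is left empty. With content in this state,
	// the sidecar logs "No snapshots found in the group snapshot" on delete
	// and, per this report, later panics instead of only returning errors.
	resp := &csi.CreateVolumeGroupSnapshotResponse{
		GroupSnapshot: &csi.VolumeGroupSnapshot{
			GroupSnapshotId: "group-snap-123",
			Snapshots:       nil, // invalid: no member snapshots
			ReadyToUse:      true,
		},
	}
	fmt.Println(resp)
}
```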

Anything else we need to know?:

Environment:

  • Driver version: 8.0.1
  • Kubernetes version (use kubectl version): v1.30.1+k3s1
  • OS (e.g. from /etc/os-release): ubuntu
  • Kernel (e.g. uname -a):
  • Install tools: k3s
  • Others:
@manishym
Contributor

manishym commented Sep 4, 2024

/assign

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Dec 3, 2024