/tmp/csi-mount is sometimes not cleaned up after "Node Service"."should work" test which fails further tests #196
The underlying problem is that once a test has failed, csi-sanity won't always be able to clean up. For example, if the driver leaves a mounted volume behind, then the usual `os.RemoveAll` will fail. So the latest code doesn't even try anything more than `os.Remove`, and if that fails, any following `os.Mkdir` will fail. IMHO the right solution is to run each test with its own mount and staging directory. Do you agree? This is already possible when using the Go API (just provide your own create/delete functions which dynamically allocate temp directories), but not when using the csi-sanity command. Are you using the command? Which version?
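As a rough illustration of the Go-API approach, per-test temp directories could be wired up along these lines. This is only a sketch; the exact `sanity.TestConfig` fields and the `sanity.Test` signature depend on the csi-test version you vendor, and `TestMyDriverSanity` is a made-up test name:

```go
package sanitytest

import (
	"os"
	"testing"

	"github.com/kubernetes-csi/csi-test/v4/pkg/sanity"
)

func TestMyDriverSanity(t *testing.T) {
	config := sanity.NewTestConfig()
	config.Address = "unix:///tmp/csi.sock"

	// Give every test case its own freshly created directories so that a
	// leftover mount from a failed test cannot make the next mkdir fail
	// with "file exists".
	config.CreateTargetDir = func(string) (string, error) {
		return os.MkdirTemp("", "csi-mount")
	}
	config.CreateStagingDir = func(string) (string, error) {
		return os.MkdirTemp("", "csi-staging")
	}

	sanity.Test(t, config)
}
```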
Seems like the problem above is that if unmount fails (which happens because it is the second unmount attempt and the folder is already unmounted) then the directory is not deleted. And the test itself is not failing. I'd suggest attempting to remove the folder even if unmounting fails.
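That cleanup pattern might look roughly like the following hypothetical helper (illustrative only, not csi-sanity's actual code; `cleanupTargetPath` and its arguments are made up for the example):

```go
package cleanup

import (
	"context"
	"log"
	"os"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// cleanupTargetPath tries to unpublish and then always attempts to remove
// the target directory, so a failed unmount cannot leave /tmp/csi-mount
// behind for the next test.
func cleanupTargetPath(ctx context.Context, node csi.NodeClient, volumeID, targetPath string) {
	if _, err := node.NodeUnpublishVolume(ctx, &csi.NodeUnpublishVolumeRequest{
		VolumeId:   volumeID,
		TargetPath: targetPath,
	}); err != nil {
		log.Printf("cleanup: warning: NodeUnpublishVolume: %v", err)
	}
	// Remove the directory even if the unmount above failed; only complain
	// if it still exists and cannot be removed.
	if err := os.Remove(targetPath); err != nil && !os.IsNotExist(err) {
		log.Printf("cleanup: warning: removing %s: %v", targetPath, err)
	}
}
```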
alexanderKhaustov <[email protected]> writes:
> The underlying problem is that once a test has failed, csi-sanity won't always be able to clean up. For example, if the driver leaves a mounted volume behind, then the usual `os.RemoveAll` will fail. So the latest code doesn't even try anything more than `os.Remove` and if that fails, any following `os.Mkdir` will fail.
> Seems like the problem above is that if unmount fails (which happens because it is the second unmount attempt and the folder is already unmounted) then the directory is not deleted. And the test itself is not failing. I'd suggest attempting to remove the folder even if unmounting fails.
Sorry, I don't follow. If unmounting succeeded, why does removing the folder fail? `os.Remove` is called.
Also, you are saying that "unmount fails because it is the second unmount attempt and the folder is already unmounted". This sounds like the driver isn't idempotent? It's not an error to call NodeUnpublishVolume twice.
> Are you using the command? Which version?
> I've removed the test setup but it was a recent master version, the day of the post or the previous one.
Were you using the csi-sanity command?
It seemed that remove wasn't called until after the second unmount attempt, and maybe not at all. Still, your point that the driver seems to behave non-idempotently sounds reasonable. I'll look into it once more. Thanks!
Yes, I've been running the tests via the command.
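For reference, on the idempotency point above: an idempotent NodeUnpublishVolume treats an already-unmounted or missing target path as success and still removes the directory. A minimal sketch, not this driver's actual code, assuming k8s.io/mount-utils and a hypothetical nodeServer type:

```go
package driver

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
	mount "k8s.io/mount-utils"
)

type nodeServer struct{} // hypothetical; stands in for the driver's node service type

// NodeUnpublishVolume must be safe to call repeatedly for the same path.
func (ns *nodeServer) NodeUnpublishVolume(ctx context.Context, req *csi.NodeUnpublishVolumeRequest) (*csi.NodeUnpublishVolumeResponse, error) {
	target := req.GetTargetPath()
	if target == "" {
		return nil, status.Error(codes.InvalidArgument, "target path missing")
	}
	// CleanupMountPoint unmounts the path only if it is actually mounted and
	// then removes the directory; an already-unmounted or missing path is
	// not treated as an error, so a second call simply succeeds.
	if err := mount.CleanupMountPoint(target, mount.New(""), false); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	return &csi.NodeUnpublishVolumeResponse{}, nil
}
```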
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now, please do so. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Here's csi-sanity output:
Node Service
should work
/home/akhaustov/go/src/github.com/kubernetes-csi/csi-test/pkg/sanity/node.go:625
STEP: reusing connection to CSI driver at unix:///tmp/csi.sock
STEP: reusing connection to CSI driver controller at unix:///tmp/csi.sock
STEP: creating mount and staging directories
STEP: creating a single node writer volume
STEP: getting a node id
STEP: controller publishing volume
STEP: node staging volume
STEP: publishing the volume on a node
STEP: cleaning up calling nodeunpublish
STEP: cleaning up calling nodeunstage
STEP: cleaning up calling controllerunpublishing
STEP: cleaning up deleting the volume
cleanup: deleting sanity-node-full-35BAA099-984D9C20 = ef3gu6pu151v67thndsc
cleanup: warning: NodeUnpublishVolume: rpc error: code = Internal desc = Could not unmount "/tmp/csi-mount": Unmount failed: exit status 32
Unmounting arguments: /tmp/csi-mount
Output: umount: /tmp/csi-mount: not mounted
cleanup: warning: ControllerUnpublishVolume: rpc error: code = InvalidArgument desc = Disk unpublish operation failed
• [SLOW TEST:16.717 seconds]
Node Service
/home/akhaustov/go/src/github.com/kubernetes-csi/csi-test/pkg/sanity/tests.go:44
should work
/home/akhaustov/go/src/github.com/kubernetes-csi/csi-test/pkg/sanity/node.go:625
ListSnapshots [Controller Server]
should return appropriate values (no optional values added)
/home/akhaustov/go/src/github.com/kubernetes-csi/csi-test/pkg/sanity/controller.go:1579
STEP: reusing connection to CSI driver at unix:///tmp/csi.sock
STEP: reusing connection to CSI driver controller at unix:///tmp/csi.sock
STEP: creating mount and staging directories
• Failure in Spec Setup (BeforeEach) [0.001 seconds]
ListSnapshots [Controller Server]
/home/akhaustov/go/src/github.com/kubernetes-csi/csi-test/pkg/sanity/tests.go:44
should return appropriate values (no optional values added) [BeforeEach]
/home/akhaustov/go/src/github.com/kubernetes-csi/csi-test/pkg/sanity/controller.go:1579
failed to create target directory
Unexpected error:
<*os.PathError | 0xc0003db170>: {
Op: "mkdir",
Path: "/tmp/csi-mount",
Err: 0x11,
}
mkdir /tmp/csi-mount: file exists
occurred
/home/akhaustov/go/src/github.com/kubernetes-csi/csi-test/pkg/sanity/sanity.go:222
Here's the call order from the driver:
I0426 15:01:54.622765 214412 node.go:73] StageVolume: volume="ef3gu6pu151v67thndsc" operation finished
I0426 15:01:54.625078 214412 node.go:185] PublishVolume(volume_id:"ef3gu6pu151v67thndsc" staging_target_path:"/tmp/csi-staging" target_path:"/tmp/csi-mount/target" volume_capability:<mount:<> access_mode:<mode:SINGLE_NODE_WRITER > > )
I0426 15:01:54.625350 214412 node.go:361] PublishVolume: creating dir /tmp/csi-mount/target
I0426 15:01:54.625459 214412 node.go:366] PublishVolume: mounting /tmp/csi-staging at /tmp/csi-mount/target
I0426 15:01:54.625498 214412 mount_linux.go:138] Mounting cmd (systemd-run) with arguments ([--description=Kubernetes transient mount for /tmp/csi-mount/target --scope -- mount -o bind /tmp/csi-staging /tmp/csi-mount/target])
I0426 15:01:54.636746 214412 mount_linux.go:138] Mounting cmd (systemd-run) with arguments ([--description=Kubernetes transient mount for /tmp/csi-mount/target --scope -- mount -o bind,remount /tmp/csi-staging /tmp/csi-mount/target])
I0426 15:01:54.665978 214412 node.go:236] UnpublishVolume(volume_id:"ef3gu6pu151v67thndsc" target_path:"/tmp/csi-mount/target" )
I0426 15:01:54.666051 214412 node.go:249] UnpublishVolume: unmounting /tmp/csi-mount/target
I0426 15:01:54.666076 214412 mount_linux.go:203] Unmounting /tmp/csi-mount/target
I0426 15:01:54.694340 214412 node.go:139] UnstageVolume(volume_id:"ef3gu6pu151v67thndsc" staging_target_path:"/tmp/csi-staging" )
I0426 15:01:54.695213 214412 node.go:174] UnstageVolume: unmounting /tmp/csi-staging
I0426 15:01:54.695229 214412 mount_linux.go:203] Unmounting /tmp/csi-staging
I0426 15:01:54.716692 214412 controller.go:198] ControllerUnpublishVolume(volume_id:"ef3gu6pu151v67thndsc" node_id:"ef3rphg9mh60js6h4tpt" )
I0426 15:02:00.075137 214412 controller.go:223] ControllerUnpublishVolume: volume ef3gu6pu151v67thndsc detached from node ef3rphg9mh60js6h4tpt
I0426 15:02:00.075743 214412 controller.go:80] DeleteVolume(VolumeId=ef3gu6pu151v67thndsc)
I0426 15:02:03.614579 214412 node.go:236] UnpublishVolume(volume_id:"ef3gu6pu151v67thndsc" target_path:"/tmp/csi-mount" )
I0426 15:02:03.614705 214412 node.go:249] UnpublishVolume: unmounting /tmp/csi-mount
I0426 15:02:03.614753 214412 mount_linux.go:203] Unmounting /tmp/csi-mount
E0426 15:02:03.616796 214412 node.go:252] Could not unmount "/tmp/csi-mount": Unmount failed: exit status 32
Unmounting arguments: /tmp/csi-mount
Output: umount: /tmp/csi-mount: not mounted
I0426 15:02:03.617488 214412 node.go:139] UnstageVolume(volume_id:"ef3gu6pu151v67thndsc" staging_target_path:"/tmp/csi-staging" )
I0426 15:02:03.618271 214412 node.go:166] UnstageVolume: /tmp/csi-staging target not mounted
I0426 15:02:03.618700 214412 controller.go:198] ControllerUnpublishVolume(volume_id:"ef3gu6pu151v67thndsc" node_id:"ef3rphg9mh60js6h4tpt" )
E0426 15:02:03.814387 214412 controller.go:215] Disk unpublish operation failed: request-id = 05f937e6-6e5d-453e-ad2a-e1a49724a569 rpc error: code = InvalidArgument desc = Request validation error: Cannot find disk in instance by specified disk ID.
I0426 15:02:03.815061 214412 controller.go:80] DeleteVolume(VolumeId=ef3gu6pu151v67thndsc)
Note that the failing NodeUnpublishVolume is in fact the second one, after the previous successful one.