
create and delete a large number of snapshots leaving stale snapshots in the backend #446

Closed
Madhu-1 opened this issue Jun 26, 2019 · 5 comments · Fixed by #1160
Labels: bug (Something isn't working), component/rbd (Issues related to RBD)

Comments

Madhu-1 (Collaborator) commented Jun 26, 2019:

Describe the bug

Creating and deleting a large number of snapshots leaves stale snapshots in the backend.

Environment details

  • Image/version of Ceph CSI driver: canary
  • helm chart version
  • Kubernetes cluster version: 1.14.2
  • Logs

RBD plugin logs: rbd.log

Snapshotter logs: snap.log

Steps to reproduce

Steps to reproduce the behavior:

  1. Setup details: '...'
  2. Deployment to trigger the issue '....'
  3. See error

Actual results

After creating 50 snapshots and deleting all 50, around 14 stale snapshots remain in the backend.

Expected behavior

Once the Kubernetes snapshots are deleted, there should be no stale snapshots left in the backend.

Madhu-1 added the bug label on Jun 26, 2019
Madhu-1 (Collaborator, Author) commented Jun 26, 2019:

@ShyamsundarR @humblec PTAL

ShyamsundarR (Contributor) commented:

There are a few things happening here, primarily because the requests are timing out.

The snapshotter sidecar does NOT retry taking snapshots; it tries once, and if the call times out, the VolumeSnapshot object gets an error. In the background the snapshot is actually created by the first RPC that requested it, but Kubernetes has no record of it (i.e., the SnapID). Thus, when the snapshot is deleted, Kubernetes only deletes the VolumeSnapshot object and never invokes the DeleteSnapshot RPC against the plugin. This leaks snapshots.
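
For context, a minimal Go sketch (not the actual ceph-csi code) of how an idempotent CSI CreateSnapshot handler keys everything on the request name; the `snapshotStore` helpers are hypothetical stand-ins for the RADOS OMAP bookkeeping and the rbd snapshot call. If the sidecar retried after a timeout, the lookup would resolve to the snapshot created by the first attempt and return its SnapshotId; because it does not retry, the CO never records the ID and never calls DeleteSnapshot.

```go
package rbddriver

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// backendSnapshot and snapshotStore are hypothetical stand-ins for the
// RADOS OMAP bookkeeping and the actual rbd snapshot creation; they are
// not ceph-csi types.
type backendSnapshot struct {
	ID          string
	SourceVolID string
	SizeBytes   int64
	ReadyToUse  bool
}

type snapshotStore interface {
	LookupByName(ctx context.Context, name string) (*backendSnapshot, bool)
	Create(ctx context.Context, name, sourceVolID string) (*backendSnapshot, error)
}

type ControllerServer struct {
	store snapshotStore
}

// CreateSnapshot is keyed by req.Name, so a retried call after a timeout
// would find the snapshot created by the first attempt and return the same
// SnapshotId. The leak described above happens because the sidecar never
// retries, so the CO ends up with no record of the ID.
func (cs *ControllerServer) CreateSnapshot(ctx context.Context, req *csi.CreateSnapshotRequest) (*csi.CreateSnapshotResponse, error) {
	if req.GetName() == "" || req.GetSourceVolumeId() == "" {
		return nil, status.Error(codes.InvalidArgument, "snapshot name and source volume ID are required")
	}

	// Idempotency: a second call with the same name resolves to the
	// snapshot reserved/created by the earlier attempt.
	if snap, found := cs.store.LookupByName(ctx, req.GetName()); found {
		return newCreateSnapshotResponse(snap), nil
	}

	// First attempt: record the name->ID mapping, then take the snapshot.
	snap, err := cs.store.Create(ctx, req.GetName(), req.GetSourceVolumeId())
	if err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	return newCreateSnapshotResponse(snap), nil
}

func newCreateSnapshotResponse(snap *backendSnapshot) *csi.CreateSnapshotResponse {
	return &csi.CreateSnapshotResponse{
		Snapshot: &csi.Snapshot{
			SnapshotId:     snap.ID,
			SourceVolumeId: snap.SourceVolID,
			SizeBytes:      snap.SizeBytes,
			ReadyToUse:     snap.ReadyToUse,
		},
	}
}
```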

The lock fix patch (#443) alleviates the problem, since it speeds up the operations (including snapshots), but there are still corner cases that can leak snapshots.

Similarly, there can be corner cases in PVC->PV creates and deletes that leak images, because Kubernetes never recorded a success from the plugin. I have seen this happen while working on the performance improvements with the locks. It needs more analysis, but when RPC response times are large we may leak an image.

The ReadyToUse flag on snapshots can help here: the CO is supposed to keep retrying once a response is returned with the flag set to false, to ensure the snapshot becomes ready at some point. However, we can only return such a response after creating the RADOS maps for the snapshot, and possibly after taking the snapshot itself, so if those calls take too long we are back to leaking snapshots. The sidecar code also needs to be checked to confirm it retries as the specification expects.
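
As a hedged illustration of that flag (again, not the current ceph-csi flow; the function and parameter names below are made up), the plugin could report the reserved snapshot with ReadyToUse set to false once the RADOS mapping exists, and, as noted above, the CO would be expected to keep repeating the idempotent CreateSnapshot call until the flag turns true:

```go
package rbddriver

import "github.com/container-storage-interface/spec/lib/go/csi"

// notYetReadyResponse builds a CreateSnapshot response for a snapshot whose
// RADOS bookkeeping exists but which has not been fully cut yet. As the
// comment above notes, ReadyToUse=false is the signal for the CO to keep
// checking until the snapshot becomes ready.
func notYetReadyResponse(snapID, sourceVolID string, sizeBytes int64) *csi.CreateSnapshotResponse {
	return &csi.CreateSnapshotResponse{
		Snapshot: &csi.Snapshot{
			SnapshotId:     snapID,
			SourceVolumeId: sourceVolID,
			SizeBytes:      sizeBytes,
			ReadyToUse:     false, // reserved, but not yet usable
		},
	}
}
```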

Looking at the logs you provided, there are discrepancies in the call counts (i.e., Create/Delete/RPC successes, etc.) which explain the leaks, but it also looks to me like you invoked delete before some of the snapshots were created.

In my test, I created 25 snapshots using the attached script, which waits for the ReadyToUse flag to become true (which essentially means the VolumeSnapshotContent object was created, which happens after a successful CreateSnapshot response is received), and then I invoked the delete script.
snap-create-perf.sh.txt
snap-delete-perf.sh.txt

But even with the above, and without the lock fix patch, only 9 snapshots were marked ready and had VolumeSnapshotContent objects created. For the remaining 16 (of the 25 I created), the snapshots were created in the background, but because the calls timed out Kubernetes never sent the delete requests. In other words, we leaked 16 snapshots in the create phase itself; delete had no role to play after that.

I also tried an experiment adding a 65-second sleep to the CreateSnapshot call (the CreateSnapshot timeout in the snapshotter sidecar is 60 seconds, not the 10 seconds its documentation states). As expected, the call timed out on the sidecar and the snapshot was never recorded by Kubernetes, yet the image was still created in the background.
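
A rough reconstruction of that delay experiment, just to show its shape (a hypothetical test shim, not committed code):

```go
package rbddriver

import (
	"context"
	"time"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// snapshotCreator is any controller that can take snapshots.
type snapshotCreator interface {
	CreateSnapshot(ctx context.Context, req *csi.CreateSnapshotRequest) (*csi.CreateSnapshotResponse, error)
}

// delayedCreator sleeps past the sidecar's 60-second CreateSnapshot timeout
// before doing the real work, so the RPC times out on the sidecar while the
// backend snapshot is still created.
type delayedCreator struct {
	inner snapshotCreator
	delay time.Duration // e.g. 65 * time.Second, just past the sidecar timeout
}

func (d *delayedCreator) CreateSnapshot(ctx context.Context, req *csi.CreateSnapshotRequest) (*csi.CreateSnapshotResponse, error) {
	time.Sleep(d.delay) // the sidecar gives up at 60s; the work below still runs
	return d.inner.CreateSnapshot(ctx, req)
}
```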

We need to decide what to do next, and possibly raise this with the CSI/Kubernetes folks to understand the expectations here and how to avoid losing state.

ShyamsundarR (Contributor) commented:

We possibly need the snapshotter sidecar to implement a more robust timeout and retry mechanism, like the provisioner does: https://github.com/kubernetes-csi/external-provisioner#csi-error-and-timeout-handling
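
A hedged sketch of what that could look like (the per-call timeout and backoff values below are placeholders, not the sidecar's real settings): each attempt gets its own timeout, and a timed-out call is retried with exponential backoff instead of being abandoned, so the idempotent plugin call eventually reports the snapshot that was already created in the background.

```go
package retryutil

import (
	"context"
	"time"
)

// callWithRetry runs attempt with its own timeout and, on failure, retries
// with exponential backoff until the parent context is cancelled. Because
// CSI calls such as CreateSnapshot are idempotent, a retry after a timeout
// simply picks up the snapshot created by the earlier attempt instead of
// leaking it.
func callWithRetry(parent context.Context, callTimeout time.Duration, attempt func(ctx context.Context) error) error {
	backoff := time.Second // placeholder initial backoff
	for {
		ctx, cancel := context.WithTimeout(parent, callTimeout)
		err := attempt(ctx)
		cancel()
		if err == nil {
			return nil
		}

		// Wait out the backoff, unless the caller gives up first.
		select {
		case <-parent.Done():
			return parent.Err()
		case <-time.After(backoff):
		}
		if backoff < 5*time.Minute {
			backoff *= 2
		}
	}
}
```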

ShyamsundarR (Contributor) commented:

> We possibly need the snapshotter sidecar to implement a more robust timeout and retry mechanism, like the provisioner does: https://github.com/kubernetes-csi/external-provisioner#csi-error-and-timeout-handling

On further thought, snapshot creation may not be able to retry endlessly (owing to things like freezing/thawing the workloads using the volume before and after the snapshot; IO to the volume cannot be held back for long durations). It may hence require some other form of fix in the Kubernetes snapshotter. I will start a discussion there.

Madhu-1 (Collaborator, Author) commented Jul 1, 2019:

> We possibly need the snapshotter sidecar to implement a more robust timeout and retry mechanism, like the provisioner does: https://github.com/kubernetes-csi/external-provisioner#csi-error-and-timeout-handling
>
> On further thought, snapshot creation may not be able to retry endlessly (owing to things like freezing/thawing the workloads using the volume before and after the snapshot; IO to the volume cannot be held back for long durations). It may hence require some other form of fix in the Kubernetes snapshotter. I will start a discussion there.

@ShyamsundarR can you point me to the snapshotter discussion, if it has already started?
