[BUG] PVC is mistakenly deleted when the controller reads stale `deletionTimestamp` information #316

srteam2020 · 2021-04-14T20:13:40Z

Bug Report

We find that in a HA k8s cluster, casskop could mistakenly delete the PVC of the cassandra pod after casskop experiencing a restart. After diagnosis and inspection, we find it is caused by potential staleness in some of the apiserver.

More concretely, if casskop receives a stale update event from an apiserver saying "CassandraCluster has non-nil deletion timestamp", the controller will delete the PVCs for each DataCenter and Rack belong to current CassandraCluster by calling DeletePVCs(ReconcileCassandraCluster, dcName, rackName). Inside DeletePVCs, PVCs to be deleted are fetched with Cluster, DataCenter, and Rack name. Note that the stale event actually comes from the deletion of a previous CassandraCluster sharing the same name (but with a different UID) as the currently running one, so the PVC of the existing CassandraCluster will be listed and deleted mistakenly.

One potential approach to fix this is to label each PVC the UID of the CassandraCluster when creating them, and list the PVC using the UID of the CassandraCluster in ListPVC to ensure that we always delete the right PVC.

What did you do?
We list the concrete steps we took that led to this issue:

Create a CassandraCluster cc with DeletePVC enabled. PVC will be created in the cluster.
Delete cc. Apiserver1 will send the update events with a non-nil deletion timestamp to the controller and the controller will trigger DeletePVCs() to delete related PVC for each rack. Meanwhile, apiserver2 is partitioned so its watch cache stops at the moment that cc is tagged with a deletion timestamp.
Create the CassandraCluster with the same name cc again. Now the CassandraCluster gets back with a different UID, so as its PVC. However, apiserver2 still holds the stale view that cc has a non-nil deletion timestamp and is about to be deleted.
The controller restarts after a node failure and talks to the stale apiserver2. Reading the stale update events from apiserver2 that cc has a deletion timestamp, the controller lists all the PVC belonging to the currently running cc (as mentioned above) and deletes them all.

What did you expect to see?
The controller should not try to delete the PVC even when reading stale deletionTimestamp.

What did you see instead? Under which circumstances?
The controllers deletes the PVC after the restart as mentioned in the above steps.

Environment

casskop version:
f87c8e0 (master branch)
Kubernetes version information:
1.18.9
Cassandra version:
3.11

Possible Solution
We are willing to help fix this bug by issuing a PR.

As mentioned above, the bug can be avoided by labeling each PVC with the UID of the CassandraCluster and listing PVC with UID. Each CassandraCluster will always have a different UID even with the same name. So in this case the PVCs belonging to the newly created CassandraCluster will not be deleted by the stale events of the old one.

The text was updated successfully, but these errors were encountered:

cscetbon · 2021-04-27T02:14:42Z

Hey @srteam2020 really interesting finding. I've never been able to experience such a behavior but I don't think I've used the HA feature which is part of 1.19 IIRC. If you can provide a PR it'd be great ! Let me know if some help is needed.

srteam2020 · 2021-04-28T20:35:47Z

Hey @cscetbon
Thanks for your confirmation. We will send a PR for this issue soon.

Label the cassandracluster uid to the PVC and checks the uid before deleting PVC

This reverts commit 2b265f0.

srteam2020 mentioned this issue May 1, 2021

Fix #316 by labeling the cluster uid to each PVC #322

Merged

cscetbon closed this as completed in #322 May 7, 2021

cscetbon pushed a commit that referenced this issue May 7, 2021

Fix #316 by labeling the cluster uid to each PVC (#322)

2b265f0

Label the cassandracluster uid to the PVC and checks the uid before deleting PVC

cscetbon added a commit that referenced this issue Jul 16, 2021

Revert "Fix #316 by labeling the cluster uid to each PVC (#322)"

fbe27f4

This reverts commit 2b265f0.

cscetbon added a commit that referenced this issue Jul 16, 2021

Revert "Fix #316 by labeling the cluster uid to each PVC (#322)"

8b63b06

This reverts commit 2b265f0.

cscetbon added a commit that referenced this issue Jul 16, 2021

Revert "Fix #316 by labeling the cluster uid to each PVC (#322)" (#344)

2934e20

cscetbon added a commit that referenced this issue Jul 16, 2021

Revert "Fix #316 by labeling the cluster uid to each PVC (#322)" (#343)

cf38285

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] PVC is mistakenly deleted when the controller reads stale `deletionTimestamp` information #316

[BUG] PVC is mistakenly deleted when the controller reads stale `deletionTimestamp` information #316

srteam2020 commented Apr 14, 2021 •

edited

Loading

cscetbon commented Apr 27, 2021 •

edited

Loading

srteam2020 commented Apr 28, 2021

[BUG] PVC is mistakenly deleted when the controller reads stale deletionTimestamp information #316

[BUG] PVC is mistakenly deleted when the controller reads stale deletionTimestamp information #316

Comments

srteam2020 commented Apr 14, 2021 • edited Loading

Bug Report

cscetbon commented Apr 27, 2021 • edited Loading

srteam2020 commented Apr 28, 2021

[BUG] PVC is mistakenly deleted when the controller reads stale `deletionTimestamp` information #316

[BUG] PVC is mistakenly deleted when the controller reads stale `deletionTimestamp` information #316

srteam2020 commented Apr 14, 2021 •

edited

Loading

cscetbon commented Apr 27, 2021 •

edited

Loading