[BUG] Reading stale pod information can lead to undesired PVC deletion #407

Open
srteam2020 opened this issue Mar 20, 2021 · 2 comments

@srteam2020
Contributor

srteam2020 commented Mar 20, 2021

Describe the bug
We found that reading stale pod information from the apiserver can make the controller accidentally delete the PVC used by a running pod. More concretely, if we scale a CassandraDataCenter down and then up again, the controller (after a restart) may read a stale view of the pods and believe that one of the Cassandra pods is about to be deleted (because of the earlier scale-down). The controller will then try to delete the PVC of the "to-be-deleted" pod, even though that PVC is actually in use by the current Cassandra pod. Currently, the controller lists PVCs by datacenter, datacenterUID, and cluster. All three fields are unchanged after scaling down and up, so the controller cannot tell the PVC of the "to-be-deleted" pod apart from the PVC of the current pod. One potential solution is to also use the pod UID when listing PVCs, so that a controller seeing stale pod information never tries to delete the PVC held by the current pod.
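To illustrate why the three existing fields cannot separate the two PVCs, here is a minimal, stdlib-only sketch of label-based selection. It is not the operator's code: the label keys, the `podUID` label, the PVC name `data-ca2`, and all UID values are assumptions made up for the example.

```go
package main

import "fmt"

// PVC is a simplified stand-in for a PersistentVolumeClaim: a name plus labels.
type PVC struct {
	Name   string
	Labels map[string]string
}

// selectPVCs returns the names of the PVCs whose labels contain every
// key/value pair in selector.
func selectPVCs(pvcs []PVC, selector map[string]string) []string {
	matched := []string{}
	for _, pvc := range pvcs {
		ok := true
		for k, v := range selector {
			if pvc.Labels[k] != v {
				ok = false
				break
			}
		}
		if ok {
			matched = append(matched, pvc.Name)
		}
	}
	return matched
}

// livePVCs models the cluster after the scale-up. Only the new ca2's PVC is
// shown; the "podUID" label is hypothetical and stands for the proposed fix.
func livePVCs() []PVC {
	return []PVC{{
		Name: "data-ca2",
		Labels: map[string]string{
			"datacenter":    "cdc",
			"datacenterUID": "dc-uid-1",
			"cluster":       "demo",
			"podUID":        "pod-uid-new",
		},
	}}
}

// selectorFromStalePod builds the selector a controller would derive from the
// stale view of the old ca2, optionally including that pod's UID.
func selectorFromStalePod(withPodUID bool) map[string]string {
	sel := map[string]string{
		"datacenter":    "cdc",
		"datacenterUID": "dc-uid-1",
		"cluster":       "demo",
	}
	if withPodUID {
		sel["podUID"] = "pod-uid-old"
	}
	return sel
}

func main() {
	pvcs := livePVCs()
	fmt.Println("three labels:", selectPVCs(pvcs, selectorFromStalePod(false)))
	fmt.Println("with pod UID:", selectPVCs(pvcs, selectorFromStalePod(true)))
}
```

With only the three shared labels, the selector derived from the old pod matches the new pod's PVC. Once the old pod's UID is part of the selector, nothing matches, so there is nothing to delete.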

This issue is somewhat similar to #402, but the fix in #403 (listing PVCs by CassandraDataCenter UID) only helps to differentiate between CassandraDataCenters that share the same name; it does not help when the controller reads stale pod information. As mentioned above, using the pod UID to differentiate between pods that share the same name when listing PVCs would help here.

To Reproduce
The issue happened in an HA k8s cluster:

  1. Create a CassandraDataCenter cdc with nodes=2 and deletePVCs=true. There are now two Cassandra pods, ca1 and ca2. At this point the controller is talking to apiserver1.
  2. Scale cdc down (by setting nodes=1). ca2 and its PVC are deleted. Meanwhile, apiserver2 gets partitioned, so its watch cache stops at the moment when ca2 is tagged with a deletion timestamp.
  3. Scale cdc up (by setting nodes=2). A new ca2 and its PVC are created. Note that the new ca2 has the same name as the previously deleted one, as does its PVC.
  4. After a crash, the restarted controller talks to the stale apiserver2. From apiserver2's watch cache, the controller finds that ca2 is tagged with a deletion timestamp. The controller cannot differentiate between the previous PVC used by the old ca2 (which is already deleted) and the existing one. As a result, the controller deletes the new ca2's PVC, which is unexpected behavior.
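The four steps above can be replayed in a small in-memory model. This is only a sketch under stated assumptions, not the operator's reconcile loop: the `Pod` and `Cluster` types, the `data-<pod>` PVC naming scheme, the recorded owner UID per PVC, and the UID values are all hypothetical.

```go
package main

import "fmt"

// Pod is a minimal model of what the controller reads from an apiserver.
type Pod struct {
	UID      string
	Deleting bool // true once a deletion timestamp is set
}

// Cluster holds the live state: pods by name, and for each PVC name the UID
// of the pod it was created for (a hypothetical record, e.g. a label).
type Cluster struct {
	Pods map[string]Pod
	PVCs map[string]string
}

// snapshot copies the pod view, standing in for a watch cache that a network
// partition has frozen at one moment.
func snapshot(pods map[string]Pod) map[string]Pod {
	out := make(map[string]Pod, len(pods))
	for name, pod := range pods {
		out[name] = pod
	}
	return out
}

// reconcile deletes the PVC of every pod that the (possibly stale) view shows
// as deleting. With checkUID set, it skips a PVC whose recorded pod UID
// differs from the UID of the observed pod.
func reconcile(view map[string]Pod, live *Cluster, checkUID bool) {
	for name, pod := range view {
		if !pod.Deleting {
			continue
		}
		pvc := "data-" + name // hypothetical PVC naming scheme
		owner, exists := live.PVCs[pvc]
		if !exists {
			continue
		}
		if checkUID && owner != pod.UID {
			continue // the PVC belongs to a different incarnation of the pod
		}
		delete(live.PVCs, pvc)
	}
}

// reproduce replays steps 1-4 and reports whether the new ca2's PVC survives.
func reproduce(checkUID bool) bool {
	// Step 1: nodes=2, two pods and their PVCs exist.
	live := &Cluster{
		Pods: map[string]Pod{"ca1": {UID: "uid-1"}, "ca2": {UID: "uid-2-old"}},
		PVCs: map[string]string{"data-ca1": "uid-1", "data-ca2": "uid-2-old"},
	}
	// Step 2: scale down. ca2 gets a deletion timestamp, and the second
	// apiserver's cache freezes here; then ca2 and its PVC are removed.
	live.Pods["ca2"] = Pod{UID: "uid-2-old", Deleting: true}
	stale := snapshot(live.Pods)
	delete(live.Pods, "ca2")
	delete(live.PVCs, "data-ca2")
	// Step 3: scale up. A new ca2 with a new UID and a same-named PVC appear.
	live.Pods["ca2"] = Pod{UID: "uid-2-new"}
	live.PVCs["data-ca2"] = "uid-2-new"
	// Step 4: the restarted controller reconciles against the stale view.
	reconcile(stale, live, checkUID)
	_, survives := live.PVCs["data-ca2"]
	return survives
}

func main() {
	fmt.Println("name-only check, PVC survives:", reproduce(false))
	fmt.Println("UID check, PVC survives:", reproduce(true))
}
```

In this model the name-only path removes the new ca2's PVC, while the UID comparison leaves it in place because the stale pod's UID no longer matches the UID recorded on the PVC.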

Expected behavior
The controller should not delete the PVC if the pod is not going to be deleted. This can be avoided by using the pod UID to list the PVC, as mentioned above. Each pod has a unique UID even when it shares its name with another pod, so the controller will not mistakenly delete a PVC that belongs to a different pod.

Additional context
We are willing to file a patch for this issue, similar to what we did in #403.

@srteam2020 srteam2020 added the bug Something isn't working label Mar 20, 2021
@smiklosovic
Collaborator

smiklosovic commented Mar 21, 2021

Your patch is always welcome, @srteam2020!

@smiklosovic
Collaborator

Hi @srteam2020, I was wondering whether you would like to say hi to us (we would love to say hi to you!) in a more private manner. Do not hesitate to reach me at stefan dot miklosovic at instaclustr dot com with anything (but not only) Cassandra related. We like to talk to people, and we don't try to push anything on you, don't worry :)

srteam2020 added a commit to srteam2020/cassandra-operator that referenced this issue Mar 24, 2021