Add missing requeue to Datacenter decommission #322

Merged
merged 4 commits into from
Apr 21, 2022

burmanm
Contributor

@burmanm burmanm commented Apr 19, 2022

What this PR does:
Datacenter decommissioning is missing a requeue for the case where scaling down has finished but the pods haven't been removed yet. Without it, the operator can remove PVCs before Cassandra has shut down properly (it waits 30s after decommission).
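
The decision this PR adds can be pictured with a minimal sketch. Everything in it is hypothetical and for illustration only: `dcState`, `action`, and `nextStep` are not the operator's actual types or functions, and the 5-second delay simply mirrors the `5` passed to `RequeueSoon` in the diff, assuming that argument is in seconds.

```go
package main

import (
	"fmt"
	"time"
)

// action is a hypothetical stand-in for the operator's reconcile result.
type action int

const (
	continueDecommission action = iota // decommission is still being handled elsewhere
	requeue                            // check again later and touch nothing yet
	deleteStorage                      // pods are gone, PVCs can be removed
)

// dcState is a hypothetical snapshot of a datacenter that is being deleted.
type dcState struct {
	scaleDownDone bool // all nodes have been decommissioned
	podsRemaining int  // pods that still exist in the cluster
}

// nextStep decides what the deletion path should do for the given state.
// The returned delay is zero unless the action is requeue.
func nextStep(s dcState) (action, time.Duration) {
	if !s.scaleDownDone {
		return continueDecommission, 0
	}
	if s.podsRemaining > 0 {
		// Decommission succeeded but pods are still shutting down:
		// come back later instead of falling through to PVC removal.
		return requeue, 5 * time.Second
	}
	return deleteStorage, 0
}

func main() {
	states := []dcState{
		{scaleDownDone: false, podsRemaining: 3},
		{scaleDownDone: true, podsRemaining: 1},
		{scaleDownDone: true, podsRemaining: 0},
	}
	for _, s := range states {
		a, delay := nextStep(s)
		fmt.Printf("%+v -> action=%d delay=%s\n", s, a, delay)
	}
}
```

In this sketch the middle state is the one the PR targets: without the `requeue` branch it would fall through to `deleteStorage` while a pod is still running.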

Which issue(s) this PR fixes:
Fixes #323

Checklist

  • Changes manually tested
  • Automated Tests added/updated
  • Documentation added/updated
  • CHANGELOG.md updated (not required for documentation PRs)
  • CLA Signed: DataStax CLA

@burmanm burmanm changed the title Add logging to catch the flakiness of decommission_dc test Add missing requeue to Datacenter decommission Apr 19, 2022
@burmanm burmanm marked this pull request as ready for review April 19, 2022 12:32
@burmanm burmanm requested a review from a team as a code owner April 19, 2022 12:32
@jsanda
Contributor

jsanda commented Apr 20, 2022

@burmanm can you please create an issue for this?

@@ -70,6 +70,8 @@ func (rc *ReconciliationContext) ProcessDeletion() result.ReconcileResult {
// Exiting to let other parts of the process take care of the decommission
return result.Continue()
}
// How could we have pods if we've decommissioned everything?
return result.RequeueSoon(5)
Contributor

Are you adding the requeue here to handle the scenario where len(dcs) == 1?

Contributor Author

Why would we requeue in that case? If len(dcs) == 1, we go and delete the DC.

Contributor

It's not obvious to me why the requeue is needed here. Can you explain?

@burmanm burmanm merged commit addd528 into k8ssandra:master Apr 21, 2022
burmanm added a commit that referenced this pull request May 12, 2022
* Add requeue if we still have pods although decommission has succeeded

* CHANGELOG

(cherry picked from commit addd528)
Development

Successfully merging this pull request may close these issues.

K8SSAND-1465 ⁃ Datacenter decommission is missing a requeue to prevent early deletion
2 participants