Create health condition for degraded status / quorum issues #360
dcPatch := client.MergeFrom(rc.Datacenter.DeepCopy())

if rc.isClusterDegraded() || !rc.isClusterHealthy() {
	updated = rc.setCondition(
If the cluster is degraded or unhealthy, should we just requeue?
We already do that elsewhere; this change only adds the status condition (which is what I understood the DBPE-2283 request to be). If we requeued here, we could never recover from the wrong status.
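To illustrate the point in the reply above, here is a minimal sketch of a health step that only records the condition and reports whether the status changed, leaving the requeue decision to other code. Everything in it is a simplified stand-in, not the operator's actual code: the `Status` and `Condition` types, the `updateHealth` helper, and the assumption that a degraded or unhealthy cluster maps to `Healthy=False` are all hypothetical.

```go
package main

import "fmt"

// ConditionStatus mirrors the True/False strings used by Kubernetes conditions.
type ConditionStatus string

const (
	ConditionTrue  ConditionStatus = "True"
	ConditionFalse ConditionStatus = "False"
)

// Condition is a simplified, hypothetical stand-in for a datacenter condition.
type Condition struct {
	Type   string
	Status ConditionStatus
}

// Status holds the conditions reported on the (hypothetical) resource.
type Status struct {
	Conditions []Condition
}

// setCondition stores c and reports whether the status actually changed.
func (s *Status) setCondition(c Condition) bool {
	for i := range s.Conditions {
		if s.Conditions[i].Type == c.Type {
			if s.Conditions[i].Status == c.Status {
				return false
			}
			s.Conditions[i].Status = c.Status
			return true
		}
	}
	s.Conditions = append(s.Conditions, c)
	return true
}

// updateHealth records the current observation and never asks for a requeue.
// Because it runs on every pass, a later healthy observation can flip the
// condition back to True (assumption: degraded or unhealthy means False).
func updateHealth(s *Status, degraded, healthy bool) bool {
	status := ConditionTrue
	if degraded || !healthy {
		status = ConditionFalse
	}
	return s.setCondition(Condition{Type: "Healthy", Status: status})
}

func main() {
	s := &Status{}
	fmt.Println(updateHealth(s, true, true), s.Conditions)  // degraded pass
	fmt.Println(updateHealth(s, false, true), s.Conditions) // recovered pass
}
```

If this step returned early with a requeue whenever the cluster looked unhealthy, the second call would never be reached through this path, which is the recovery problem the reply describes.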
6e73b7f to 7e1425a
@@ -328,6 +328,7 @@ const (
	DatacenterRollingRestart DatacenterConditionType = "RollingRestart"
	DatacenterValid DatacenterConditionType = "Valid"
	DatacenterDecommission DatacenterConditionType = "Decommission"
	DatacenterHealthy DatacenterConditionType = "Healthy"
Can you add some comments explaining this?
* Add DatacenterHealth condition
* Modify the DatacenterHealth status update to occur on the starting nodes part
* Fix rebase
* Add description to DatacenterHealthy
* Add lint to the Makefile, remove unused isDegraded

(cherry picked from commit b498b7e)
What this PR does:
If a CassandraDatacenter is undergoing an operation that degrades the cluster, or the QUORUM health check fails, update a Condition on the Status to indicate this. This makes the user aware that certain other operations will not succeed.
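As a rough sketch of how a consumer might use the new condition before starting another operation: the `DatacenterHealthy` constant and `DatacenterConditionType` name come from the diff in this PR, while the `DatacenterCondition` struct shape and the `safeToOperate` helper are hypothetical and only meant to illustrate the idea.

```go
package main

import "fmt"

// DatacenterConditionType matches the type name used in the diff.
type DatacenterConditionType string

// DatacenterHealthy is the condition type added by this PR.
const DatacenterHealthy DatacenterConditionType = "Healthy"

// DatacenterCondition is a simplified, hypothetical condition entry.
type DatacenterCondition struct {
	Type   DatacenterConditionType
	Status string // "True", "False" or "Unknown"
}

// safeToOperate reports whether the Healthy condition is present and True.
// Assumption: a missing condition is treated as "not known to be healthy".
func safeToOperate(conds []DatacenterCondition) bool {
	for _, c := range conds {
		if c.Type == DatacenterHealthy {
			return c.Status == "True"
		}
	}
	return false
}

func main() {
	conds := []DatacenterCondition{{Type: DatacenterHealthy, Status: "False"}}
	if !safeToOperate(conds) {
		fmt.Println("datacenter is degraded or failing quorum; postpone the operation")
	}
}
```

Treating an absent condition as unhealthy is a conservative choice made for this sketch; a caller that must work against older operator versions might prefer to treat it as unknown instead.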
Which issue(s) this PR fixes:
Fixes #376
Checklist