[Enhancement] New condition to ensure all etcd's join a single cluster #595
Labels
kind/enhancement
Enhancement, improvement, extension
lifecycle/stale
Nobody worked on this for 6 months (will further age)
Enhancement (What you would like to be added):
As of today, all etcd-druid conditions rely on all pods running and the etcd cluster being reachable. We log a successful etcd cluster as long as this is true and all etcd members are running.
It is a rare possibility, but if old PVCs exist, the etcd members may not all join the same cluster and may instead form multiple clusters, all connected to the same service. In this case, etcd-druid sees that all pods are running and will assume a successful cluster.
We need a way for etcd-druid to ensure that all the etcd members join the same cluster and to log the result of this check.
Motivation (Why is this needed?):
This is needed because all pods are reachable via the same service. If there are multiple clusters, data will be split between them, which will lead to data inconsistencies.
Approach/Hint to implement the solution (optional):
My proposal right now would be to add a new condition to the etcd status. We would check all renewed leases and ensure that there is only one leader. The condition is logged so that it can come to an operator's attention and be fixed.
When we introduce member state, this functionality can be moved there.