Currently when you update the RedisFailover settings (either sentinels or the main Redis nodes), the operator performs a rolling update. Each pod is updated and then Kubernetes waits for the readiness probe to pass before moving on.
The problem with this is that the readiness probe doesn't indicate that the node has joined the cluster and is actually an active node.
I propose the following changes in the statefulset or deployment YAML unless you think there's a better way to solve this:
This makes sure that only one node is down at a time during a rolling update, and that Kubernetes waits 120 seconds after the readiness probe passes before moving on to the next pod. There's no guarantee that the node will actually have joined the cluster by then, but it's better than having no delay at all.
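The snippet itself is not shown here, so the following is only a sketch of settings that would produce the behaviour described, assuming the standard Kubernetes Deployment fields `minReadySeconds` and `strategy.rollingUpdate.maxUnavailable`. It is a fragment of a `spec`, not a full manifest, and it is not necessarily what the operator generates.

```yaml
# Fragment of a Deployment spec (assumed for illustration; other fields omitted).
spec:
  # A new pod must stay Ready for 120 seconds before it counts as available
  # and the rollout moves on to the next pod.
  minReadySeconds: 120
  strategy:
    type: RollingUpdate
    rollingUpdate:
      # At most one pod may be unavailable at any point during the rollout.
      maxUnavailable: 1
```

For a StatefulSet the equivalent knobs live under `spec.minReadySeconds` and `spec.updateStrategy.rollingUpdate`; availability of those fields depends on the Kubernetes version, so check your cluster before relying on them.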
I think that could be an option, but it would make the startup of a redis-failover much slower.
Another option could be to configure the readiness/liveness probes so that a Sentinel is only marked as running once it is no longer checking itself. Again, this would make startup slower (but safer).
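A probe along those lines might look like the sketch below. It assumes that "not checking itself" means the Sentinel's monitored master address is no longer the loopback address, that the master is named `mymaster`, and that Sentinel listens on port 26379. All three are assumptions for illustration, not confirmed operator behaviour.

```yaml
# Fragment of a Sentinel container spec (hypothetical values throughout).
readinessProbe:
  exec:
    command:
      - sh
      - -c
      - |
        # Ask Sentinel which master address it currently monitors;
        # the first line of the reply is the IP.
        addr="$(redis-cli -p 26379 sentinel get-master-addr-by-name mymaster | head -n 1)"
        # Ready only if an address came back and it is not the loopback placeholder.
        [ -n "$addr" ] && [ "$addr" != "127.0.0.1" ]
  initialDelaySeconds: 5
  periodSeconds: 5
```

The trade-off is the one mentioned above: pods take longer to become Ready, but a rolling update cannot advance past a Sentinel that has not yet been pointed at the real master.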
Which do you think is better?
I'll try to run some tests with this. If you want to try it too and send a PR, it would be highly appreciated.
What do you think?
cc: @jchanam