Redis/Sentinel Update Ordering #89

rhefner1 · 2018-09-06T13:30:39Z

Expected behaviour

Update of redis-operator that is close to zero-downtime.

Actual behaviour

When I updated my redis-operator to 0.5.1, it did the following:

1. Made changes to both the sentinel deployment and the Redis statefulset.
2. Both began a rolling update (and the +60 second changes from [DEVOPS-823] Improve update process #86 worked perfectly).
3. In my case, redis-0 was the current master. When it was taken down, a failover wasn't possible since the sentinels had also just been restarted and were still discovering the current master. This caused about 2-3 minutes of downtime while the check/heal process got things back in order.

If it were just my changes (like changing resources), I would just work around that and deploy them separately. But in this case, it was a new version of the operator that I don't have any control over. Anyone who did that update would experience 2-3 mins of downtime in every Redis HA cluster they have (and all at the same time).

I'm not sure what the right answer is here. Maybe do the sentinel update first and then wait for all of them to discover the current master and then proceed? It seems like it might be easier though to do the redis upgrade first and when that is completely rolled out, then do the sentinel upgrade. That way you don't have to figure out when all of the sentinels are ready. ¯\_(ツ)_/¯
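To make the two options above concrete, here is a minimal sketch in Python. It is illustrative only and not how the operator is implemented: `plan_rollout` and `sentinels_agree_on_master` are hypothetical names, and the list of reported masters is assumed to come from asking each sentinel which master it currently sees.

```python
from typing import Iterable, List, Optional


def plan_rollout(redis_changed: bool, sentinel_changed: bool) -> List[str]:
    """Order the rollouts so Redis goes first and sentinel second.

    Hypothetical helper: the sentinels stay untouched (and able to fail over)
    while the Redis pods, including the current master, are restarted.
    """
    steps = []
    if redis_changed:
        steps.append("redis")
    if sentinel_changed:
        steps.append("sentinel")
    return steps


def sentinels_agree_on_master(
    reported_masters: Iterable[Optional[str]], quorum: int
) -> bool:
    """Return True when at least `quorum` sentinels report the same master.

    `reported_masters` holds one entry per sentinel: the master address it
    currently sees, or None if it has not discovered one yet (assumed input).
    """
    counts = {}
    for addr in reported_masters:
        if addr:
            counts[addr] = counts.get(addr, 0) + 1
    return any(n >= quorum for n in counts.values())
```

With the first helper, the sentinel rollout would only start once the Redis rollout has completed. The second helper is the kind of gate that would be needed for the "sentinels first" option, before any Redis pod is taken down.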

cc: @jchanam

jchanam · 2018-09-11T09:51:33Z

Hi @rhefner1,

You're right. I upgraded the operator recently too and had the same downtime. Since then, I've been thinking about how to improve the whole process and get to zero downtime.

I'll try to have an approach soon. Until then, if you want to propose something, I'll be happy to discuss any PR and improve this 👍

ese · 2019-12-11T23:42:26Z

The update policy has been updated in the latest release. Check it out and feel free to reopen if the problem persists.

jchanam added the enhancement label Sep 11, 2018

jchanam self-assigned this Sep 11, 2018

rhefner1 mentioned this issue Oct 16, 2018

Redis-operator per namespace or one for whole cluster? #78

Closed

rhefner1 mentioned this issue Feb 22, 2019

How to set client-output-buffer-limit? #118

Closed

ese closed this as completed Dec 11, 2019

mariusstaicu mentioned this issue May 3, 2022

Redis update process #401

Closed
