In my case, redis-0 was the current master. When it was taken down, a failover wasn't possible since the sentinels had also just been restarted and were still discovering the current master. This caused about 2-3 minutes of downtime for the check/heal process to get things back in order.
If it were just my changes (like changing resources), I would just work around that and deploy them separately. But in this case, it was a new version of the operator that I don't have any control over. Anyone who did that update would experience 2-3 mins of downtime in every Redis HA cluster they have (and all at the same time).
I'm not sure what the right answer is here. Maybe do the sentinel update first and then wait for all of them to discover the current master and then proceed? It seems like it might be easier though to do the redis upgrade first and when that is completely rolled out, then do the sentinel upgrade. That way you don't have to figure out when all of the sentinels are ready. ¯\_(ツ)_/¯
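A rough sketch of the "wait for all of the sentinels to discover the current master" check, in case it helps the discussion. This is not how the operator works today. The function names and the `expected_sentinels` parameter are made up for illustration, and it assumes each sentinel has already been asked for its view of the master (for example with `SENTINEL get-master-addr-by-name`), with `None` standing in for a sentinel that has not found one yet.

```python
from typing import Optional, Sequence, Tuple

# (host, port) of the master as reported by one sentinel.
Address = Tuple[str, int]


def agreed_master(
    reports: Sequence[Optional[Address]],
    expected_sentinels: int,
) -> Optional[Address]:
    """Return the master address only if every expected sentinel reports the same one.

    `reports` holds one entry per sentinel that answered; None means that
    sentinel is still discovering the master. Both names are hypothetical.
    """
    if len(reports) < expected_sentinels:
        return None  # some sentinels have not answered at all
    if any(report is None for report in reports):
        return None  # at least one sentinel has no master yet
    unique = set(reports)
    if len(unique) != 1:
        return None  # sentinels disagree about who the master is
    return next(iter(unique))


def safe_to_restart_redis(
    reports: Sequence[Optional[Address]],
    expected_sentinels: int,
) -> bool:
    """True once the sentinels could plausibly handle a failover."""
    return agreed_master(reports, expected_sentinels) is not None
```

The rollout would poll this after the sentinel update and only start restarting the redis pods once it returns true. Doing the redis upgrade first avoids needing the check at all, which is why that ordering looks simpler.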
You're right. I upgraded the operator recently too and had the same downtime. Since then, I've been thinking about how to improve the whole process and get to zero downtime.
I'll try to have an approach soon. Until then, if you want to propose something, I'll be happy to discuss any PR and improve this 👍
Expected behaviour
Update of redis-operator that is close to zero-downtime.
Actual behaviour
When I updated my redis-operator to 0.5.1, it did the following:
cc: @jchanam