Fallback to poller replication lag if heartbeat lag fails #207
Signed-off-by: Eduardo J. Ortega U <[email protected]>
// rt.mode == tabletenv.Poller or fallback after heartbeat error
mysqlLag, mysqlErr = rt.poller.Status()
if fallbackToPoller && mysqlErr != nil {
	return 0, errFallback
@ejortegau what if we just return rt.poller.Status() here? This would give you a more useful error, although it wouldn't be clear there was a "fallback". Wrapping the mysqlErr with errFallback might work too.
I want to explicitly return a different error when fallback fails. I will do some wrapping, though.
Signed-off-by: Eduardo J. Ortega U <[email protected]>
I suggested one typo-fix in the error, but LGTM 👍
Co-authored-by: Tim Vaillancourt <[email protected]> Signed-off-by: Eduardo J. Ortega U. <[email protected]>
if heartbeatLag, heartbeatErr = rt.hr.Status(); heartbeatErr == nil {
	return heartbeatLag, heartbeatErr
}
fallbackToPoller = true
Do we need this boolean? If either of the case statements is met, we would return anyway.
Then the condition below on line 158 can just be if mysqlErr != nil {...}
I think so. The intention is to raise different errors when rt.poller.Status() fails, depending on whether fallback was used or not.
…beat_poller_replication_tracker
* Fallback to poller replication lag if heartbeat lag fails
  Signed-off-by: Eduardo J. Ortega U <[email protected]>
* Try to make CI pipeline happy
  Signed-off-by: Eduardo J. Ortega U <[email protected]>
* Address PR comments
  Signed-off-by: Eduardo J. Ortega U <[email protected]>
* Fix typo
  Co-authored-by: Tim Vaillancourt <[email protected]>
  Signed-off-by: Eduardo J. Ortega U. <[email protected]>
---------
Signed-off-by: Eduardo J. Ortega U <[email protected]>
Signed-off-by: Eduardo J. Ortega U. <[email protected]>
Co-authored-by: Tim Vaillancourt <[email protected]>
* Fallback to poller replication lag if heartbeat lag fails
* Try to make CI pipeline happy
* Address PR comments
* Fix typo
---------
Signed-off-by: Eduardo J. Ortega U <[email protected]>
Signed-off-by: Eduardo J. Ortega U. <[email protected]>
Co-authored-by: Eduardo J. Ortega U <[email protected]>
Description
This PR allows the replication tracker to fall back to the poller replication lag tracker when the heartbeat lag tracker fails. It is meant as a temporary change to allow us to move to heartbeat lag. It is needed because moving a shard from poller to heartbeat lag tracking can lead to replicas being taken out of service. For example:
With the changes in this PR, if there are issues getting the lag from the heartbeat, the replication tracker uses poller lag instead, which is what we currently use.
Once we are fully migrated to using heartbeats, these changes can be reverted.