-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
detect binlog purged and report to Drainer #1017
base: master
Are you sure you want to change the base?
Conversation
No release note, Please follow https://github.com/pingcap/community/blob/master/contributors/release-note-checker.md |
No release note, Please follow https://github.com/pingcap/community/blob/master/contributors/release-note-checker.md |
/run-all-tests tidb=release-4.0 tikv=release-4.0 pd=release-4.0 |
/run-all-tests tidb=release-4.0 tikv=release-4.0 pd=release-4.0 |
/run-all-tests tidb=release-4.0 tikv=release-4.0 pd=release-4.0 |
p.logger.Error("some binlogs have been purged in pump") | ||
}) | ||
|
||
time.Sleep(time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not exit directly? since isBinlogPurged
never set back to false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if exit directly, then systemd will restart it again (but that's meaningless). if not exist, log and metrics can still keep working.
for isBinlogPurged
, the user should handle it manually and re-establish the replication.
return | ||
} | ||
|
||
select { | ||
case values <- value: | ||
log.Debug("send value success") | ||
case <-time.After(gcMaxBlockTime): | ||
// do not update `last` anymore. | ||
log.Warn("can not send the binlog for a long time, will try to read again", zap.Duration("duration", gcMaxBlockTime), zap.Int64("current TS", decodeTSKey(iter.Key()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we send error to errs
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
send the binlog for a long time
often caused by a slow drainer. if we send errors, the replication will be broken and try to send again, that's should make it worse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, if drainer encounter the purge error , all pump will reach this warning 🤔️, but both drainer and pump still running. until the user manually fix it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
? not all pump, but only this one. and pump should still be running because may other drainers can still work normally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I know what you mean. any advice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM
@csuzhangxc: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
What problem does this PR solve?
detect binlog purged and report to Drainer.
fix TOOL-2236 (internal only)
What is changed and how it works?
Check List
Tests
Code changes
Side effects
Related changes
Release note