kvclient(ticdc): fix kvclient takes too long time to recover (#3612) #3660
Signed-off-by: ti-chi-bot <[email protected]>
[REVIEW NOTIFICATION] This pull request has not been approved. To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review.
@ti-chi-bot: This cherry pick PR is for a release branch and has not yet been approved by release team. To merge this cherry pick, it must first be approved by the collaborators. AFTER it has been approved by collaborators, please ping the release team in a comment to request a cherry pick review. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/invite
/run-kafka-integration-test
@ti-chi-bot: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This is an automated cherry-pick of #3612
close #3191
close flaky test in kvclient: #2694 #3302 #2349 #2688 #2747
What problem does this PR solve?
What is changed and how it works?
Reason
We used a TiKV node with about 3k region leaders as the comparison baseline. After testing in both the normal case and the abnormal case, we got the following results:
However, PD and TiKV need only about 30s to mark a node as 'disconnect' and elect a new leader, so a reasonable time span for recovery is about:
Result
Check List
Tests
Related changes
Release note