This repository has been archived by the owner on Jul 24, 2024. It is now read-only.
BR should tolerate a small amount of tikv crash #980
shuijing198799 changed the title from "br should tolerate a small amount of tikv crash" to "BR should tolerate a small amount of tikv crash" on Apr 6, 2021
Should we treat it as a bug?
overvenus added the type/bug (Something isn't working) label and removed the type/feature-request (New feature or request) label on Apr 12, 2021
Analyze: During the
During step 2, any disconnected store terminates the whole backup procedure, because "failed to connect to store" isn't treated as a retryable error. Solution: in theory, we can ignore all "failed to connect to store" errors during step 2, because they can eventually be retried in step 3.
To be clear, after the above PR, the current implementation tolerates a single TiKV node being down in a cluster with 3 replicas. By design, TiKV can't tolerate 2 TiKV nodes being down for the same region.
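The limit described above follows from majority arithmetic. Here is a small Go sketch, assuming a region stays available only while more than half of its replicas are reachable; `regionAvailable` is a hypothetical helper, not part of BR or TiKV.

```go
package main

import "fmt"

// regionAvailable reports whether a region with the given number of replicas
// still has a majority of them reachable when `down` replicas are offline.
// Hypothetical helper for illustration only.
func regionAvailable(replicas, down int) bool {
	majority := replicas/2 + 1
	return replicas-down >= majority
}

func main() {
	fmt.Println(regionAvailable(3, 1)) // one of three replicas down
	fmt.Println(regionAvailable(3, 2)) // two of three replicas down
}
```

With 3 replicas the majority is 2, so one replica down leaves the region available and two down does not.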
Could you cherry-pick this bug fix to 4.0.x?
@YuJuncen Please add this issue and PR to the v4.0 bug triage doc.
Feature Request
Describe the problem your feature request relates to:
In a cluster with 9 TiKV nodes, when one TiKV node crashes and cannot come back up, BR stops working and reports an error in its log.
Describe the feature you'd like:
When a small number of TiKV nodes have problems but the cluster can still work normally, should BR continue to work instead of failing immediately?