ddl: Do not physical drop table after tiflash replica is set to 0 #9440
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: gengliqi, Lloyd-Pottiger
In response to a cherrypick label: new pull request created to branch |
) (#9441) close #9438 ddl: Do not physical drop table after tiflash replica is set to 0 To avoid a potential data loss issue when altering tiflash replica Co-authored-by: JaySon-Huang <[email protected]>
What problem does this PR solve?
Issue Number: close #9438
Problem Summary:
In v8.1.0 and v8.1.1, if the tiflash replica num is set to 0, applyDropTable(database_id, table_id, "SetTiFlashReplica-0") is executed and adds a tombstone_ts to the IStorage instance.
https://github.com/pingcap/tiflash/blob/v8.1.1/dbms/src/TiDB/Schema/SchemaBuilder.cpp#L392-L407
If all the regions are removed from the tiflash instance and the tombstone_ts exceeds the gc_safepoint, then an InterpreterDropQuery is generated to physically drop the IStorage instance.
https://github.com/pingcap/tiflash/blob/v8.1.1/dbms/src/TiDB/Schema/SchemaSyncService.cpp#L304-L354
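The two conditions above can be sketched as a small predicate. This is a minimal illustration, not the TiFlash code: TableState, markTombstone and canPhysicallyDrop are hypothetical names, and the comparison assumes that "exceeds the gc_safepoint" means the safepoint has advanced past the tombstone_ts.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for an IStorage instance; not the TiFlash class.
struct TableState
{
    std::optional<uint64_t> tombstone_ts; // set when the table is logically dropped
    int region_peers = 0;                 // region peers still hosted on this instance
};

// Sketch of the effect attributed to applyDropTable(..., "SetTiFlashReplica-0"):
// the table is only marked, nothing is physically removed yet.
inline void markTombstone(TableState & table, uint64_t now_ts)
{
    table.tombstone_ts = now_ts;
}

// Sketch of the GC eligibility check: both conditions must hold.
// Assumption: the safepoint must have advanced past tombstone_ts.
inline bool canPhysicallyDrop(const TableState & table, uint64_t gc_safepoint)
{
    assert(table.region_peers >= 0);
    if (!table.tombstone_ts.has_value())
        return false; // never marked as dropped
    if (*table.tombstone_ts >= gc_safepoint)
        return false; // safepoint has not passed the tombstone yet
    return table.region_peers == 0; // no region peer may remain
}
```

Note that the predicate only reads a snapshot of the table state; nothing in it prevents the state from changing before the drop is actually carried out.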
However, there is a chance of data loss due to a concurrency issue:
1. In SchemaSyncService::gcImpl, a table is judged as both "tombstone_ts exceeds the gc_safepoint" and "no region peer exists", so an InterpreterDropQuery is generated.
2. Before the InterpreterDropQuery gets executed, a region peer is added to the tiflash instance again.
3. The InterpreterDropQuery gets executed, and all the data in the StorageDeltaMerge gets physically removed. But the region still exists in the raft layer, so queries after that will meet data loss.
What is changed and how it works?
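Going by the PR title, the table is no longer physically dropped after the tiflash replica is set to 0. The toy model below replays the interleaving from the problem summary in a fixed order and compares the two behaviours. It is a sketch under assumptions, not the actual patch: ToyTable, gcDecidesToDrop, physicalDrop and replayLosesData are hypothetical names, the timestamps are arbitrary, and reading the change as "setting the replica to 0 no longer adds a tombstone_ts" is only one interpretation of the title.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical toy model; names are illustrative, not TiFlash classes.
struct ToyTable
{
    std::optional<uint64_t> tombstone_ts;
    int region_peers = 0;
    std::vector<int> rows; // stands in for the data held by StorageDeltaMerge
};

// Step 1: the GC check. It only takes a decision; nothing is dropped yet.
inline bool gcDecidesToDrop(const ToyTable & t, uint64_t gc_safepoint)
{
    return t.tombstone_ts.has_value() && *t.tombstone_ts < gc_safepoint && t.region_peers == 0;
}

// Step 3: the physical drop, executed later on the basis of the earlier decision.
inline void physicalDrop(ToyTable & t)
{
    t.rows.clear();
}

// Replays the three steps in order. `mark_on_replica_zero` selects whether
// setting the replica to 0 adds a tombstone (the v8.1.0/v8.1.1 behaviour
// described above) or not (one reading of the PR title).
// Returns true if data was removed while a region peer is present.
inline bool replayLosesData(bool mark_on_replica_zero)
{
    ToyTable t; // all regions already removed from this instance
    if (mark_on_replica_zero)
        t.tombstone_ts = 10;                  // replica set to 0
    const bool drop = gcDecidesToDrop(t, 20); // step 1: decision taken
    t.region_peers = 1;                       // step 2: a peer is added again
    t.rows = {1, 2, 3};                       //         together with its data
    if (drop)
        physicalDrop(t);                      // step 3: stale decision executed
    return t.region_peers > 0 && t.rows.empty();
}
```

In this model, replayLosesData(true) reports the loss because the drop decision is taken before the peer returns, while replayLosesData(false) never selects the table for a physical drop in the first place.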
Check List
Tests
Side effects
Documentation
Release note