Speed up DDL execution for TiDB backup data restore #27036
The bottleneck of creating tables is waiting for the schema version change (see lines 362 to 367 at commit e262e59).

Currently, BR uses an internal interface named `CreateTableWithInfo`, which creates the tables one by one:

```go
for _, t := range tables {
	RunInTxn(func(txn) {
		m := meta.New(txn)
		schemaVersion := m.CreateTable(t)
		m.UpdateSchema(schemaVersion)
	})
	waitSchemaToSync() // <- This notifies and then waits until all other
	                   // TiDB nodes have synced to the latest schema version.
}
```

If possible, we can move that I/O-bound slow operation out of the for loop, like:

```go
RunInTxn(func(txn) {
	for _, t := range tables {
		m := meta.New(txn)
		schemaVersion := m.CreateTable(t)
		m.UpdateSchema(schemaVersion)
	}
})
waitSchemaToSync() // <- only one round of waiting.
```
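For a rough feel of why this helps, here is a small self-contained Go toy (not TiDB code): `createTable` and `waitSchemaSync` are stand-ins with made-up costs, and the only point is that the expensive cluster-wide wait is paid once per batch instead of once per table:

```go
package main

import (
	"fmt"
	"time"
)

// Stand-ins with made-up costs: creating a table is cheap,
// the cluster-wide schema sync is the expensive part.
func createTable(name string) { time.Sleep(time.Millisecond) }
func waitSchemaSync()         { time.Sleep(200 * time.Millisecond) }

func createOneByOne(tables []string) time.Duration {
	start := time.Now()
	for _, t := range tables {
		createTable(t)
		waitSchemaSync() // one wait per table
	}
	return time.Since(start)
}

func createBatched(tables []string) time.Duration {
	start := time.Now()
	for _, t := range tables {
		createTable(t)
	}
	waitSchemaSync() // one wait for the whole batch
	return time.Since(start)
}

func main() {
	tables := make([]string, 20)
	for i := range tables {
		tables[i] = fmt.Sprintf("t%d", i)
	}
	fmt.Println("one by one:", createOneByOne(tables))
	fmt.Println("batched:   ", createBatched(tables))
}
```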
This needs TiDB to provide a batch version of `CreateTableWithInfo`:

```go
// CreateTablesWithInfo creates many tables from the given table infos.
// If AutoID / AutoRandomID is set in a table info, the corresponding IDs of
// the newly created table will be rebased to it.
// (optional to implement) Try to use the original table ID filled in the
// table info; only on conflict try to allocate a new ID when tryRetainID is
// `true`, otherwise report an error.
func (d *DDL) CreateTablesWithInfo(ctx sessionctx.Context,
	schema model.CIStr,
	infos []*model.TableInfo,
	onExist OnExist,
	tryRetainID bool) error
```
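For illustration only, a rough sketch of how a caller such as BR might use the proposed interface. Here `d`, `ctx`, `dbName`, and `infos` are assumed to be the DDL handle, session context, target database name, and the collected `*model.TableInfo` slice on the restore side, and `OnExistError` is assumed to be the existing on-conflict option; treat this as a sketch rather than the real integration:

```go
// Submit every table of one database as a single batch, so the
// schema-version wait is paid once per batch instead of once per table.
if err := d.CreateTablesWithInfo(ctx, dbName, infos, OnExistError, true /* tryRetainID */); err != nil {
	return err
}
```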
Feature Request
Is your feature request related to a problem? Please describe:
Describe the feature you'd like:
The cluster has 6 TiB of data, 30k tables, and 11 TiKVs. When I use BR to back up and restore the cluster, I find that the speed is particularly slow. After investigation, BR can only create 2 tables per second, the entire restore takes nearly 4h, and the time spent creating tables is close to 4h. It is clear that DDL execution speed is the bottleneck in this scenario.
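For reference, 30,000 tables at 2 tables/s works out to 15,000 s, roughly 4.2 h, which matches the observed restore time and shows that table creation dominates the restore.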
Describe alternatives you've considered:
To speed up the restore, I hope TiDB can speed up the execution of DDL so that, in this scenario, BR can create tables at 200 tables/s.
In terms of compatibility, Binlog/CDC still needs to be able to replicate the schemas of these tables, while the restored data does not need to be replicated for now.