Speed up the DDL execution speed of TiDB backup data restore #27036

Closed
3 of 4 tasks
IANTHEREAL opened this issue Aug 9, 2021 · 2 comments · Fixed by #33026
Labels
type/feature-request Categorizes issue or PR as related to a new feature.

Comments

@IANTHEREAL
Contributor

IANTHEREAL commented Aug 9, 2021

Feature Request

Is your feature request related to a problem? Please describe:

Describe the feature you'd like:

The cluster has 6 TiB of data, 30k tables, and 11 TiKV nodes. When I use BR to back up and restore the cluster, the restore is particularly slow. After investigation, BR can only create about 2 tables per second: the whole restore takes nearly 4h, and almost all of that time is spent creating tables (30,000 tables at ~2 tables/s is roughly 15,000 s, about 4.2 h). The execution speed of DDL is clearly the bottleneck in this scenario.

Describe alternatives you've considered:

To speed up the restore, I hope TiDB can speed up DDL execution so that, in this scenario, BR can create tables at around 200 tables/s.
For compatibility, Binlog/CDC must still be able to replicate the table schemas of these tables; the restored data itself does not need to be replicated for now.

@IANTHEREAL added the type/feature-request label Aug 9, 2021
@YuJuncen
Contributor

YuJuncen commented Sep 15, 2021

The bottleneck of creating tables is waiting for the schema version change.
Each time we create a table, we have to wait at least CheckVersFirstWaitTime.

tidb/ddl/util/syncer.go, lines 362 to 367 at e262e59:

func (s *schemaVersionSyncer) OwnerCheckAllVersions(ctx context.Context, latestVer int64) error {
startTime := time.Now()
time.Sleep(CheckVersFirstWaitTime)
notMatchVerCnt := 0
intervalCnt := int(time.Second / checkVersInterval)
updatedMap := make(map[string]struct{})
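
As a rough back-of-envelope sketch (not from this issue; it assumes CheckVersFirstWaitTime keeps its 50ms default from ddl/util/syncer.go), the fixed per-table wait alone already puts a hard floor on the restore time for 30k tables:

// Rough lower-bound estimate of the time spent only in the fixed sleep.
// Assumption: CheckVersFirstWaitTime is 50ms; the real restore additionally
// pays for running the DDL job and polling the other nodes' versions.
package main

import (
	"fmt"
	"time"
)

func main() {
	const checkVersFirstWaitTime = 50 * time.Millisecond // assumed default
	const tableCount = 30_000

	minWait := time.Duration(tableCount) * checkVersFirstWaitTime
	fmt.Printf("minimum time spent sleeping for schema sync: %v\n", minWait) // prints 25m0s
}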

Currently, BR uses an internal interface named CreateTableWithInfo to create tables. It creates a table and waits for the schema change one table at a time, omitting the sync of the DDL job between BR and the leader, so the procedure of creating one table looks roughly like this:

for _, t := range tables {
  RunInTxn(func(txn) {
    m := meta.New(txn)
    schemaVersion := m.CreateTable(t)
    m.UpdateSchema(schemaVersion)
  })
  waitSchemaToSync() // <- This notifies and then waits until
  // all other TiDB nodes have synced the latest schema version.
}

If possible, we can move that I/O-bound slow operation out of the for loop, like this:

RunInTxn(func(txn) {
  m := meta.New(txn)
  for _, t := range tables {
    schemaVersion := m.CreateTable(t)
    m.UpdateSchema(schemaVersion)
  }
})
waitSchemaToSync() // <- only one round of waiting.

@YuJuncen
Contributor

YuJuncen commented Sep 15, 2021

This needs TiDB to provide a batch version of CreateTableWithInfo, which creates tables in bulk with a single DDL job; it might look like this:

// CreateTablesWithInfo creates many tables from the given table infos.
// If AutoID / AutoRandomID is set in a table info, the corresponding IDs
// of the newly created table will be rebased to it.
// (optional to implement) When tryRetainID is `true`, try to use the original
// table ID filled in the table info and only allocate a new ID on conflict;
// otherwise report an error.
func (d *DDL) CreateTablesWithInfo(ctx sessionctx.Context,
		schema model.CIStr,
		infos []*model.TableInfo,
		onExist OnExist,
		tryRetainID bool) error
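
If such an API existed, a caller-side sketch could look like the following (purely illustrative: the batchTableCreator interface, the batch size, the import paths, and the error handling are my assumptions, not part of the proposal):

// Hypothetical sketch of how BR might drive the proposed batch API so that
// each batch costs one DDL job and therefore one schema-version sync.
package restore

import (
	"github.com/pingcap/parser/model"
	"github.com/pingcap/tidb/ddl"
	"github.com/pingcap/tidb/sessionctx"
)

// batchTableCreator is the subset of the proposed API this sketch relies on.
type batchTableCreator interface {
	CreateTablesWithInfo(ctx sessionctx.Context, schema model.CIStr,
		infos []*model.TableInfo, onExist ddl.OnExist, tryRetainID bool) error
}

func batchCreateTables(ctx sessionctx.Context, d batchTableCreator,
	schema model.CIStr, infos []*model.TableInfo) error {
	const batchSize = 512 // assumed; should be tuned against DDL job size limits

	for start := 0; start < len(infos); start += batchSize {
		end := start + batchSize
		if end > len(infos) {
			end = len(infos)
		}
		// One DDL job (and one round of waiting for the schema version)
		// per batch, instead of one per table.
		if err := d.CreateTablesWithInfo(ctx, schema, infos[start:end],
			ddl.OnExistIgnore, true /* tryRetainID */); err != nil {
			return err
		}
	}
	return nil
}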
