snapshot (ticdc): reduce list tables time consumption (#11095) #11125

ti-chi-bot · 2024-05-17T02:21:20Z

This is an automated cherry-pick of #11095

What problem does this PR solve?

Issue Number: close #11124

ref #11109

What is changed and how it works?

1. Use [ListSimpleTables](pingcap/tidb@master/pkg/meta/meta.go#L1026) to retrieve TableNameInfo. This only includes the name and ID of a table, making it smaller than TableInfo.
2. Apply a filter to the retrieved TableNameInfos to locate the tables of interest.
3. Use [GetTable](pingcap/tidb@master/pkg/meta/meta.go#L1219) to acquire the schema of the selected tables.

This approach can reduce time costs by minimizing the amount of data that needs to be loaded.

Check List

Tests

Unit test
Covered by existing unit tests.
Manual test (add detailed scripts or steps below)

Test Environment

1 TiDB cluster (4000 tables in database `test`), 1 TiCDC.

Test Result

Create 100 changefeeds using the configuration provided below:

[filter]
rules = ['test.*100*','test.*101*','test.*102*','test.*103*']

Each changefeed will replicate 56 tables.

Before this PR, when the CDC server was restarted, the lag for changefeeds increased to approximately 1.5 minutes.

It took around 2.5 seconds to initialize the schema snapshot for each changefeed, as shown in the log below:

[2024/05/14 18:25:55.952 +08:00] [INFO] [snapshot.go:219] ["schema snapshot created"] [changefeed=default/test-74] [currentTs=449755804227862532] [cost=2.523837041]

After implementing this PR, the lag for changefeeds increased to about 50 seconds when the CDC server was restarted.

It now takes approximately 1.4 seconds to initialize the schema snapshot for each changefeed, as shown in the log below:

[2024/05/14 18:20:53.578 +08:00] [INFO] [snapshot.go:233] ["schema snapshot created"] [changefeed=default/test-15] [currentTs=449755753175580674] [cost=1.418892458]

However, if a changefeed replicates all 4000 tables in the database, this PR may be about 0.5 seconds slower than the version without it, because it needs to load the raw table schemas twice. This can be addressed by solution 2, as elaborated in #11109, which I will implement later.

Questions

Will it cause performance regression or break compatibility?

Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Reduce the time consumption of changefeed initialization.

Signed-off-by: ti-chi-bot <[email protected]>

Signed-off-by: dongmen <[email protected]>

ti-chi-bot · 2024-05-20T09:16:59Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: asddongmen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [asddongmen]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

asddongmen · 2024-05-21T01:25:32Z

/retest

This is an automated cherry-pick of pingcap#11095

75f9778

Signed-off-by: ti-chi-bot <[email protected]>

ti-chi-bot added the lgtm, release-note, size/M, and type/cherry-pick-for-release-7.5 labels May 17, 2024

ti-chi-bot mentioned this pull request May 17, 2024

snapshot (ticdc): reduce list tables time consumption #11095

Merged

ti-chi-bot assigned asddongmen May 17, 2024

ti-chi-bot bot added the do-not-merge/cherry-pick-not-approved label May 17, 2024

ti-chi-bot added the cherry-pick-approved label May 20, 2024

ti-chi-bot bot removed the do-not-merge/cherry-pick-not-approved label May 20, 2024

resolve conflict

55ccbf7

Signed-off-by: dongmen <[email protected]>

asddongmen approved these changes May 20, 2024

ti-chi-bot bot added the approved label May 20, 2024

ti-chi-bot bot merged commit f337609 into pingcap:release-7.5 May 21, 2024
13 checks passed

3AceShowHand mentioned this pull request May 24, 2024

Speed up create table when the number of tables is relatively large, and reduce memory usage pingcap/tidb#49370

Closed
