ddl_puller(ticdc): discard DDLs that irrelevant with changefeed #6446

Merged
merged 37 commits into from
Sep 2, 2022

Conversation

asddongmen
Contributor

@asddongmen asddongmen commented Jul 24, 2022

What problem does this PR solve?

Issue Number: ref #6447 #3607

What is changed and how it works?

After this PR, TiCDC filters out DDLs that are not related to the changefeed in ddl_puller, which reduces the latency and memory consumption of replicating DDLs.

Note that before this PR, TiCDC filtered a rename table DDL by its new table name. After this PR, TiCDC filters a rename table DDL by its old table name, which is more in line with the semantics of the filter rule.

For example:
----------------------------------
filter:  test.t1(white list)
DDL1:  rename test.t1 to test.t11
DDL2:  rename test.t11 to test.t1
----------------------------------
Before this PR:
CDC ignores DDL1 and sinks DDL2.
This makes the upstream and downstream inconsistent.

Result:
1. Upstream has table 't11', but downstream still holds table 't1'.
2. When sinking DDL2, if the downstream does not already have table 't11', CDC may raise an error.
(This changefeed doesn't actually care about 't11'.)
----------------------------------
After this PR:
CDC sinks DDL1 and reports an error for DDL2.

Result:
1. Both upstream and downstream have table 't11'.
2. CDC reports an error for DDL2. More details are in the explanation below.
(This changefeed did not care about table 't11' before. The user needs to confirm the error before DDL2 is sunk.)
------------------------------------

In addition, if the user renames a table whose old name does not match filter.rule to a new name that matches filter.rule, an error is reported directly.

For example, suppose filter.rule = ['test.t1'] and a table named test.t11 exists in the upstream TiDB:

  1. Executing rename table 'test.t11' to 'test.t1' reports an error.
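
The rule for a single rename can be summarized as a three-way decision. The following is a minimal sketch, not the code from this PR: `tableFilter`, `renameAction`, and `decideRename` are hypothetical names, and the filter is assumed to be an exact-match set of `schema.table` names, whereas real filter rules also accept patterns.

```go
package main

import "fmt"

// renameAction is a hypothetical enum for the three outcomes described above.
type renameAction int

const (
	actionReplicate renameAction = iota // old name matches the filter
	actionDiscard                       // neither name matches the filter
	actionError                         // only the new name matches the filter
)

// String makes the outcome readable when printed.
func (a renameAction) String() string {
	switch a {
	case actionReplicate:
		return "replicate"
	case actionDiscard:
		return "discard"
	default:
		return "error"
	}
}

// tableFilter is an assumed stand-in for filter.rule: an exact-match set of
// "schema.table" names. It does not support wildcard patterns.
type tableFilter map[string]bool

// decideRename applies the rule to one "rename oldName to newName" job.
// The old name is checked first, so a table already covered by the filter
// is always replicated, whatever its new name is.
func decideRename(f tableFilter, oldName, newName string) renameAction {
	switch {
	case f[oldName]:
		return actionReplicate
	case f[newName]:
		return actionError
	default:
		return actionDiscard
	}
}

func main() {
	f := tableFilter{"test.t1": true}
	fmt.Println(decideRename(f, "test.t1", "test.t11")) // DDL1: replicate
	fmt.Println(decideRename(f, "test.t11", "test.t1")) // DDL2: error
	fmt.Println(decideRename(f, "test.t5", "test.t55")) // unrelated: discard
}
```

Checking the old name first is what makes DDL1 in the example replicate even though 'test.t11' is not in the filter.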

In addition, if you rename multiple tables in a single rename table DDL, TiCDC applies the rule mentioned above to each subordinate rename table DDL job.

For example, suppose filter.rule = ['test.t1', 'test.t3'] and tables named test.t1 and test.t2 exist in the upstream TiDB:

  1. Executing rename table 'test.t1' to 'test.t11', 'test.t2' to 'test.t22': the rename of 'test.t1' is replicated and 'test.t2' to 'test.t22' is filtered out, since test.t1 matches filter.rule but test.t2 does not.
  2. Executing rename table 'test.t2' to 'test.t3', 'test.t1' to 'test.t11' reports an error, since test.t2 does not match filter.rule but test.t3 does. TiCDC does not allow such behavior.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  1. start cdc server
  2. create a changefeed with filter.rule = ['test1.t1', 'test1.t2', 'test1.t3', 'test1.t4']
  3. create t1, t2, t3, t4, t5, t6 in upstream tidb.
  4. exec : rename table t1 to t11 (replicated)
  5. exec : rename table t11 to t1 (error)
➜  bin git:(filter_ddl) ✗ ./cdc cli changefeed list
[
  {
    "id": "test-1",
    "namespace": "default",
    "summary": {
      "state": "error",
      "tso": 435389327982985217,
      "checkpoint": "2022-08-19 11:07:58.939",
      "error": {
        "addr": "127.0.0.1:8300",
        "code": "CDC:ErrSyncRenameTablesFailed",
        "message": "[CDC:ErrSyncRenameTablesFailed]table's old name is not in filter rule, and its new name in filter rule table id '108', ddl query: [rename table t11 to t1], it's an unexpected behavior, if you want to replicate this table, please add its old name to filter rule."
      }
    }
  }
]
  1. exec: rename table t2 to t22, t3 to t33 (replicated)
  2. exec: rename table t22 to t222, t33 to t333 (discard)
  3. exec: rename table t222 to t2, t333 to t3 (error)
➜  bin git:(filter_ddl) ✗ ./cdc cli changefeed list
[
  {
    "id": "test-1",
    "namespace": "default",
    "summary": {
      "state": "error",
      "tso": 435389397673967617,
      "checkpoint": "2022-08-19 11:12:24.789",
      "error": {
        "addr": "127.0.0.1:8300",
        "code": "CDC:ErrSyncRenameTablesFailed",
        "message": "[CDC:ErrSyncRenameTablesFailed]table's old name is not in filter rule, and its new name in filter rule table id '110', ddl query: [rename table t222 to t2, t333 to t3], it's an unexpected behavior, if you want to replicate this table, please add its old name to filter rule."
      }
    }
  }
]
  1. exec: rename table t4 to t44, t5 to t55
    It will be split into two parts: the rename table t4 to t44 part is replicated and t5 to t55 is ignored.
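
When repeating the manual test, the error state can be checked programmatically instead of by eye. The sketch below decodes JSON shaped like the `cdc cli changefeed list` output shown above and returns the changefeeds in the error state. The struct and function names are hypothetical, the field set is inferred only from that output, and the sample message is trimmed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// The types below mirror only the fields visible in the output above.
type changefeedError struct {
	Addr    string `json:"addr"`
	Code    string `json:"code"`
	Message string `json:"message"`
}

type changefeedSummary struct {
	State      string           `json:"state"`
	TSO        uint64           `json:"tso"`
	Checkpoint string           `json:"checkpoint"`
	Error      *changefeedError `json:"error"`
}

type changefeedInfo struct {
	ID        string            `json:"id"`
	Namespace string            `json:"namespace"`
	Summary   changefeedSummary `json:"summary"`
}

// failedChangefeeds parses the list output and keeps the entries whose
// state is "error" and that carry an error object.
func failedChangefeeds(listOutput []byte) ([]changefeedInfo, error) {
	var all []changefeedInfo
	if err := json.Unmarshal(listOutput, &all); err != nil {
		return nil, err
	}
	var failed []changefeedInfo
	for _, cf := range all {
		if cf.Summary.State == "error" && cf.Summary.Error != nil {
			failed = append(failed, cf)
		}
	}
	return failed, nil
}

func main() {
	// Sample copied from the first output above, with the message trimmed.
	out := []byte(`[{"id":"test-1","namespace":"default","summary":{
		"state":"error","tso":435389327982985217,
		"checkpoint":"2022-08-19 11:07:58.939",
		"error":{"addr":"127.0.0.1:8300",
			"code":"CDC:ErrSyncRenameTablesFailed",
			"message":"[CDC:ErrSyncRenameTablesFailed]table's old name is not in filter rule"}}}]`)

	failed, err := failedChangefeeds(out)
	if err != nil {
		panic(err)
	}
	for _, cf := range failed {
		fmt.Println(cf.ID, cf.Summary.Error.Code)
	}
}
```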

More details of the discussion are in this internal doc: https://pingcap.feishu.cn/wiki/wikcnJ6SaB5cGLeG5OrM4nf6MPg

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Improve replication performance by discarding DDLs that are irrelevant to a changefeed.

@ti-chi-bot
Member

ti-chi-bot commented Jul 24, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • maxshuang
  • zhaoxinyu

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Jul 24, 2022
@ti-chi-bot ti-chi-bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jul 24, 2022
@asddongmen asddongmen changed the title schema(ticdc): ingore ddl by filter [WIP]schema(ticdc): ingore ddl by filter Jul 24, 2022
@ti-chi-bot ti-chi-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 24, 2022
@asddongmen
Contributor Author

/run-integration-tests

@ti-chi-bot ti-chi-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. do-not-merge/needs-linked-issue labels Jul 24, 2022
@asddongmen asddongmen added the type/enhancement The issue or PR belongs to an enhancement. label Jul 24, 2022
@codecov-commenter

codecov-commenter commented Jul 24, 2022

Codecov Report

Merging #6446 (0d5272f) into master (7c5032c) will increase coverage by 0.0058%.
The diff coverage is 51.8737%.

Additional details and impacted files
Flag Coverage Δ
cdc 67.2666% <62.9411%> (-0.1103%) ⬇️
dm 52.0500% <36.0184%> (+0.0514%) ⬆️
engine 62.6212% <64.9269%> (-0.0142%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@@               Coverage Diff                @@
##             master      #6446        +/-   ##
================================================
+ Coverage   59.9224%   59.9283%   +0.0058%     
================================================
  Files           783        789         +6     
  Lines         89519      90121       +602     
================================================
+ Hits          53642      54008       +366     
- Misses        31168      31369       +201     
- Partials       4709       4744        +35     

@ti-chi-bot ti-chi-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jul 26, 2022
@asddongmen
Contributor Author

/run-integration-tests

@asddongmen
Contributor Author

/run-integration-tests

@asddongmen asddongmen changed the title [WIP]schema(ticdc): ingore ddl by filter [WIP]ddl_puller(ticdc): discard DDLs that unrelated with changefeed Jul 26, 2022
@asddongmen
Contributor Author

/run-integration-tests

@asddongmen
Contributor Author

/run-integration-tests

@ti-chi-bot ti-chi-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 26, 2022
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Aug 22, 2022
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 27, 2022
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 1, 2022
@asddongmen
Contributor Author

/run-integration-tests

@asddongmen
Contributor Author

/run-all-tests

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 2, 2022
@asddongmen
Contributor Author

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 0d5272f

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 2, 2022
@ti-chi-bot
Member

@asddongmen: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@asddongmen
Contributor Author

/run-leak-tests

@asddongmen
Contributor Author

/run-leak-tests

@asddongmen
Contributor Author

/run-leak-tests

@ti-chi-bot ti-chi-bot merged commit e2d1f6e into pingcap:master Sep 2, 2022
@asddongmen asddongmen deleted the filter_ddl branch September 2, 2022 06:41
@ti-chi-bot ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Sep 27, 2022
Labels
component/ddl DDL component. component/puller Puller component. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. status/ptal Could you please take a look? type/enhancement The issue or PR belongs to an enhancement.

6 participants