puller(ticdc): always split update kv entries in sink safe mode (#11224) #11656
This is an automated cherry-pick of #11224
What problem does this PR solve?
Issue Number: close #11231
What is changed and how it works?
After #10919 was introduced, only some update kv entries are split, in the puller module when the changefeed starts, and the sink module no longer splits any update events.
As a result, a duplicate entry error can occur during normal operation and cause the changefeed to restart.
After the restart, the puller can split the conflicting update events and the changefeed continues to run normally. However, some customers may be unhappy with this behavior if their workload contains a lot of conflicting data, which causes the changefeed to restart occasionally. So we need a workaround to avoid the restart.
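For readers unfamiliar with the term, "splitting" an update means replacing it with a delete of the old row followed by an insert of the new row. The following is a minimal, self-contained sketch of that idea only. `Row`, `Event`, and `SplitUpdate` are hypothetical names chosen for illustration and are not the types or functions used in the TiCDC code base.

```go
package main

import "fmt"

// Row is a hypothetical, simplified row: a unique key plus a value.
type Row struct {
	Key   int
	Value string
}

// Event is a hypothetical row change event.
// Old == nil means insert, New == nil means delete, both set means update.
type Event struct {
	Old *Row
	New *Row
}

// IsUpdate reports whether the event carries both an old and a new row.
func (e Event) IsUpdate() bool { return e.Old != nil && e.New != nil }

// SplitUpdate turns an update into a delete of the old row followed by an
// insert of the new row. Non-update events are returned unchanged.
func SplitUpdate(e Event) []Event {
	if !e.IsUpdate() {
		return []Event{e}
	}
	return []Event{{Old: e.Old}, {New: e.New}}
}

func main() {
	// An update that moves a row from unique key 1 to unique key 2.
	upd := Event{Old: &Row{Key: 1, Value: "a"}, New: &Row{Key: 2, Value: "a"}}
	for _, e := range SplitUpdate(upd) {
		switch {
		case e.New == nil:
			fmt.Println("delete key", e.Old.Key)
		case e.Old == nil:
			fmt.Println("insert key", e.New.Key)
		}
	}
}
```

Once an update is expressed as a separate delete and insert, the downstream writer can handle each half on its own, which is what lets the conflicting events described above be applied without a restart.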
This PR introduces a type `PullerSplitUpdateMode` to describe how the puller handles update kv entries. All split logic from #10919 is kept unchanged, and one new behaviour is added: when the MySQL sink is in safe mode, `PullerSplitUpdateMode` is set to `PullerSplitUpdateModeAlways`, which means the puller splits all update kv entries. So customers who don't want the changefeed to restart can set `safe-mode` to true to avoid the restart.
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note