Handle snapshot in parallel if there is only one big region in raftstore v2 #8108

Merged: 45 commits merged into pingcap:master on Oct 10, 2023

@CalvinNeo (Member) commented Sep 18, 2023

What problem does this PR solve?

Issue Number: close #8081

Problem Summary:

In this PR, we also reorganized some code in KVStore.cpp and Region.cpp.

What is changed and how it works?

Interface changes:

  1. Add an interface fn_get_config_json to get the final config on the Proxy's side.
  2. Add an interface fn_approx_size to get the approximate size of each cf. It is available only for the tablet SST reader.
  3. Add an interface fn_get_split_keys that tries to split a raft snapshot into splits_count parts, according to the write cf. It may return fewer parts, or even zero, if the snapshot is small. The returned keys are not evenly distributed.

The feature only takes effect in raftstore v2, and only when there are few concurrent prehandling tasks.

The whole process:

  1. SSTReader asks the proxy for the approximate size. If the size meets a threshold, we start parallel prehandling.
  2. SSTFilesToBlockInputStream computes how many splits we want to divide the snapshot into.
  3. SSTReader asks the proxy for all split keys. The whole process falls back to a single thread if the proxy can't return enough keys.
  4. For count split parts, we create another count - 1 threads to handle them in parallel.
  5. If an exception or other failure happens, the subtask throws or returns the error. If one subtask fails, all other prehandling subtasks are aborted.
  6. Otherwise, it returns the ingest files.
  7. The caller thread handles the head (leftmost) split and waits for all other threads to join. It then aggregates the outputs of all subtasks. The outputs include ingested files (on disk) and uncommitted key-value pairs (in memory).
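
The fan-out in steps 4 to 7 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `SplitResult`, `SplitWorker`, and `prehandleInParallel` are hypothetical names, the per-split work is passed in as a callback, and cancellation is modeled as a shared flag that workers are expected to poll.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <exception>
#include <functional>
#include <future>
#include <vector>

// Hypothetical per-split output. The real output also carries uncommitted
// key-value pairs (in memory); here only ingest file ids are modeled.
struct SplitResult
{
    std::vector<int> ingest_files;
};

// Hypothetical worker: handles the split at `index` and should stop early
// once `abort_flag` becomes true. Index 0 stands for the head split.
using SplitWorker = std::function<SplitResult(std::size_t index, const std::atomic<bool> & abort_flag)>;

SplitResult prehandleInParallel(std::size_t split_count, const SplitWorker & worker)
{
    assert(split_count >= 1);
    std::atomic<bool> abort_flag{false};

    // Wrap the worker so that any failure asks all other subtasks to abort.
    auto guarded = [&worker, &abort_flag](std::size_t index) {
        try
        {
            return worker(index, abort_flag);
        }
        catch (...)
        {
            abort_flag.store(true);
            throw;
        }
    };

    // Create count - 1 extra tasks for the non-head splits.
    std::vector<std::future<SplitResult>> others;
    for (std::size_t i = 1; i < split_count; ++i)
        others.push_back(std::async(std::launch::async, guarded, i));

    // The caller thread handles the head split itself.
    SplitResult total;
    std::exception_ptr first_error;
    try
    {
        total = guarded(0);
    }
    catch (...)
    {
        first_error = std::current_exception();
    }

    // Join every subtask, then aggregate outputs or rethrow the first error.
    for (auto & task : others)
    {
        try
        {
            SplitResult part = task.get();
            total.ingest_files.insert(total.ingest_files.end(), part.ingest_files.begin(), part.ingest_files.end());
        }
        catch (...)
        {
            if (!first_error)
                first_error = std::current_exception();
        }
    }
    if (first_error)
        std::rethrow_exception(first_error);
    return total;
}
```

In this sketch every subtask is joined before the error is rethrown, so no worker outlives the shared abort flag. Whether the actual implementation waits in the same way is not stated in the description.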

How to split tasks by split keys?
Given a split_key[i], the subtask that runs with split_id=i handles the range (pk(split_key[i]), pk(split_key[i+1])]. This means it skips all versions of the pk of split_key[i] and accepts all versions of the pk of split_key[i+1].

The order

region start -- head split (will also do final agg) -- split key 0 -- split 0 -- split key 1 -- split 1 -- ... -- split key n -- split n -- region end
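
To make the half-open ranges concrete, here is a small sketch that maps a primary key to the split that owns it. It is only an illustration: `ownerSplit` and `kHeadSplit` are hypothetical names, keys are modeled as plain strings that are already decoded to their pk, and the region's own start and end bounds are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sentinel for the head (leftmost) split. The PR refers to
// DM::SSTScanSoftLimit::HEAD_SPLIT, whose actual value is not shown here.
constexpr std::int64_t kHeadSplit = -1;

// Returns the id of the split that owns `pk`, given sorted split keys.
// Split i owns (split_keys[i], split_keys[i + 1]]; the head split owns
// everything up to and including split_keys[0]; the last split owns
// everything after the last split key.
std::int64_t ownerSplit(const std::vector<std::string> & split_keys, const std::string & pk)
{
    assert(std::is_sorted(split_keys.begin(), split_keys.end()));
    // Find the first split key >= pk. A pk equal to a split key belongs to
    // the split on its left, because ranges are closed on the right.
    auto it = std::lower_bound(split_keys.begin(), split_keys.end(), pk);
    auto j = static_cast<std::int64_t>(it - split_keys.begin());
    return j == 0 ? kHeadSplit : j - 1;
}
```

With split keys `{"c", "f", "k"}`, a pk of `"f"` falls to split 0 because that split's range is closed on the right, while `"g"` falls to split 1.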

Performance

item                        master       this pr
splits                      1 (always)   3
lines                       11095216     10474970
interval between two ddls   2min         1min20s

item                        master       this pr
splits                      1 (always)   4
lines                       -            20856771
interval between two ddls   -            1min21s


Most of the time is spent on PD scheduling and DDL. If we count only the prehandle time, we have:

[Table of screenshots: prehandle cost for parallel = 4 and parallel = 1]

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
    Use sysbench to generate data for 300s, then add a TiFlash replica.
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Signed-off-by: CalvinNeo <[email protected]>
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 18, 2023
@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 18, 2023
z
Signed-off-by: CalvinNeo <[email protected]>
@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 19, 2023
@ti-chi-bot ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 20, 2023
@CalvinNeo CalvinNeo changed the title Handle snapshot in parallel if there is only one big region in raftstore v2 [dnm]Handle snapshot in parallel if there is only one big region in raftstore v2 Sep 20, 2023
Signed-off-by: CalvinNeo <[email protected]>
z
Signed-off-by: CalvinNeo <[email protected]>
@CalvinNeo CalvinNeo changed the title [dnm]Handle snapshot in parallel if there is only one big region in raftstore v2 Handle snapshot in parallel if there is only one big region in raftstore v2 Oct 8, 2023
@CalvinNeo CalvinNeo changed the title Handle snapshot in parallel if there is only one big region in raftstore v2 [dnm]Handle snapshot in parallel if there is only one big region in raftstore v2 Oct 8, 2023
@CalvinNeo
Member Author

/hold address todo(split), merge proxy's pr, performance test

@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 8, 2023

void SSTFilesToBlockInputStream::readPrefix()
{
// We have to initialize sst readers at an earlier stage,
Member Author

that is before readPrefix

@@ -100,7 +103,8 @@ void SSTFilesToBlockInputStream::readPrefix()
 make_inner_func,
 ssts_write,
 log,
-region->getRange());
+region->getRange(),
+soft_limit.has_value() ? soft_limit.value().split_id : DM::SSTScanSoftLimit::HEAD_SPLIT);
Member Author

If soft_limit is not set, then this is the only split, which is the legacy case.

Signed-off-by: CalvinNeo <[email protected]>
@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Oct 10, 2023
Contributor

@JaySon-Huang JaySon-Huang left a comment

lgtm

@ti-chi-bot ti-chi-bot bot added the lgtm label Oct 10, 2023
@ti-chi-bot
Contributor

ti-chi-bot bot commented Oct 10, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JaySon-Huang, JinheLin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [JaySon-Huang,JinheLin]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot removed the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Oct 10, 2023
@ti-chi-bot
Contributor

ti-chi-bot bot commented Oct 10, 2023

[LGTM Timeline notifier]

Timeline:

  • 2023-10-10 06:37:52.683059455 +0000 UTC m=+1120670.270169586: ☑️ agreed by JinheLin.
  • 2023-10-10 06:42:20.073018665 +0000 UTC m=+1120937.660128808: ☑️ agreed by JaySon-Huang.

Signed-off-by: CalvinNeo <[email protected]>
@CalvinNeo CalvinNeo changed the title [dnm]Handle snapshot in parallel if there is only one big region in raftstore v2 Handle snapshot in parallel if there is only one big region in raftstore v2 Oct 10, 2023
@CalvinNeo
Member Author

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 10, 2023
Signed-off-by: CalvinNeo <[email protected]>
@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 10, 2023
Signed-off-by: CalvinNeo <[email protected]>
@ti-chi-bot ti-chi-bot bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Oct 10, 2023
f
Signed-off-by: CalvinNeo <[email protected]>
@CalvinNeo
Member Author

/run-all-tests

@CalvinNeo
Member Author

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 10, 2023
@ti-chi-bot ti-chi-bot bot merged commit f497795 into pingcap:master Oct 10, 2023
Labels
approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Parallel prehandle snapshot to speed up catch up with TiKV large region
3 participants