clean up segments with large space amplification ratio generated by logical split #5860

Merged
merged 8 commits into from
Sep 14, 2022

Conversation

lidezhu

@lidezhu lidezhu commented Sep 13, 2022

What problem does this PR solve?

Issue Number: close #5817

Problem Summary: After a logical split, each resulting segment references a DTFile whose range is almost twice as large as the segment's own range. This causes space amplification that cannot be ignored.

What is changed and how it works?

  1. Calculate the ratio of the StableValueSpace's valid_size/valid_bytes to the size/bytes of the underlying DTFile, and trigger a background GC if the ratio is too low.
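A minimal sketch of this check, modeled on the return expression quoted in the review thread. The struct `StableUsage` and the function name `shouldTriggerGC` are hypothetical names used for illustration and are not the PR's actual API. The 0.3 threshold mentioned in the note is only an example value.

```cpp
#include <cstddef>

// Hypothetical container for the four counters the check compares.
struct StableUsage
{
    size_t valid_rows;  // rows that fall inside the segment's own range
    size_t valid_bytes; // bytes that fall inside the segment's own range
    size_t total_rows;  // rows stored in the underlying DTFile(s)
    size_t total_bytes; // bytes stored in the underlying DTFile(s)
};

// Returns true when the valid part of the DTFile drops below
// (1 - invalid_data_ratio_threshold), measured by rows or by bytes.
inline bool shouldTriggerGC(const StableUsage & usage, double invalid_data_ratio_threshold)
{
    const double keep_ratio = 1.0 - invalid_data_ratio_threshold;
    return (usage.valid_rows < usage.total_rows * keep_ratio)
        || (usage.valid_bytes < usage.total_bytes * keep_ratio);
}
```

With an example threshold of 0.3, a segment that covers half of its DTFile (50 of 100 rows) would be selected for GC, while one that covers 80% would not.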

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  1. Enable logical split and load ch-benchmark.
  2. Restart and check whether the store size changes.
  3. Patch in the new binary and check whether the store size changes.
    After the patch, the background GC starts cleaning up the segments generated by logical split, and the store size begins to decrease:
    image
    image

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@ti-chi-bot

ti-chi-bot commented Sep 13, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • breezewish
  • flowbehappy

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 13, 2022
@lidezhu lidezhu force-pushed the merge-delta-for-logical-split branch from dfaf59b to b0a4070 Compare September 13, 2022 09:25
@lidezhu lidezhu force-pushed the merge-delta-for-logical-split branch from b0a4070 to 75f51fa Compare September 13, 2022 09:26

@flowbehappy flowbehappy left a comment

LGTM

@lidezhu

lidezhu commented Sep 13, 2022

/run-all-tests

@sre-bot

sre-bot commented Sep 13, 2022

Coverage for changed files

Filename                           Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
DeltaMergeStore_InternalBg.cpp         319               255    20.06%          12                 4    66.67%         423               241    43.03%         154               111    27.92%
Segment.h                               40                 6    85.00%          25                 5    80.00%          34                 6    82.35%           4                 2    50.00%
tests/gtest_segment.cpp                445               115    74.16%          22                 1    95.45%         463                31    93.30%         130                82    36.92%
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                  804               376    53.23%          59                10    83.05%         920               278    69.78%         288               195    32.29%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18722      8126             56.60%    216615  83475        61.46%

full coverage report (for internal network access only)


LOG_FMT_TRACE(log, "valid_rows [{}], valid_bytes [{}] total_rows [{}] total_bytes [{}]", valid_rows, valid_bytes, total_rows, total_bytes);

return (valid_rows < total_rows * (1 - invalid_data_ratio_threshold)) || (valid_bytes < total_bytes * (1 - invalid_data_ratio_threshold));

@breezewish breezewish Sep 13, 2022

Does this mean that, with logical split enabled, logical split becomes nearly useless (and very similar to physical split), because each segment produced by a logical split usually contains only 50% of the data (which is below the 70% threshold)?

Consider the workload:

  1. Write a lot of data to one segment
  2. Segment is logical split into two
  3. (No further writes to these two segments)

In this PR, the two segments each contain 50% of the data, so they both trigger GC.

However, in this workload GC will not actually reclaim any space, because all data in the underlying DTFile is "useful".

Contributor Author

Yes, but the heavy work is moved to the background GC thread. And I thought this was exactly the solution we talked about previously.


@breezewish breezewish Sep 13, 2022

I'm not sure, but maybe a better way would be to count by DTFile: for example, check whether a DTFile is < 70% utilized. When a DTFile is < 70% utilized, delta-merging all segments that use this DTFile can reclaim 30% or more of the space. Otherwise, delta-merging these segments will not have notable benefits.

However, the newly proposed check could be more complex (it definitely needs to iterate over all segments for multiple rounds), and I'm not sure whether covering this case is useful. @JaySon-Huang @flowbehappy What do you think?
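The per-DTFile idea above could be sketched roughly as follows. This is an illustration only: `SegmentRef`, `underutilizedFiles`, and the plain integer file ids are hypothetical, and the real store would need to gather these numbers from its segments. The sketch assumes C++17.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical record: one segment referencing part of one DTFile.
struct SegmentRef
{
    uint64_t file_id;     // id of the referenced DTFile
    uint64_t valid_bytes; // bytes of that file inside the segment's range
};

// Sums the valid bytes per file over all segments, then returns the ids of
// files whose used fraction is below min_utilization (for example 0.7).
inline std::vector<uint64_t> underutilizedFiles(
    const std::vector<SegmentRef> & refs,
    const std::map<uint64_t, uint64_t> & file_total_bytes,
    double min_utilization)
{
    std::map<uint64_t, uint64_t> used_bytes;
    for (const auto & ref : refs)
        used_bytes[ref.file_id] += ref.valid_bytes;

    std::vector<uint64_t> result;
    for (const auto & [file_id, total] : file_total_bytes)
    {
        const auto it = used_bytes.find(file_id);
        const uint64_t used = (it == used_bytes.end()) ? 0 : it->second;
        if (total > 0 && used < total * min_utilization)
            result.push_back(file_id);
    }
    return result;
}
```

In the workload described earlier, two segments that each reference half of the same file add up to full utilization, so that file would not be reported and no GC would be triggered for it.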


@lidezhu lidezhu Sep 13, 2022

Got it. This is a better, though more complex, solution.


@lidezhu lidezhu Sep 13, 2022

Added a check for whether this segment shares any DTFile with its neighbor segments before checking the invalid data ratio.
So for segments generated by logical split, the cleanup only happens when no neighbor shares the same DTFile.

For a DTFile shared by two segments, this is exactly the behavior we want.

For a DTFile shared by more than two segments:

  1. This is a really rare case, because it is hard to do yet another logical split on a segment that was itself generated by logical split (see Segment::getSplitPointFast).
  2. Even if it happens, once one of the segments is updated and no longer references the shared DTFile, it is beneficial to clean up the remaining segments.

So I think the current solution is good enough for our purpose.
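A rough sketch of the neighbor check described above, assuming segments are kept in range order and each one exposes the ids of the DTFiles it references. `FileIds`, `sharesAnyFile`, and `sharesFileWithNeighbor` are hypothetical names for illustration, not the functions added in this PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical representation: the DTFile ids referenced by one segment.
using FileIds = std::vector<uint64_t>;

// True if the two id lists have at least one id in common.
inline bool sharesAnyFile(const FileIds & a, const FileIds & b)
{
    return std::any_of(a.begin(), a.end(), [&](uint64_t id) {
        return std::find(b.begin(), b.end(), id) != b.end();
    });
}

// True if segment i shares a DTFile with its previous or next segment.
// Per the comment above, the invalid data ratio is only checked when
// this returns false.
inline bool sharesFileWithNeighbor(const std::vector<FileIds> & segments, size_t i)
{
    const bool with_prev = i > 0 && sharesAnyFile(segments[i], segments[i - 1]);
    const bool with_next = i + 1 < segments.size() && sharesAnyFile(segments[i], segments[i + 1]);
    return with_prev || with_next;
}
```

Only the direct neighbors are inspected here, which matches the argument above that a DTFile shared by more than two segments is rare.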


@breezewish breezewish left a comment

@lidezhu

lidezhu commented Sep 13, 2022

/run-all-tests

@lidezhu lidezhu force-pushed the merge-delta-for-logical-split branch 2 times, most recently from 70ba578 to 935e997 Compare September 13, 2022 14:22
@sre-bot

sre-bot commented Sep 13, 2022

Coverage for changed files

Filename                           Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
DeltaMergeStore_InternalBg.cpp         335               259    22.69%          13                 4    69.23%         459               258    43.79%         170               116    31.76%
Segment.h                               40                 5    87.50%          25                 4    84.00%          34                 5    85.29%           4                 2    50.00%
tests/gtest_segment.cpp                447               115    74.27%          22                 1    95.45%         472                31    93.43%         130                82    36.92%
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                  822               379    53.89%          60                 9    85.00%         965               294    69.53%         304               200    34.21%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18723      8126             56.60%    216660  83459        61.48%

full coverage report (for internal network access only)

@lidezhu lidezhu force-pushed the merge-delta-for-logical-split branch 2 times, most recently from 5d408ba to e2b792f Compare September 14, 2022 01:08
@ti-chi-bot ti-chi-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 14, 2022
@lidezhu lidezhu force-pushed the merge-delta-for-logical-split branch from e2b792f to 50fb4f1 Compare September 14, 2022 01:33
@lidezhu lidezhu force-pushed the merge-delta-for-logical-split branch from 50fb4f1 to d3f00d8 Compare September 14, 2022 01:34
@lidezhu

lidezhu commented Sep 14, 2022

/run-all-tests

if (seg)
{
    const auto & dm_files = seg->getStable()->getDMFiles();
    for (const auto & file : dm_files)
Member

BTW I'm curious: when do we currently have multiple DTFiles in the stable?


@lidezhu lidezhu Sep 14, 2022

Actually, there is just one DTFile now. But we keep the interface like this to avoid assuming that there is only one DTFile, so that we can more easily support multiple DTFiles in a stable in the future if needed.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 14, 2022
@sre-bot

sre-bot commented Sep 14, 2022

Coverage for changed files

Filename                           Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
DeltaMergeStore_InternalBg.cpp         343               261    23.91%          13                 4    69.23%         483               262    45.76%         176               118    32.95%
Segment.h                               40                 5    87.50%          25                 4    84.00%          34                 5    85.29%           4                 2    50.00%
StableValueSpace.cpp                   121                21    82.64%          16                 1    93.75%         306                46    84.97%          78                23    70.51%
StableValueSpace.h                      18                 7    61.11%          14                 3    78.57%          35                24    31.43%           4                 4     0.00%
tests/gtest_segment.cpp                452               115    74.56%          22                 1    95.45%         490                31    93.67%         130                82    36.92%
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                  974               409    58.01%          90                13    85.56%        1348               368    72.70%         392               229    41.58%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18724      8127             56.60%    216730  83502        61.47%

full coverage report (for internal network access only)

@lidezhu

lidezhu commented Sep 14, 2022

/merge

@ti-chi-bot

@lidezhu: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot

This pull request has been accepted and is ready to merge.

Commit hash: 308b9c4

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 14, 2022
@lidezhu

lidezhu commented Sep 14, 2022

/run-unit-test

@sre-bot

sre-bot commented Sep 14, 2022

Coverage for changed files

Filename                           Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
DeltaMergeStore_InternalBg.cpp         343               261    23.91%          13                 4    69.23%         483               262    45.76%         176               118    32.95%
Segment.h                               40                 5    87.50%          25                 4    84.00%          34                 5    85.29%           4                 2    50.00%
StableValueSpace.cpp                   121                21    82.64%          16                 1    93.75%         306                46    84.97%          78                23    70.51%
StableValueSpace.h                      18                 7    61.11%          14                 3    78.57%          35                24    31.43%           4                 4     0.00%
tests/gtest_segment.cpp                452               115    74.56%          22                 1    95.45%         490                31    93.67%         130                82    36.92%
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                  974               409    58.01%          90                13    85.56%        1348               368    72.70%         392               229    41.58%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18746      8125             56.66%    217121  83510        61.54%

full coverage report (for internal network access only)

@ti-chi-bot ti-chi-bot merged commit 1fdea3e into pingcap:master Sep 14, 2022
@lidezhu lidezhu deleted the merge-delta-for-logical-split branch September 14, 2022 04:33
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Development

Successfully merging this pull request may close these issues.

Clean up segments referencing a small subset of DTFile pack to reduce space amplification
5 participants