clean up segments with large space amplification ratio generated by logical split #5860
LGTM
/run-all-tests
LOG_FMT_TRACE(log, "valid_rows [{}], valid_bytes [{}] total_rows [{}] total_bytes [{}]", valid_rows, valid_bytes, total_rows, total_bytes);

return (valid_rows < total_rows * (1 - invalid_data_ratio_threshold)) || (valid_bytes < total_bytes * (1 - invalid_data_ratio_threshold));
Does this mean that, with logical split enabled, logical split becomes nearly useless (and very similar to physical split), because each segment produced by a logical split usually contains only 50% of the data (which is < the 70% threshold)?
Consider the workload:
- Write a lot of data to one segment
- The segment is logically split into two
- (No further writes to these two segments)
In this PR, the two segments each contain 50% of the data, so they both trigger GC.
However, in this workload GC will not actually reclaim any more space, because all data in the underlying DTFile is "useful".
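To make the arithmetic behind this concern concrete, here is a minimal sketch of the check quoted above. It assumes invalid_data_ratio_threshold is 0.3, which is inferred from the 70% figure in this comment and not confirmed by the quoted diff. StableStats and exceedsInvalidDataRatio are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the four counters compared in the quoted check.
struct StableStats
{
    std::size_t valid_rows;  // rows inside the segment's own range
    std::size_t valid_bytes; // bytes inside the segment's own range
    std::size_t total_rows;  // rows in the underlying DTFile
    std::size_t total_bytes; // bytes in the underlying DTFile
};

// Same shape as the quoted return expression: true when the valid part
// falls below (1 - threshold) of the total, by rows or by bytes.
bool exceedsInvalidDataRatio(const StableStats & s, double invalid_data_ratio_threshold)
{
    assert(invalid_data_ratio_threshold >= 0.0 && invalid_data_ratio_threshold <= 1.0);
    const double keep_ratio = 1.0 - invalid_data_ratio_threshold;
    return (s.valid_rows < s.total_rows * keep_ratio)
        || (s.valid_bytes < s.total_bytes * keep_ratio);
}
```

With the assumed threshold of 0.3, a segment holding 500 of 1000 rows is below the 700-row cutoff and triggers GC, while one holding 900 of 1000 rows does not. That matches the workload above: both halves of a fresh logical split would be selected.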
Yes, but the heavy work is moved to the background gc thread. And I thought this was exactly the solution that we talked about previously(
I'm not sure, but maybe a better way would be to count by DTFiles, for example by checking whether a DTFile is < 70% utilized. When a DTFile is < 70% utilized, delta-merging all segments that use this DTFile can reclaim 30% of the space. Otherwise, delta-merging these segments will not have notable benefits.
However, the newly proposed check could be more complex (it definitely needs to iterate over all segments for multiple rounds), and I'm not sure whether covering this case is useful. @JaySon-Huang @flowbehappy What do you think?
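A rough sketch of what such a per-DTFile check could look like, only to illustrate the idea and its extra passes. Everything here is hypothetical (SegmentFileRef, findUnderUtilizedFiles, the id and size fields); it is not TiFlash code, and it assumes segment ranges do not overlap so that per-segment bytes can simply be summed.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical record: one segment's reference into one DTFile.
struct SegmentFileRef
{
    std::uint64_t file_id;     // id of the referenced DTFile
    std::uint64_t valid_bytes; // bytes of that file inside the segment's range
};

// Returns the ids of files whose summed valid bytes fall below
// min_utilization (e.g. 0.7) of the file's total size.
std::vector<std::uint64_t> findUnderUtilizedFiles(
    const std::vector<SegmentFileRef> & refs,
    const std::map<std::uint64_t, std::uint64_t> & file_total_bytes,
    double min_utilization)
{
    assert(min_utilization >= 0.0 && min_utilization <= 1.0);

    // Pass 1: accumulate the bytes still referenced in each file.
    std::map<std::uint64_t, std::uint64_t> used_bytes;
    for (const auto & ref : refs)
        used_bytes[ref.file_id] += ref.valid_bytes;

    // Pass 2: keep the files that are under-utilized.
    std::vector<std::uint64_t> result;
    for (const auto & [file_id, total] : file_total_bytes)
    {
        const auto it = used_bytes.find(file_id);
        const std::uint64_t used = (it == used_bytes.end()) ? 0 : it->second;
        if (static_cast<double>(used) < static_cast<double>(total) * min_utilization)
            result.push_back(file_id);
    }
    return result;
}
```

A file shared by two segments that together still cover all of it is not reported, so the workload described earlier would not trigger GC. The cost is one pass over all segments to build the per-file totals, plus another to find the segments that reference each reported file.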
Got it. This is a better and more complex solution.
Added a check for whether this segment shares any DTFile with its neighbor segments before checking the invalid data ratio.
So a segment generated by logical split will only be cleaned up when no neighbor shares the same DTFile.
For a DTFile shared by two segments, this is exactly the behavior we want.
For a DTFile shared by more than two segments:
- this is a really rare case, because it is hard to do yet another logical split on a segment that was itself generated by a logical split (check Segment::getSplitPointFast);
- even if it happens, once one of the segments is updated and no longer refs the shared DTFile, it is beneficial to do the clean work on the remaining segments.
So I think the current solution is good enough for our purpose.
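The order of the two checks can be sketched as follows. This is a simplified illustration under assumptions, not the code in this PR: Seg, sharesFile and shouldCleanUp are hypothetical names, segments are modelled as a vector ordered by range, and only bytes are compared to keep it short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a segment's stable layer.
struct Seg
{
    std::vector<std::uint64_t> file_ids; // ids of the DTFiles it references
    std::uint64_t valid_bytes;           // bytes inside the segment's range
    std::uint64_t total_bytes;           // bytes in the referenced DTFiles
};

// True when the two segments reference at least one common DTFile.
bool sharesFile(const Seg & a, const Seg & b)
{
    return std::any_of(a.file_ids.begin(), a.file_ids.end(), [&](std::uint64_t id) {
        return std::find(b.file_ids.begin(), b.file_ids.end(), id) != b.file_ids.end();
    });
}

// Neighbor check first, invalid data ratio second.
bool shouldCleanUp(const std::vector<Seg> & segs, std::size_t i, double invalid_data_ratio_threshold)
{
    assert(i < segs.size());
    // Skip while the previous or next segment still shares a DTFile.
    if (i > 0 && sharesFile(segs[i], segs[i - 1]))
        return false;
    if (i + 1 < segs.size() && sharesFile(segs[i], segs[i + 1]))
        return false;
    // No neighbor shares a file, so rewriting can actually free space.
    const double keep_ratio = 1.0 - invalid_data_ratio_threshold;
    return static_cast<double>(segs[i].valid_bytes)
        < static_cast<double>(segs[i].total_bytes) * keep_ratio;
}
```

In this sketch, two adjacent segments that each use half of the same file are both skipped, while a segment whose neighbors have moved on to other files and which uses only half of its file is selected.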
Except for the comment in https://github.com/pingcap/tiflash/pull/5860/files#r969439210, the rest LGTM.
/run-all-tests
if (seg)
{
    const auto & dm_files = seg->getStable()->getDMFiles();
    for (const auto & file : dm_files)
BTW, I'm curious: when do we currently have multiple DTFiles in the stable?
Actually, there is just one DTFile now. But we keep the interface like this to avoid assuming that there is only one DTFile, so that we can more easily support multiple DTFiles in a stable in the future if needed.
/merge
This pull request has been accepted and is ready to merge. Commit hash: 308b9c4
/run-unit-test
What problem does this PR solve?
Issue Number: close #5817
Problem Summary: After a logical split, the resulting segments reference a DTFile whose range is almost twice as large as that of the corresponding segments. This causes space amplification that cannot be ignored.
What is changed and how it works?
Compare the StableValueSpace to the size/bytes of the underlying DTFile, and trigger a bg gc if the ratio is too low.
Check List
Tests
After the patch, the bg gc starts to clean the segments generated by logical split and the store size begins to decrease.
Side effects
Documentation
Release note