flush cache before segment merge #4959
@lidezhu: This cherry pick PR is for a release branch and has not yet been approved by the release team. To merge this cherry pick, it must first be approved by the collaborators. After it has been approved by collaborators, please ping the release team in a comment to request a cherry pick review.
LGTM
And we should also pick this bug-fix for this patch version #4746
be63fce to 29116a7
What problem does this PR solve?
Issue Number: ref #4956
Problem Summary:
During a segment split, the tail column files in the delta layer of the original segment are copied to the new result segments, so a new segment may contain data that falls outside its segment range.
This is fine in most cases, because the redundant data is filtered out by the segment range when serving read requests to the segment, so it is invisible in almost all cases.
But if a segment merge happens later while the redundant data has still not been flushed to disk, that data is copied directly into the new merged segment.
The redundant data in each segment then becomes visible again after the merge, which may cause data incorrectness.
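The sequence above can be illustrated with a toy model. This is only a sketch for reasoning about the problem, not the project's actual code: the `Segment` class, its `cache` list of unflushed rows, and the `split` and `merge_without_flush` helpers are all hypothetical names, and the exact symptom in the real system may differ from the duplicated rows shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical segment covering the key range [start, end)."""
    start: int
    end: int
    # Unflushed (key, value) rows, standing in for the tail column files.
    cache: list = field(default_factory=list)

    def read(self):
        # Reads filter by the segment range, which hides out-of-range rows.
        return [(k, v) for k, v in self.cache if self.start <= k < self.end]


def split(seg, split_key):
    # Both children receive a copy of the whole unflushed tail.
    left = Segment(seg.start, split_key, list(seg.cache))
    right = Segment(split_key, seg.end, list(seg.cache))
    return left, right


def merge_without_flush(left, right):
    # The unflushed rows are copied as-is, with no range filtering.
    return Segment(left.start, right.end, left.cache + right.cache)


original = Segment(0, 100, [(10, "a"), (60, "b")])
left, right = split(original, 50)

left_rows = left.read()    # only key 10 is visible in [0, 50)
right_rows = right.read()  # only key 60 is visible in [50, 100)

# After the merge the range is [0, 100) again, so every copied row is visible.
merged_rows = merge_without_flush(left, right).read()
```

In this model each child hides the row that belongs to its sibling, but the merged segment exposes both copies of every row, because the wider range no longer filters anything out.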
What is changed and how it works?
Flush the cache before every merge operation, so that any unsaved data is filtered by the segment range during the merge.
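A minimal sketch of the idea, again using a hypothetical toy model rather than the real implementation: `Segment`, `flush_cache`, and `merge` are illustrative names. The description does not say exactly where the range filtering happens once the data is flushed, so this sketch assumes it is applied at flush time.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical segment covering the key range [start, end)."""
    start: int
    end: int
    cache: list = field(default_factory=list)    # unflushed (key, value) rows
    flushed: list = field(default_factory=list)  # rows persisted to disk

    def flush_cache(self):
        # Assumption: only rows inside the segment range survive the flush.
        in_range = [(k, v) for k, v in self.cache if self.start <= k < self.end]
        self.flushed += in_range
        self.cache = []


def merge(left, right):
    # Flush both inputs first, so no unfiltered cache rows are carried over.
    left.flush_cache()
    right.flush_cache()
    return Segment(left.start, right.end, flushed=left.flushed + right.flushed)


# Two segments produced by an earlier split, each holding the full copied tail.
left = Segment(0, 50, cache=[(10, "a"), (60, "b")])
right = Segment(50, 100, cache=[(10, "a"), (60, "b")])

merged = merge(left, right)
```

Here the merged segment ends up with exactly one copy of each row, since each input dropped the rows outside its own range before the merge combined them.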
Check List
Tests
Side effects
Documentation
Release note