Yield CPU for concurrent flush and concurrent mergeDelta (#5410) #5424
The merge conflict occurs because #5296 has not been merged yet.
@breezewish, can you resolve the conflicts and get this PR merged?
Force-pushed from fba64c5 to af90192
Signed-off-by: Wish <[email protected]>
Force-pushed from af90192 to f1025db
/run-all-tests
LGTM
/merge
@breezewish: It seems you want to merge this PR, so I will trigger all the tests for you: /run-all-tests
This pull request has been accepted and is ready to merge. Commit hash: f1025db
@ti-chi-bot: Your PR was out of date, so I have automatically updated it for you and triggered all tests: /run-all-tests. If a CI test fails, re-trigger only the failed test; the bot will merge the PR once CI passes.
/run-unit-test
/run-all-tests
This is an automated cherry-pick of #5410
What problem does this PR solve?
Issue Number: close #5409
What is changed and how it works?
Add sleep for flushCache and mergeDeltaBySegment:
When flushCache is retrying, the backoff wait will be 5ms ~ 100ms (considering that an in-progress flushCache usually takes a short time to finish).
When mergeDeltaBySegment is retrying, the backoff wait will be 50ms ~ 1s (considering that split-prepare could take several seconds to finish).
Check List
Tests
To test the fix, I introduced a splitEachSegment debug function locally to manually trigger a split.
The test case is to trigger the split for a 1GB segment and perform a mergeDelta at the same time.
Before the fix (using release v6.1):
When a split (takes 10s) and a mergeDelta (takes 20s in total, blocked by the split for 10s) run together, the mergeDelta retries 211K times during the 10s in which it is blocked.
The CPU usage is around 200% during the split+mergeDelta.
After the fix:
There are only 14 retry attempts, with exponential backoff.
The CPU usage stays around 100% (first 11s for the split, next 10s for the mergeDelta).
Note: As the maximum backoff is 1s, the CPU usage dropped for a short while after the split had finished and before the mergeDelta started.
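The retry behavior described above can be illustrated with a minimal sketch. It is not the code from this PR: BackoffPolicy, retryWithBackoff and waitsToCover are hypothetical names, and the sketch assumes that the wait doubles after each failed attempt until it reaches the upper bound (the description only gives the ranges 5ms ~ 100ms and 50ms ~ 1s). Under that assumption, covering a 10s blockage with the mergeDeltaBySegment range takes 14 waits, which is in line with the 14 retry attempts observed after the fix.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>

using Millis = std::chrono::milliseconds;

// Hypothetical policy type for illustration; not a TiFlash identifier.
struct BackoffPolicy
{
    Millis initial;
    Millis max;

    // Assumption: the wait doubles after each failed attempt, capped at `max`.
    Millis next(Millis current) const { return std::min(current * 2, max); }
};

// Ranges quoted in the PR description.
const BackoffPolicy flush_cache_backoff{Millis(5), Millis(100)};
const BackoffPolicy merge_delta_backoff{Millis(50), Millis(1000)};

// Calls `try_once` until it returns true. Between attempts the thread sleeps,
// so it yields the CPU instead of spinning. Returns the number of sleeps.
inline std::size_t retryWithBackoff(const std::function<bool()> & try_once, const BackoffPolicy & policy)
{
    std::size_t sleeps = 0;
    Millis wait = policy.initial;
    while (!try_once())
    {
        std::this_thread::sleep_for(wait);
        ++sleeps;
        wait = policy.next(wait);
    }
    return sleeps;
}

// Counts how many waits are needed before the accumulated wait time reaches
// `blocked_for`. The time spent in each attempt itself is ignored.
inline std::size_t waitsToCover(Millis blocked_for, const BackoffPolicy & policy)
{
    std::size_t waits = 0;
    Millis elapsed{0};
    Millis wait = policy.initial;
    while (elapsed < blocked_for)
    {
        elapsed += wait;
        ++waits;
        wait = policy.next(wait);
    }
    return waits;
}
```

For example, waitsToCover(Millis(10000), merge_delta_backoff) walks through waits of 50, 100, 200, 400 and 800 ms and then 1000 ms each, so the count stays in the tens, whereas a loop that retries without sleeping can reach hundreds of thousands of attempts in the same window.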
Side effects
Documentation
Release note