-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-25880 remove files from filesCompacting when clear compaction queue #3261
base: master
Are you sure you want to change the base?
Conversation
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you see a problem from our not removing files from compacting list? Bad metrics or something?
A few notes in below. Thanks.
Oh, for the future, the subject of the PR should lead off w/ the JIRA that covers the PR. Just an FYI. You add the JIRA in the comment.... so we can fix on merge...
} | ||
} | ||
|
||
public List<HStoreFile> getFilesCompacting() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for test only?
See hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java for how methods like this are marked with a @RestrictedApi annotation.
It looks like this method and the one above it can be package-private rather than public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. Thanks for your advise. I will fix it.
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java
Show resolved
Hide resolved
Thanks for taking a look on my PR. For system compaction, selectNow will be false, which means files will be added to the compacting list when the compaction task is actually executed. So it is safe to just clear the workQueue of compaction thread pool executor. However, for user-triggered compaction, selectNow will be true, which means files are already added to the compacting list when we put the compaction request to the queue. When CompactionPolicy selects candidate files for minor compaction, files in compacting list will be excluded. If we just clear the workQueue of compaction thread pool executor, files of these compaction requests are still in compacting list. These files will not be selected by CompactionPolicy and compacted any more unless we force a major compaction. So I think maybe we should remove those files from compacting list when we clear the compaction queue. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java
Show resolved
Hide resolved
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions but LGTM. You'll need to get the go from @guangxuCheng Thanks.
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
Show resolved
Hide resolved
continue; | ||
} | ||
CompactionRunner runner = (CompactionRunner) runnable; | ||
if (runner.compaction != null && runner.compaction.hasSelection()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No data race here? Not sure, just asking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look on this PR.
I don't think there is data race here. It's safe. Thanks.
@Apache9 @guangxuCheng hi, any questions here? |
Can we pick this back up? This is actually a big issue. If you clear compaction queues, you're almost guaranteed to enter a case where new compactions for those regions cannot proceed. Because the compaction selection will see many of the files of the Stores as already compacting, so will be skipped. |
Have you faced this problem in your production? @bbeaudreault We can pick this up I think. |
Yea, we just ran into this twice recently in our production. We didn't realize that compactions weren't working until a while later when we got alerted for too many storefiles. The only solution was to move the regions off then back on. I took a quick look and prior to adding to workQueue, we add to filesCompacting. So if we remove from workQueue, we have to remove from filesCompacting. Otherwise those files will never be compacted until the region moves away from the RS. |
We have this patch in our internal branch for a long time, and the effect is in line with expectations. So I think we can pick this up too. |
@bbeaudreault This PR has some conflicts with the master branch. I can fix PR if you need. And if you have new or more ideas, it's completely ok for me to close this PR and you can submit new one. I'll be happy to help review it. Thanks. |
Thanks @frostruan. I don't have any other feedback here, I just saw that it stalled. If you have time to finish it, that'd be great. If not I can ask someone at my company to resubmit it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code to me looks exactly how I'd handle it. Lgtm once merge conflict fixed and pre-commit passes
If anyone at your company interested in this, please feel free to submit new PR. I'll be happy to help review. Thanks. |
When clear compaction queues, we just clear the workQueue of ThreadPoolExecutor, but files in compaction request are still in filesCompacting list. maybe we should clear it also.
https://issues.apache.org/jira/browse/HBASE-25880