-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-27227 Long running heavily filtered scans hold up too many ByteBuffAllocator buffers #4940
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
c2ede4b
to
737b6dc
Compare
This comment was marked as outdated.
This comment was marked as outdated.
737b6dc
to
997a8e7
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
db87f12
to
4f3dd81
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
4f3dd81
to
eb19aee
Compare
…BuffAllocator buffers
eb19aee
to
8d2fee2
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
@@ -426,6 +426,8 @@ private boolean nextInternal(List<Cell> results, ScannerContext scannerContext) | |||
// Used to check time limit | |||
LimitScope limitScope = LimitScope.BETWEEN_CELLS; | |||
|
|||
checkpoint(State.START); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is just for RegionScannerImpl.filter is not null we should checkpoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. i think i can do that, let me look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved the initial checkpoint call into the constructor, so we only have to do it once. The other checkpoint calls have been changed to a new checkpointIfFiltering
method which returns early if filter == null
.
The initial checkpoint call is still important because it enables retainBlock()
functionality in StoreScanner. Someone could submit a scan with addColumn(...)
which looks for 1 column in rows with many columns. In which case retainBlock()
in StoreScanner would still be very useful. I added a comment to explain that.
Thanks for looking!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bbeaudreault , thank you very much for detailed reply. Overall LGTM, it is really a very insightful PR, just have one suggestion , FYI.
- Should we set
HFileScannerImpl.lastCheckpointIndex
to 0 when initializing ? so we could simplify the
if (shouldRetainBlock || lastCheckpointIndex < 0)
inHFileScannerImpl.handlePrevBlock
to
if (shouldRetainBlock)
, after all, when we start to scan,
HFileScannerImpl.lastCheckpointIndex
is always >=0, no matter there is filter or not, and we could also
removecheckpoint(State.START)
inRegionScannerImpl
ctor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again @comnetwork.
I agree ideally we could do what you say. The reason I have the complexity is because I'm concerned about use-cases of HFileScanner or StoreFileScanner which don't go through StoreScanner. For example HFilePrettyPrinter, bulk load verification, etc.
The lastCheckpointIndex < 0
check ensures that we only honor shouldRetainBlock
if a call to checkpoint(State)
has occurred. The contract is that if you are using checkpointing, you also need to call retainBlock()
. If you are not using checkpointing, you don't need to call retainBlock().
Failing to call retainBlock()
would result in blocks being released too early, so I only want this to apply for StoreScanner, which is the only place we currently do checkpointing.
A better approach might be to add a new boolean checkpointEnabled
in HFileScannerImpl constructor. This is more explicit but involves adding boolean arguments to many various getScanner methods. I can give this a shot, or also open to other ideas if you have them.
Let me know if this wasn't clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@comnetwork I just pushed a commit which adds a boolean checkpointingEnabled
to the creation of HFileScannerImpl. This involves minor modifications to many levels of getScanner(...)
calls. I added the new param everywhere necessary, defaulting to false
everywhere. This retains the old behavior for all usages. I then updated just StoreScanner to pass true
, as this is the only place we want to enable this behavior right now.
This is a bunch more small changes, but is probably the safest route. If there were a bug where we accidentally passed false, we'd just be reverting to the original behavior and losing the optimization. You can see how it affects things in HFileReaderImpl.
Let me know what you think. I can revert that commit or we can keep it. As a result I was able to simplify some of the checkpointing (default to 0 instead of -1, no need to call checkpoint on new scanners).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bbeaudreault , thank you for the elaborate work and I agree with your point that adding a new boolean checkpointEnabled
make the logic more clear.
The contract is that if you are using checkpointing, you also need to call retainBlock(). If you are not using checkpointing, you don't need to call retainBlock().
I agree with the first part of the sentence, but have doubts about the second. From you code, I think for scan by RegionScannnerImpl
, retainBlock
is always needed to release blocks early, only when there is filter(especially row level filter),we need checkpoint further to narrow the blocks which should be retained after a row is filtered?I think we could just only use retainBlock
for user scan which specifying columns but not has filters, so we don't need to call checkpoint
. If you agree with my point, I think the variable name boolean checkpointingEnabled
is not very appropriate, maybe earlyReleaseBlockEnabled
is more indicative of intentions?
…ng. Add another test/docs
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
…ely on implicit enabling via checkpoint
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Maybe a simpler solution is to just return the result(maybe empty) to client when we have already hold up too many buffers? We have a ScannerContext to limit the time and size, we can add a new buffer size to the ScannerContext? |
You might be right. That was my initial implementation actually, but I decided to try releasing them instead because I had this checkpointing idea. I admit that it's getting complicated as is. I'm going to leave this open for now, but I will do some tests and open a separate PR to show how it would work with ScannerContext. |
I've pushed #4967 which handles this using ScannerContext. I still need to add tests, but it's working on one of our test clusters. |
@Apache9 my other PR which does this using ScannerContext is ready for review. #4967. I'm going to close this PR for now, though may revisit in the future if we ever need to do something more. Thanks for reviewing this to you and @comnetwork |
prevBlocks
list. These blocks are currently only ever released inshipped()
orclose()
.checkpoint(State.START)
. This sets state in StoreScanner and HFileReaderImpl.retainBlock()
whenever a cell is added to the result list. This causes the block to be added to prevBlocks once it's fully scanned, otherwise it gets eagerly released. If no cells are pulled from a block at this level, we can easily immediately release it.checkpoint(State.FILTERED)
, which releases any blocks accumulated since the last checkpoint. This accounts for cases where filters clear the result list after being applied.filterRowKey
.filterRowCells
orfilterRow
.I've added extensive tests in TestBlockEvictionFromClient. These verify the expected number of retained block for various filter cases, ensures the returned results are accurate, and additionally forces block cache corruption on any early released blocks. The corruption would cause the test to fail if any necessary blocks were wrongly released, because the rpc shipper tries serializing cells from buffers whose content has changed.
I also added unit tests to StoreScanner and HFileReaderImpl for the relevant smaller changes there.