HBASE-26018 Perf improvement in L1 cache - Optimistic call to buffer.retain() #3407
virajjasani commented Jun 21, 2021
- CHM#computeIfPresent takes a lock on the bucket, but CHM#get is lockless
- Atomically retaining the refCount is turning out to be a bit expensive in terms of performance
- When we see aggressive cache hits for meta blocks (with major blocks in cache), we want to get away from coarse-grained locking
- Treat the cache read API as an optimistic read
FYI @ben-manes
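To make the difference between the two read paths concrete, here is a minimal Java sketch. It is not the HBase code: `Buf`, `getLocked`, and `getOptimistic` are hypothetical names, `Buf` models the reference count with an `AtomicInteger`, and `IllegalStateException` stands in for whatever exception a real ref-counted buffer throws when retained at refCount zero.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class OptimisticReadSketch {
  /** Hypothetical stand-in for a ref-counted cached buffer; not the HBase class. */
  static final class Buf {
    private final AtomicInteger refCnt = new AtomicInteger(1);

    /** Increments the count, or throws if the buffer was already fully released. */
    void retain() {
      for (;;) {
        int c = refCnt.get();
        if (c == 0) {
          throw new IllegalStateException("refCnt: 0");
        }
        if (refCnt.compareAndSet(c, c + 1)) {
          return;
        }
      }
    }

    /** Decrements the count; returns true when it reaches zero. */
    boolean release() {
      return refCnt.decrementAndGet() == 0;
    }

    int refCnt() {
      return refCnt.get();
    }
  }

  static final ConcurrentHashMap<String, Buf> map = new ConcurrentHashMap<>();

  /** Locked variant: retain() runs while the map holds the lock for this key's bin. */
  static Buf getLocked(String key) {
    return map.computeIfPresent(key, (k, b) -> {
      b.retain();
      return b;
    });
  }

  /** Optimistic variant: lock-free get(), then retain(); a failed retain() is a miss. */
  static Buf getOptimistic(String key) {
    Buf b = map.get(key);
    if (b == null) {
      return null;
    }
    try {
      b.retain();
      return b;
    } catch (IllegalStateException e) {
      // The block was evicted and released between get() and retain().
      return null;
    }
  }
}
```

The optimistic variant trades the per-bin lock for a retry-free read that may lose a race with eviction, so it has to tolerate a failed `retain()`.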
🎊 +1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
LruCachedBlock cb = map.get(cacheKey);
if (cb != null) {
  try {
    cb.getBuffer().retain();
And if it is purged from the cache by a background thread, we'll have a cb w/ a non-zero refcount that is not in the cache? Will that work?
It is purged from cache by:
protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
  LruCachedBlock previous = map.remove(block.getCacheKey()); // <== removed from map
  if (previous == null) {
    return 0;
  }
  updateSizeMetrics(block, true);
  long val = elements.decrementAndGet();
  if (LOG.isTraceEnabled()) {
    long size = map.size();
    assertCounterSanity(size, val);
  }
  if (block.getBuffer().getBlockType().isData()) {
    dataBlockElements.decrement();
  }
  if (evictedByEvictionProcess) {
    // When the eviction of the block happened because of invalidation of HFiles, no need to
    // update the stats counter.
    stats.evicted(block.getCachedTime(), block.getCacheKey().isPrimary());
    if (victimHandler != null) {
      victimHandler.cacheBlock(block.getCacheKey(), block.getBuffer());
    }
  }
  // Decrease the block's reference count, and if refCount is 0, then it'll auto-deallocate. DO
  // NOT move this up because if do that then the victimHandler may access the buffer with
  // refCnt = 0 which is disallowed.
  previous.getBuffer().release(); // <== buffer released
  return block.heapSize();
}
Based on the eviction code above, there are three possible interleavings when eviction and getBlock race on the same block:
- getBlock retrieves the block from the map, eviction removes it from the map, eviction calls release(), then getBlock calls retain() and hits an IllegalReferenceCountException. This patch handles that and treats it as a cache miss.
- getBlock retrieves the block from the map, eviction removes it from the map, getBlock calls retain(), then eviction calls release(). Since retain() succeeded, getBlock proceeds as a successful cache hit, which can happen even today with computeIfPresent. A subsequent getBlock call returns null because the block has already been evicted.
- Eviction removes the block from the map, getBlock gets null, and it's a clear cache miss.
I think we're good here. WDYT @saintstack?
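The three interleavings listed above can be replayed deterministically on a single thread. This is only an illustrative sketch: `Buf` is a hypothetical stand-in for the ref-counted buffer, `IllegalStateException` stands in for the reference-count exception, and each method just executes the reader and evictor steps in the stated order.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class InterleavingSketch {
  /** Hypothetical ref-counted buffer; starts at refCnt = 1 (the cache's own reference). */
  static final class Buf {
    private final AtomicInteger refCnt = new AtomicInteger(1);

    void retain() {
      for (;;) {
        int c = refCnt.get();
        if (c == 0) {
          throw new IllegalStateException("refCnt: 0");
        }
        if (refCnt.compareAndSet(c, c + 1)) {
          return;
        }
      }
    }

    void release() {
      refCnt.decrementAndGet();
    }

    int refCnt() {
      return refCnt.get();
    }
  }

  private static ConcurrentHashMap<String, Buf> cacheWithOneBlock() {
    ConcurrentHashMap<String, Buf> map = new ConcurrentHashMap<>();
    map.put("k", new Buf());
    return map;
  }

  /** Case 1: get, remove, release, retain. The late retain() fails, so it is a miss. */
  static String case1() {
    ConcurrentHashMap<String, Buf> map = cacheWithOneBlock();
    Buf seen = map.get("k");       // reader: lock-free lookup
    Buf evicted = map.remove("k"); // evictor: removed from map
    evicted.release();             // evictor: refCnt 1 -> 0
    try {
      seen.retain();               // reader: too late
      return "hit";
    } catch (IllegalStateException e) {
      return "miss";
    }
  }

  /** Case 2: get, remove, retain, release. A hit on a block that is no longer in the map. */
  static String case2() {
    ConcurrentHashMap<String, Buf> map = cacheWithOneBlock();
    Buf seen = map.get("k");       // reader: lock-free lookup
    Buf evicted = map.remove("k"); // evictor: removed from map
    seen.retain();                 // reader: refCnt 1 -> 2
    evicted.release();             // evictor: refCnt 2 -> 1
    return "hit, refCnt=" + seen.refCnt() + ", inMap=" + map.containsKey("k");
  }

  /** Case 3: eviction completes first, so the reader sees null. */
  static String case3() {
    ConcurrentHashMap<String, Buf> map = cacheWithOneBlock();
    map.remove("k").release();     // evictor: remove and release
    return map.get("k") == null ? "miss" : "hit";
  }

  public static void main(String[] args) {
    System.out.println(case1());
    System.out.println(case2());
    System.out.println(case3());
  }
}
```

Case 2 is the one worth attention: the reader ends up holding a live reference to a block the map no longer knows about, so the reader's own release is the only thing that can bring the count back to zero.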
I think in possibility #2 above, we can end up with a buffer that has a non-zero refCount but is no longer in the cache. Let me see what alternatives we have for this case.
Although I still think the same case can happen even today.
getBlock calls retain(), which brings the refCount of the BB to 2. While getBlock is busy updating stats, the eviction thread can evict the block from the cache and call release(), which brings the refCount of the BB to 1. So even in this case, we can have a buffer with a positive refCount that has been evicted from the cache.
#1 sounds good.
#2 yeah, it can get interesting. The computeIfPresent made reasoning easier for sure.
Running w/ #get instead of #computeIfPresent -- even though it is incorrect -- changed the locking profile of a loaded process. Before the change, the blockage in computeIfPresent was the biggest blockage. After, the biggest locking consumer was elsewhere and a much less significant percentage.
Thanks @saintstack.
> After, biggest locking consumer was elsewhere and much more insignificant percentage
Does this mean we can kind of ignore this case (assuming objects not in cache will get GC'ed regardless of their netty based refCount)? Still thinking about this.
@virajjasani That's an interesting idea. Whether onheap or offheap, if there are no references -- i.e. not tied to a pool -- then they should get GC'd. Does the CB get returned to the cache when done?
Looking at this more.... I don't think we can do your trick after all.
The refcounting is not for the cache, it is for a backing pool of memory used when reading data in from HDFS into the cache. When we evict a block from the cache, we call #release on the memory. If the refcount drops to zero, the memory is released and can be reused in the backing pool. If #release is called and the refcount is still not zero, we just decrement the refcount.
A cached buffer item detached from the cache still needs to have its #release called until the refcount reaches zero so the backing memory gets re-added to the pool.
So it seems to me. What do you think, @virajjasani?
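A small sketch of why GC alone does not help when the memory comes from a pool. This is an assumption-laden toy, not the HBase allocator: `Pool` and `PooledBuf` are hypothetical names, the pool is a fixed set of `byte[]` chunks, and `retain()` skips the zero check for brevity.

```java
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicInteger;

class PoolReleaseSketch {
  /** Hypothetical fixed-size pool of memory chunks. */
  static final class Pool {
    private final ConcurrentLinkedDeque<byte[]> free = new ConcurrentLinkedDeque<>();

    Pool(int chunks, int chunkSize) {
      for (int i = 0; i < chunks; i++) {
        free.push(new byte[chunkSize]);
      }
    }

    /** Hands out a chunk wrapped in a ref-counted buffer, or null if the pool is empty. */
    PooledBuf allocate() {
      byte[] mem = free.poll();
      return mem == null ? null : new PooledBuf(this, mem);
    }

    int available() {
      return free.size();
    }
  }

  /** Hypothetical ref-counted view over one pooled chunk; starts at refCnt = 1. */
  static final class PooledBuf {
    private final Pool pool;
    private final byte[] mem;
    private final AtomicInteger refCnt = new AtomicInteger(1);

    PooledBuf(Pool pool, byte[] mem) {
      this.pool = pool;
      this.mem = mem;
    }

    void retain() {
      refCnt.incrementAndGet(); // zero check omitted in this sketch
    }

    /** Only the release that brings the count to zero returns the chunk to the pool. */
    void release() {
      if (refCnt.decrementAndGet() == 0) {
        pool.free.push(mem);
      }
    }
  }

  public static void main(String[] args) {
    Pool pool = new Pool(1, 16);
    PooledBuf block = pool.allocate(); // cached block, refCnt = 1
    block.retain();                    // reader hit, refCnt = 2
    block.release();                   // eviction, refCnt = 1; chunk still held
    System.out.println(pool.available());
    block.release();                   // reader done, refCnt = 0; chunk returned
    System.out.println(pool.available());
  }
}
```

If the reader never makes that final `release()` call, the chunk stays out of the pool even after the `PooledBuf` object itself becomes unreachable, which is the concern raised above.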
> Does the CB get returned to the cache when done?
You mean whether the CB gets returned to the L1 cache (CHM) after its buffer has served the read request? Yes, that's the case (unless I misunderstood the question).
(Sorry, forgot to submit my comment from a good while ago)
> A cached buffer item detached from the cache still needs to have its #release called w/ refcount at zero so the backing memory gets readded to the pool.
Yeah, I think this makes sense. Let me get back to this in case I find a better, more obvious way to improve perf, and get some YCSB results.
@virajjasani In what context? What test case? What does "bit expensive" mean? Can we see the data?