db: SeekPrefixGE lazy positioning, or GetPrefix #2002
Comments
Added the consider-23.2 label because this, along with avoiding bloom filter block reads (by elevating the MVCC props to the fileMetadata), seems especially beneficial with disaggregated storage, where each new table read requires reading from remote storage.
I'm a bit confused by
since I thought this is trying to avoid reading the bloom filter of that file. Maybe I am misreading the sentence.
Ah yeah, when I started writing the issue I was thinking we'd perform the bloom filter check before returning the synthetic key and we'd only be saving the block loads on the tables that the filter fails to exclude, but I think it makes more sense to reverse the order to try to avoid the bloom filter check as well.
Yeah, that's right
This may be helpful with #3230 for workloads that are frequently reading and updating recently-written keys. It would shift the working set of data towards the higher levels of the LSM (which are also smaller), making it more likely that a read finds that everything it needs is already in either the block cache or OS page cache.
I have not thought through the design space here in detail, but it seems possible to use MVCC metadata about sstables (e.g., the computed block properties) to avoid reading files that contain older versions of a key during a SeekPrefixGE. The goal would be to reduce block reads during MVCCGets, making the performance profile of an MVCCGet more similar to that of a pebble Get.
A design, just to serve as an illustrative example:
Return a synthetic `<prefix>@<sstable's max mvcc timestamp>` key if the bloom filter indicates the file may contain the key.
If we elevated MVCC timestamps into the `*fileMetadata`, it seems like we could even avoid some of the bloom filter reads and table loads.
Somewhat related to #2182.
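To make the idea concrete, here is a minimal toy sketch in Go. It is not Pebble code: the `table`, `entry`, `newestVersion`, and `exampleTables` names are all hypothetical, timestamps are plain integers, and a "load" stands in for any filter or block read against a table. Each table first contributes only a synthetic bound taken from its metadata, and a table is consulted only when its bound reaches the top of the merge heap. The sketch assumes the variant where the max timestamp lives in the file metadata, so no bloom filter check happens before the synthetic key is emitted.

```go
package main

import (
	"container/heap"
	"fmt"
)

// table is a hypothetical stand-in for an sstable. versions maps a key prefix
// to the MVCC timestamps stored for it; maxTS plays the role of a max-timestamp
// property kept in the file's metadata.
type table struct {
	name     string
	maxTS    int
	versions map[string][]int
}

// entry is a heap element: a synthetic bound for an unread table (real ==
// false) or an actual version read from a table (real == true).
type entry struct {
	ts   int
	real bool
	tbl  *table
}

// entryHeap orders entries newest-first. On ties a real entry sorts before a
// synthetic one, which is a simplification made for this sketch.
type entryHeap []entry

func (h entryHeap) Len() int { return len(h) }
func (h entryHeap) Less(i, j int) bool {
	if h[i].ts != h[j].ts {
		return h[i].ts > h[j].ts
	}
	return h[i].real && !h[j].real
}
func (h entryHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *entryHeap) Push(x any)   { *h = append(*h, x.(entry)) }
func (h *entryHeap) Pop() any {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

// newestVersion returns the newest timestamp stored for prefix, whether one
// was found, and how many tables had to be consulted.
func newestVersion(tables []*table, prefix string) (ts int, found bool, loads int) {
	h := &entryHeap{}
	for _, t := range tables {
		// Metadata-only step: nothing is read from the table itself.
		heap.Push(h, entry{ts: t.maxTS, tbl: t})
	}
	for h.Len() > 0 {
		e := heap.Pop(h).(entry)
		if e.real {
			// Every remaining bound is older, so no other table can win.
			return e.ts, true, loads
		}
		// The synthetic bound surfaced, so this table must now be read.
		loads++
		newest, ok := 0, false
		for _, v := range e.tbl.versions[prefix] {
			if !ok || v > newest {
				newest, ok = v, true
			}
		}
		if ok {
			heap.Push(h, entry{ts: newest, real: true, tbl: e.tbl})
		}
	}
	return 0, false, loads
}

// exampleTables builds four hypothetical tables, newest data first.
func exampleTables() []*table {
	return []*table{
		{name: "L0", maxTS: 12, versions: map[string][]int{"b": {12}}},
		{name: "L3", maxTS: 9, versions: map[string][]int{"a": {9, 7}}},
		{name: "L5", maxTS: 5, versions: map[string][]int{"a": {5, 3}}},
		{name: "L6", maxTS: 2, versions: map[string][]int{"a": {2}}},
	}
}

func main() {
	ts, found, loads := newestVersion(exampleTables(), "a")
	fmt.Printf("ts=%d found=%v tables consulted=%d of 4\n", ts, found, loads)
}
```

With these example tables, a lookup of "a" consults only two of the four tables: the L0 table (whose bound is newest but which holds no version of "a") and the L3 table. The L5 and L6 tables are never touched because their bounds are older than the version already found. A real design would also need to account for the read timestamp, sequence numbers, and level ordering, all of which this sketch ignores.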
Jira issue: PEBBLE-142