-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: perform point key-value reads where possible #46758
Comments
Thanks for taking a look at this! I did an informal experiment with this a while ago and didn't record my results, but I don't think I saw anything like 2%. We'll put it into the backlog, although there are a few performance improvements (lookup join stuff, removal of extra planning latency) we're planning that will probably take priority. |
Thanks Jordan! |
I finally got around to writing this up in #53939 after seeing it come up again in TPC-C. |
In his work with the separated lock table, @sumeerbhola is introducing another case where point reads will be optimized over ranged reads. With his changes, point reads will be able to use bloom filters when searching the persistent portion of the lock table. This will not be possible for ranged reads. |
(Adding to the earlier list of performance benefits) A scan over a span that has at most 1 key, but does not know it, needs to iterate over all the versions (or eventually do a second expensive seek), looking for the non-existent next key in the span. A Get can do a single seek, read the key, and be done. Regarding the lock table, if locks are rare and the BatchRequest has a large number of ScanRequests, it may actually be faster than a similar number of GetRequests, since the first scan will seek past all the spans being requested and the later seeks will get optimized, due to the Pebble improvements in cockroachdb/pebble#951 and cockroachdb/pebble#947. But we can always change a Get to a Scan inside the storage package (or below) if we find any significant performance benefit, but we can't do the reverse. |
@nvanbenschoten do you have an intuition about whether it would be better, in the case of a fetch of a row with 2 (or 3 or 4 or n) column families, to issue n GetRequests, one per family, or 1 ScanRequest? |
This is an interesting question! I don't have much intuition about which would be better or where the crossover point lies. Let's run some experiments and let that guide our decision here. |
While I've got your attention, do you have an idea about the workload that would be mostly likely to showcase a difference between GetRequest and ScanRequest? I've been playing with KV and have seen nearly zero noticeable difference in performance. |
You've been playing with an empty database, right? That might be contributing, as a major benefit of point reads is that they can use bloom filters to reduce read amplification once an LSM fills out. This effect will also be more pronounced once the entire data set no longer fits in RAM. I know it's slow, but consider throwing a few iterations of |
The KV API exposes both a ranged
ScanRequest
read operation and a pointGetRequest
read operation. Currently, SQL'srow.Fetcher
only ever usesScanRequest
s, even when looking up a discrete set of keys. We should explore updating it to useGetRequest
s when possible.For a multitude of reasons, we expect point KV operations to perform marginally better than ranged KV operations, even in cases where the SQL layer knows that the ranged operation operates over a fixed number of keys. Here's a non-exhaustive list of reasons why:
MVCCGet
, unlikeMVCCScan
, is able to use a prefix iterator, which allows it to make use of RocksDB/Pebble bloom filters. There may also be other wins at the storage layer, like avoiding prefetchingWe've known that these could add up to a performance win on simple workloads for some time. However, (to my knowledge) we've never actually made any measurements. Inspired by #46752, I hacked up a test using this patch. I then ran
kv95
on a single-node cluster with and without the load-gen change that enables the use ofGetRequest
instead ofScanRequest
. This led to the following improvement in throughput:This seems significant enough to justify further investigation.
I think there's also a similar issue to be filed about
DeleteRequest
vsDeleteRangeRequest
.cc. @jordanlewis for triage.
The text was updated successfully, but these errors were encountered: