-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: issue GetRequests from SQL when possible #61583
Conversation
So far, this looks to make almost no difference, but I haven't spent time doing a careful benchmark, and I don't know the best/worst cases to test against. I ran KV95 on a single gceworker node for 60 seconds on empty data directories before and after the patch. After the patch:
Before the patch:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty cool stuff! It's the first time I looked into this code, but I still have some comments :)
Reviewed 5 of 5 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @nvanbenschoten, and @solongordon)
pkg/kv/kvclient/kvcoord/dist_sender.go, line 617 at r1 (raw file):
case *roachpb.QueryIntentRequest, *roachpb.EndTxnRequest, *roachpb.GetRequest: // Accepted point requests that can be in batches with limit.
super nit: it'd be nice to keep indentation the same here and a few lines above.
pkg/sql/row/kv_batch_fetcher.go, line 266 at r1 (raw file):
// spans. for i := 1; i < len(spans); i++ { prevKey := spans[i-1].EndKey
nit: I'd do s/prevKey/prevEndKey/g
and s/curKey/curEndKey/g
.
pkg/sql/row/kv_batch_fetcher.go, line 272 at r1 (raw file):
curKey := spans[i].EndKey if curKey == nil { curKey = spans[i-1].Key
I think this should be spans[i].Key
, no?
fd32a0c
to
b183da7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten, @solongordon, and @yuzefovich)
pkg/kv/kvclient/kvcoord/dist_sender.go, line 617 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
super nit: it'd be nice to keep indentation the same here and a few lines above.
Done.
pkg/sql/row/kv_batch_fetcher.go, line 272 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I think this should be
spans[i].Key
, no?
Done. Thanks for noticing, this was broken.
b183da7
to
044dab1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 35 of 36 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @nvanbenschoten, @solongordon, and @yuzefovich)
pkg/sql/distsql_physical_planner.go, line 883 at r2 (raw file):
span := spans[i] if len(span.EndKey) == 0 { span = roachpb.Span{
Can you add a comment why we need to do this?
pkg/sql/span_builder_test.go, line 67 at r2 (raw file):
prefixLen: 2, numNeededFamilies: 1, canSplit: true,
Why is there a change here? Do we now split in more cases than before?
pkg/sql/physicalplan/fake_span_resolver.go, line 82 at r2 (raw file):
} /*
Leftover?
044dab1
to
d7941d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten, @solongordon, and @yuzefovich)
pkg/sql/distsql_physical_planner.go, line 883 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Can you add a comment why we need to do this?
Done.
pkg/sql/span_builder_test.go, line 67 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Why is there a change here? Do we now split in more cases than before?
Yes, the idea here is that we now "split" in all cases where we can produce a single-key lookup. This example has just a single column family to look up, so we "split" the span into a single GetRequest for that column family.
pkg/sql/physicalplan/fake_span_resolver.go, line 82 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Leftover?
Done.
28232e8
to
d33e395
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 11 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @nvanbenschoten, @solongordon, and @yuzefovich)
pkg/sql/colfetcher/colbatch_scan.go, line 278 at r3 (raw file):
spans := s.spans[:0] specSpans := spec.Spans // We need to check that, if this scan is for a single row, and this scan
This seems like a leftover comment?
pkg/sql/opt/exec/execbuilder/testdata/select, line 1771 at r3 (raw file):
query T SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message LIKE 'querying next range% at /Table/74/1%' OR
nit: is the first %
needed?
6b7ad1b
to
78d76b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten, @solongordon, and @yuzefovich)
pkg/sql/colfetcher/colbatch_scan.go, line 278 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
This seems like a leftover comment?
Done.
pkg/sql/opt/exec/execbuilder/testdata/select, line 1771 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: is the first
%
needed?
No, fixed
@tbg or @nvanbenschoten, PTAL at the first commit, which teaches Consider a 4 column family table and a query that skips an inner column family.
This PR handles this request by creating a ScanRequest for I suppose we could fix this in another way, like teaching the distsender or fetcher to be smarter somehow, but that seems inelegant - to me it feels natural that |
78d76b6
to
ce5f97f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fantastic! Thanks for pushing this change along and apologies for the delayed review.
Reviewed 4 of 36 files at r2, 2 of 11 files at r3, 38 of 38 files at r4, 36 of 36 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @solongordon, @tbg, and @yuzefovich)
pkg/kv/kvserver/replica_evaluate_test.go, line 130 at r4 (raw file):
}, }, { // A batch limited to return only one key. Throw in a Get which will come
Nice! This looks like a good change. Can we add some cases for when the Get request comes first and exceeds the byte/key limit?
pkg/kv/kvserver/batcheval/cmd_get.go, line 38 at r4 (raw file):
var intent *roachpb.Intent var err error if h.MaxSpanRequestKeys < 0 || h.TargetBytes < 0 {
Let's leave a comment here about how the batch must have already exhausted its key/byte limit and how this mirrors the behavior of MVCCScan with a negative MaxKeys
or TargetBytes
.
pkg/kv/kvserver/batcheval/cmd_get.go, line 38 at r4 (raw file):
var intent *roachpb.Intent var err error if h.MaxSpanRequestKeys < 0 || h.TargetBytes < 0 {
I think it would be cleaner to structure this as an early return, since nothing that follows is needed in this case.
pkg/kv/kvserver/batcheval/cmd_get_test.go, line 28 at r4 (raw file):
key := roachpb.Key([]byte{'a'}) _, err := Get(ctx, nil, CommandArgs{ Header: roachpb.Header{TargetBytes: -1},
Can we split this into three cases, one with a negative TargetBytes
, one with a negative MaxSpanRequestKeys
, and one with neither set?
pkg/sql/row/kv_batch_fetcher.go, line 361 at r5 (raw file):
} } else { scans := make([]struct {
It's unfortunate that this allocation is now wasted in some cases and also that it can't be used for the Get
case. I wonder what we can do for this. Would it be crazy to loop through the spans ahead of time (outside of this reverse
block), count the point operations, allocate an appropriately sized gets
slice, and then do some math to accurately compute the size of these allocations?
pkg/sql/row/kv_batch_fetcher.go, line 517 at r5 (raw file):
case *roachpb.GetResponse: // TODO(jordan): do we have to worry about the IntentValue field on this // response?
No, we shouldn't need to. The value is only included for lower consistency-level reads.
ce5f97f
to
2cd87ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Nathan, it's RFAL.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten, @solongordon, @tbg, and @yuzefovich)
pkg/sql/row/kv_batch_fetcher.go, line 361 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It's unfortunate that this allocation is now wasted in some cases and also that it can't be used for the
Get
case. I wonder what we can do for this. Would it be crazy to loop through the spans ahead of time (outside of thisreverse
block), count the point operations, allocate an appropriately sizedgets
slice, and then do some math to accurately compute the size of these allocations?
Done.
2cd87ed
to
68e2647
Compare
f3232bb
to
475d551
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @jordanlewis, @nvanbenschoten, @solongordon, and @tbg)
pkg/ccl/logictestccl/testdata/logic_test/multi_region, line 1477 at r10 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Fixed. I added this change as a separate commit.
Thanks, looks great!
pkg/sql/rowenc/index_encoding.go, line 387 at r13 (raw file):
// input span is for a table that has just a single column family, so that the // caller can have a precise key to send via a GetRequest if desired. //
[nit] worth mentioning that the returned Spans that represent a single key would not have EndKey=nil
.
pkg/sql/rowenc/index_encoding.go, line 389 at r13 (raw file):
// // The function accepts a slice of spans to append to. func SplitSpanIntoFamilySpans(
[nit] It would be a bit more precise if we passed just the Key here instead of a Span.
Previously, bytes/key limits were ignored in GetRequest. Now, they are respected. Release note: None
Previously, SQL unconditionally emitted ScanRequests for every read that it produced, even if we were completely sure that the scan could contain only a single key that was fully known at request time. This was less efficient than necessary, because ScanRequests cause the KV layer to do more work than GetRequests. Here are the reasons why GetRequests are more efficient: - MVCCGet, unlike MVCCScan, is able to use a prefix iterator, which allows it to make use of RocksDB/Pebble bloom filters. There may also be other wins at the storage layer, like avoiding prefetching - various data structures have optimizations and fast-paths for single-key operations (e.g. timestamp cache, latch manager, refresh span tracking) - various code paths are simpler in the point operation case - a point operation implies only needing a single key allocation instead of a start and end key allocation - a point operation implies only needing to return up to a single result, which means that we can avoid some indirection in various places - A scan over a span that has at most 1 key, but does not know it, needs to iterate over all the versions (or eventually do a second expensive seek), looking for the non-existent next key in the span. A Get can do a single seek, read the key, and be done. Release note (performance improvement): SQL will now emit GetRequests when possible to KV instead of always emitting ScanRequests. This manifests as a modest performance improvement for some workloads.
Previously, a single-key span (a roachpb.Span with only a StartKey and without an EndKey) was displayed like a normal span, without anything after a dash. For example, a single-key span starting at `/Table/11/1/foo` would display as `/Table/11/1/foo-`. Now, a single-key span is shown without a dash. Release note (sql change): single-key spans in explain and explain(distsql) are now shown without a misleading dash after them.
475d551
to
907ee18
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @jordanlewis, @nvanbenschoten, @RaduBerinde, @solongordon, and @tbg)
pkg/sql/rowenc/index_encoding.go, line 387 at r13 (raw file):
Previously, RaduBerinde wrote…
[nit] worth mentioning that the returned Spans that represent a single key would not have
EndKey=nil
.
Actually, returned spans that represent a single key will have nil EndKeys! I've clarified in the comment.
pkg/sql/rowenc/index_encoding.go, line 389 at r13 (raw file):
Previously, RaduBerinde wrote…
[nit] It would be a bit more precise if we passed just the Key here instead of a Span.
Done.
TFTRs! bors r+ |
Build succeeded: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 2 of 0 LGTMs obtained
pkg/sql/rowenc/index_encoding.go, line 387 at r13 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Actually, returned spans that represent a single key will have nil EndKeys! I've clarified in the comment.
Yeah, I don't know what that not
was doing in there, I gotta stop editing my comments haha
After cockroachdb#61583, we emit GetRequests from SQL when possible. We forgot to update the stats collection code that tracks the number of bytes read from KV when GetRequests were emitted; this is now corrected. Release note: None
After cockroachdb#61583, we emit GetRequests from SQL when possible. We forgot to update the stats collection code that tracks the number of bytes read from KV when GetRequests were emitted; this is now corrected. Release note: None
After cockroachdb#61583, we emit GetRequests from SQL when possible. We forgot to update the stats collection code that tracks the number of bytes read from KV when GetRequests were emitted; this is now corrected. Release note: None
After cockroachdb#61583, we emit GetRequests from SQL when possible. We forgot to update the stats collection code that tracks the number of bytes read from KV when GetRequests were emitted; this is now corrected. Release note: None
After cockroachdb#61583, we emit GetRequests from SQL when possible. We forgot to update the stats collection code that tracks the number of bytes read from KV when GetRequests were emitted; this is now corrected. Release note: None
After cockroachdb#61583, we emit GetRequests from SQL when possible. We forgot to update the stats collection code that tracks the number of bytes read from KV when GetRequests were emitted; this is now corrected. Release note: None
Closes #46758.
Previously, SQL unconditionally emitted ScanRequests for every read that
it produced, even if we were completely sure that the scan could contain
only a single key that was fully known at request time.
This was less efficient than necessary, because ScanRequests cause the
KV layer to do more work than GetRequests.
Here are the reasons why GetRequests are more efficient:
allows it to make use of RocksDB/Pebble bloom filters. There may also
be other wins at the storage layer, like avoiding prefetching
single-key operations (e.g. timestamp cache, latch manager, refresh
span tracking)
of a start and end key allocation
result, which means that we can avoid some indirection in various
places
to iterate over all the versions (or eventually do a second expensive
seek), looking for the non-existent next key in the span. A Get can do
a single seek, read the key, and be done.
Release note (performance improvement): SQL will now emit GetRequests
when possible to KV instead of always emitting ScanRequests. This
manifests as a modest performance improvement for some workloads.