kvstreamer: optimize singleRangeBatch.Less #82382
Conversation
The performance/memory tradeoff seems fine. I have some questions about the benchmarks.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @yuzefovich)
-- commits
line 33 at r2:
What are the results for the new Ordering (as opposed to NoOrdering) benchmarks?
pkg/bench/bench_test.go
line 1209 at r1 (raw file):
defer log.Scope(b).Close(b)
ForEachDB(b, func(b *testing.B, db *sqlutils.SQLRunner) {
	db.Exec(b, `CREATE TABLE t1 (a INT)`)

Should a be a primary key to meet the "equality columns are key" specification?
pkg/bench/bench_test.go
line 1210 at r1 (raw file):
ForEachDB(b, func(b *testing.B, db *sqlutils.SQLRunner) {
	db.Exec(b, `CREATE TABLE t1 (a INT)`)
	db.Exec(b, `INSERT INTO t1 SELECT generate_series(1, 1000)`)

Even though a is not a primary key in t1, does this insert order make it inherently in order in the storage layer? That seems like it would affect benchmark performance. Maybe that's ok, but it'd be good to be aware of.
Code quote:
SELECT * FROM t1 INNER LOOKUP JOIN t2 ON t1.a = t2.a
pkg/bench/bench_test.go
line 1233 at r1 (raw file):
for i := 0; i < b.N; i++ {
	db.Exec(b, `SELECT count(b) FROM t1 INNER LOOKUP JOIN t2 ON t1.a = t2.a GROUP BY t1.a ORDER BY t1.a`)
Why does this query need the count and GROUP BY?
Code quote:
GROUP BY t1.a
Force-pushed from 2fbfafc to 229ddd1
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @rharding6373)
Previously, rharding6373 (Rachael Harding) wrote…
What are the results for the new Ordering (as opposed to NoOrdering) benchmarks?
Those didn't change because we don't sort singleRangeBatch.reqs in the InOrder mode, so Less is not used.
pkg/bench/bench_test.go
line 1209 at r1 (raw file):
Previously, rharding6373 (Rachael Harding) wrote…
Should a be a primary key to meet the "equality columns are key" specification?
It doesn't really matter as long as the target index is unique. Here is the comment from JoinReaderSpec:
// If set, the lookup columns form a key in the target table and thus each
// lookup has at most one result.
LookupColumnsAreKey bool `protobuf:"varint,8,opt,name=lookup_columns_are_key,json=lookupColumnsAreKey" json:"lookup_columns_are_key"`
pkg/bench/bench_test.go
line 1210 at r1 (raw file):
Previously, rharding6373 (Rachael Harding) wrote…
Even though a is not a primary key in t1, does this insert order make it inherently in order in the storage layer? That seems like it would affect benchmark performance. Maybe that's ok, but it'd be good to be aware of.
Hm, I don't think it matters much since the main lookup query will perform the lookup join without any information about the ordering (since no ordering is provided by t1), and the query doesn't have the ORDER BY, so we will internally sort the requests (in the OutOfOrder mode of the streamer) to get pebble-level optimizations.
pkg/bench/bench_test.go
line 1233 at r1 (raw file):
Previously, rharding6373 (Rachael Harding) wrote…
Why does this query need the count and GROUP BY?
Good point, this is not needed, removed.
Force-pushed from 229ddd1 to 9bd3042
Made some minor improvements.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @rharding6373)
pkg/bench/bench_test.go
line 1210 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Hm, I don't think it matters much since the main lookup query will perform the lookup join without any information about the ordering (since no ordering is provided by t1), and the query doesn't have the ORDER BY, so we will internally sort the requests (in the OutOfOrder mode of the streamer) to get pebble-level optimizations.
Your point matters for the Ordering benchmarks where the streamer (at least currently) doesn't sort requests according to their keys, so with the current version of this PR we'd be benchmarking the best case (when the ordering provided by the streamer's user matches the ordering of keys). I think it's ok though, at least for now.
I too have some questions about the benchmarks, but no really substantive complaints.
Reviewed 3 of 3 files at r3, 2 of 2 files at r4, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rharding6373 and @yuzefovich)
pkg/kv/kvclient/kvstreamer/streamer.go
line 549 at r4 (raw file):
} else {
	r.reqsKeys = make([]roachpb.Key, len(r.reqs))
}

unimportant code golfing side comment: I've been thinking about this common little idiom. Wouldn't append handle this for us? In other words, wouldn't it be a little simpler to write something like:
r.reqsKeys = reqsKeysScratch[:0]
for i := range r.reqs {
	r.reqsKeys = append(r.reqsKeys, r.reqs[i].GetInner().Header().Key)
}
sort.Sort(&r)
reqsKeysScratch = r.reqsKeys
r.reqsKeys = nil
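As an aside for readers less familiar with this Go idiom, here is a minimal, self-contained sketch (the names are illustrative, not from the streamer code) of why appending into a reused scratch slice can replace the explicit make: append allocates only when the existing capacity is too small, so steady-state iterations reuse the same backing array.

```go
package main

import "fmt"

// fillSquares reuses the caller-provided scratch slice: reslicing it to length
// zero keeps its backing array, and append grows it only if the capacity is
// insufficient for n elements.
func fillSquares(scratch []int, n int) []int {
	out := scratch[:0]
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

func main() {
	var scratch []int
	for iter := 0; iter < 3; iter++ {
		scratch = fillSquares(scratch, 4)
		fmt.Println(scratch) // after the first iteration, no new allocation is needed
	}
}
```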
pkg/bench/bench_test.go
line 1209 at r3 (raw file):
defer log.Scope(b).Close(b)
ForEachDB(b, func(b *testing.B, db *sqlutils.SQLRunner) {
	db.Exec(b, `CREATE TABLE t1 (a INT)`)
Do we need to control the distribution of these tables during the benchmark? What if the KV layer initiates a split of one of them during the benchmark? Should we be explicit about whether this is testing local or distributed execution?
pkg/bench/bench_test.go
line 1245 at r3 (raw file):
db.Exec(b, `CREATE TABLE t1 (a INT)`)
db.Exec(b, `INSERT INTO t1 SELECT generate_series(1, 1000)`)
db.Exec(b, `CREATE TABLE t2 (a INT, INDEX (a))`)
Is the only difference between the EqColsAreKey benchmarks and these benchmarks a lookup join into a PK vs a lookup join into a secondary index? Are those different in the execution engine? I would expect them to be the same. Do we need different benchmarks?
pkg/bench/bench_test.go
line 1250 at r3 (raw file):
for i := 0; i < b.N; i++ {
	db.Exec(b, `SELECT t2.a FROM t1 INNER LOOKUP JOIN t2 ON t1.a = t2.a`)

nit: This uses SELECT t2.a instead of SELECT * like the other queries. Seems like it should also use SELECT * to return the same number of columns, for an apples-to-apples comparison.
This commit adds benchmarks for lookup joins, both when equality columns are and are not key, both with and without maintaining ordering.

Release note: None
This commit optimizes the `singleRangeBatch.Less` method, which is used when sorting the requests inside of these objects in the OutOfOrder mode (which is needed to get the low-level Pebble speedups), by storing the start keys explicitly instead of performing a couple of function calls on each `Less` invocation.

```
name                             old time/op    new time/op    delta
IndexJoin/Cockroach-24             6.30ms ± 1%    5.78ms ± 1%   -8.31%  (p=0.000 n=10+10)
IndexJoin/MultinodeCockroach-24    8.01ms ± 1%    7.51ms ± 1%   -6.28%  (p=0.000 n=10+10)

name                             old alloc/op   new alloc/op   delta
IndexJoin/Cockroach-24             1.55MB ± 0%    1.57MB ± 0%   +0.98%  (p=0.000 n=9+10)
IndexJoin/MultinodeCockroach-24    2.28MB ± 2%    2.30MB ± 1%     ~     (p=0.400 n=10+9)

name                             old allocs/op  new allocs/op  delta
IndexJoin/Cockroach-24              8.16k ± 1%     8.13k ± 1%     ~     (p=0.160 n=10+10)
IndexJoin/MultinodeCockroach-24     12.7k ± 1%     12.6k ± 0%     ~     (p=0.128 n=10+9)
```

```
name                                                     old time/op    new time/op    delta
LookupJoinEqColsAreKeyNoOrdering/Cockroach-24              6.89ms ± 1%    6.43ms ± 1%   -6.65%  (p=0.000 n=10+10)
LookupJoinEqColsAreKeyNoOrdering/MultinodeCockroach-24     8.03ms ± 1%    7.48ms ± 2%   -6.92%  (p=0.000 n=10+10)
LookupJoinNoOrdering/Cockroach-24                          9.21ms ± 3%    8.82ms ± 5%   -4.23%  (p=0.007 n=10+10)
LookupJoinNoOrdering/MultinodeCockroach-24                 11.9ms ± 3%    11.5ms ± 3%   -3.36%  (p=0.002 n=9+10)

name                                                     old alloc/op   new alloc/op   delta
LookupJoinEqColsAreKeyNoOrdering/Cockroach-24              1.81MB ± 1%    1.84MB ± 0%   +1.23%  (p=0.000 n=10+10)
LookupJoinEqColsAreKeyNoOrdering/MultinodeCockroach-24     2.50MB ± 2%    2.54MB ± 1%   +1.76%  (p=0.004 n=10+10)
LookupJoinNoOrdering/Cockroach-24                          1.89MB ± 0%    1.91MB ± 1%   +1.09%  (p=0.000 n=9+9)
LookupJoinNoOrdering/MultinodeCockroach-24                 2.37MB ± 2%    2.42MB ± 4%   +1.85%  (p=0.010 n=10+9)

name                                                     old allocs/op  new allocs/op  delta
LookupJoinEqColsAreKeyNoOrdering/Cockroach-24               10.8k ± 0%     10.8k ± 1%     ~     (p=0.615 n=10+10)
LookupJoinEqColsAreKeyNoOrdering/MultinodeCockroach-24      15.1k ± 1%     15.0k ± 0%     ~     (p=0.101 n=10+10)
LookupJoinNoOrdering/Cockroach-24                           13.3k ± 1%     13.3k ± 1%     ~     (p=0.549 n=10+9)
LookupJoinNoOrdering/MultinodeCockroach-24                  17.3k ± 1%     17.3k ± 1%     ~     (p=0.460 n=10+8)
```

Release note: None
Force-pushed from 9bd3042 to 76175be
TFTRs!
I think Rachael is on board with the change too, so I'll go ahead and merge.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2 and @rharding6373)
pkg/kv/kvclient/kvstreamer/streamer.go
line 549 at r4 (raw file):
Previously, michae2 (Michael Erickson) wrote…
unimportant code golfing side comment: I've been thinking about this common little idiom. Wouldn't append handle this for us? In other words, wouldn't it be a little simpler to write something like:

r.reqsKeys = reqsKeysScratch[:0]
for i := range r.reqs {
	r.reqsKeys = append(r.reqsKeys, r.reqs[i].GetInner().Header().Key)
}
sort.Sort(&r)
reqsKeysScratch = r.reqsKeys
r.reqsKeys = nil
Good point, I like it, done.
pkg/bench/bench_test.go
line 1209 at r3 (raw file):
Previously, michae2 (Michael Erickson) wrote…
Do we need to control the distribution of these tables during the benchmark? What if the KV layer initiates a split of one of them during the benchmark? Should we be explicit about whether this is testing local or distributed execution?
These queries follow the pattern of this file where we insert a small amount of data, so we know it'll be a single-range setup. I think that's ok since the goal of the benchmarks is the "speed of light" check. It's highly unlikely that the KV layer will split any of these tables into separate ranges since they are just so small, and the benchmarks don't run long enough for QPS-based splitting to kick in.
pkg/bench/bench_test.go
line 1245 at r3 (raw file):
Previously, michae2 (Michael Erickson) wrote…
Is the only difference between the EqColsAreKey benchmarks and these benchmarks a lookup join into a PK vs a lookup join into a secondary index? Are those different in the execution engine? I would expect them to be the same. Do we need different benchmarks?
The important difference is a lookup join into a unique index (which the PK is) vs a lookup join into a non-unique index. The uniqueness property guarantees that a single lookup will produce at most one row, and in the execution engine we use different strategies because of this property. In particular, in the non-streamer code path we parallelize KV batches IFF we have this "at most single row lookup" property. This is the case with the index joins as well as lookup joins when equality columns form a key. In the streamer code path, the behavior is also different because we propagate this property as the Hints.SingleRowLookup hint, and the streamer can simplify some of the bookkeeping if that is true.
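To make the distinction concrete, here is a rough sketch in the style of the quoted bench_test.go code (the schema of the unique-index variant and the table names here are assumptions based on this discussion, not the actual benchmark source):

```go
ForEachDB(b, func(b *testing.B, db *sqlutils.SQLRunner) {
	db.Exec(b, `CREATE TABLE t1 (a INT)`)
	db.Exec(b, `INSERT INTO t1 SELECT generate_series(1, 1000)`)

	// Equality column forms a key: a lookup into a unique index (here a
	// primary key) returns at most one row, so the execution engine can rely
	// on the "at most single row lookup" property.
	db.Exec(b, `CREATE TABLE t2_pk (a INT PRIMARY KEY)`)
	db.Exec(b, `INSERT INTO t2_pk SELECT generate_series(1, 1000)`)

	// Non-unique secondary index: a single lookup may return many rows, so
	// the engine cannot assume that property.
	db.Exec(b, `CREATE TABLE t2 (a INT, INDEX (a))`)
	db.Exec(b, `INSERT INTO t2 SELECT generate_series(1, 1000)`)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		db.Exec(b, `SELECT * FROM t1 INNER LOOKUP JOIN t2_pk ON t1.a = t2_pk.a`) // EqColsAreKey variant
		db.Exec(b, `SELECT * FROM t1 INNER LOOKUP JOIN t2 ON t1.a = t2.a`)       // non-unique-index variant
	}
})
```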
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2, @rharding6373, and @yuzefovich)
pkg/bench/bench_test.go
line 1209 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
These queries follow the pattern of this file where we insert a small amount of data, so we know it'll be a single-range setup. I think that's ok since the goal of the benchmarks is the "speed of light" check. It's highly unlikely that the KV layer will split any of these tables into separate ranges since they are just so small, and the benchmarks don't run long enough for QPS-based splitting to kick in.
Makes sense!
pkg/bench/bench_test.go
line 1245 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
The important difference is a lookup join into a unique index (which the PK is) vs a lookup join into a non-unique index. The uniqueness property guarantees that a single lookup will produce at most one row, and in the execution engine we use different strategies because of this property. In particular, in the non-streamer code path we parallelize KV batches IFF we have this "at most single row lookup" property. This is the case with the index joins as well as lookup joins when equality columns form a key. In the streamer code path, the behavior is also different because we propagate this property as the Hints.SingleRowLookup hint, and the streamer can simplify some of the bookkeeping if that is true.
Ahhh, right, uniqueness! Thank you!
Build succeeded.
bench: add benchmarks for lookup joins
This commit adds benchmarks for lookup joins, both when equality columns
are and are not key, both with and without maintaining ordering.
Release note: None
kvstreamer: optimize singleRangeBatch.Less
This commit optimizes the singleRangeBatch.Less method, which is used when
sorting the requests inside of these objects in the OutOfOrder mode
(which is needed to get the low-level Pebble speedups), by storing the
start keys explicitly instead of performing a couple of function calls
on each Less invocation.

Addresses: #82159

Release note: None
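
To illustrate the change described in this commit message, here is a minimal, self-contained sketch (simplified stand-in types, not the actual kvstreamer code) of the before/after shape of the Less method and of populating the precomputed keys before sorting:

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// req stands in for a KV request; Key plays the role of the
// GetInner().Header().Key call chain mentioned in the review above.
type req struct{ key []byte }

func (r req) Key() []byte { return r.key }

type singleRangeBatch struct {
	reqs     []req
	reqsKeys [][]byte // start keys extracted once, before sorting
}

func (r *singleRangeBatch) Len() int { return len(r.reqs) }

// Before the optimization, each Less call re-derived both keys via method calls:
//
//	return bytes.Compare(r.reqs[i].Key(), r.reqs[j].Key()) < 0
//
// After the optimization, Less compares the precomputed keys directly.
func (r *singleRangeBatch) Less(i, j int) bool {
	return bytes.Compare(r.reqsKeys[i], r.reqsKeys[j]) < 0
}

func (r *singleRangeBatch) Swap(i, j int) {
	r.reqs[i], r.reqs[j] = r.reqs[j], r.reqs[i]
	r.reqsKeys[i], r.reqsKeys[j] = r.reqsKeys[j], r.reqsKeys[i]
}

func main() {
	r := singleRangeBatch{reqs: []req{{[]byte("c")}, {[]byte("a")}, {[]byte("b")}}}
	// Populate the keys once (cf. the append-into-scratch idiom discussed above).
	for i := range r.reqs {
		r.reqsKeys = append(r.reqsKeys, r.reqs[i].Key())
	}
	sort.Sort(&r)
	for _, rq := range r.reqs {
		fmt.Println(string(rq.Key())) // prints a, b, c
	}
}
```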