Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rowexec: improve LEFT SEMI/ANTI lookup join performance #49790

Closed
2 tasks done
asubiotto opened this issue Jun 2, 2020 · 3 comments
Closed
2 tasks done

rowexec: improve LEFT SEMI/ANTI lookup join performance #49790

asubiotto opened this issue Jun 2, 2020 · 3 comments
Assignees
Labels
A-sql-execution Relating to SQL execution. C-performance Perf of queries or internals. Solution not expected to change functional behavior.

Comments

@asubiotto
Copy link
Contributor

asubiotto commented Jun 2, 2020

Note: throughout this issue I refer to the lookup side (i.e. the rows we're looking up results for) as the left side and the looked up side (i.e. the table we're performing lookups against) as the right side.

LEFT SEMI/ANTI joins are joins where only the values from the left side need to be output. Currently, the joinReader creates a ScanRequest for each row in the left side, which returns values unnecessarily from the right side. In an ideal world, we'd only want to check if there is a matching row for a row on the left side. The reason the performance of these cases is important is that they're used for FK checks, performance improvements here could be observed in a variety of workloads.

We could do a couple of optimizations:

  • Merge lookups with the same equality columns. Multiple rows on the left side could have the same values for the equality, or join, columns. This means that we only need to perform a lookup for one of these rows to know whether or not there is a match for all other rows.
  • Stop using ScanRequests. There are two alternatives:
    • GetRequests (performance difference is explained here sql: perform point key-value reads where possible #46758). The advantage is that this API already exists in the codebase. The disadvantage is that we'll still be fetching unnecessary values.
    • ExistsRequest (sql: use the new CheckExistsRequest KV operation for FK existence checks #33472). The advantage is that unneeded values are no longer fetched. The disadvantage is that this API is still a prototype and this will currently be the only use of it. The work will probably need to be done rebased on top of the prototype API and demonstrate performance gains in order to justify adding that API to production. Another advantage using ExistsRequest opens the door to caching results at the transaction coordinator layer that will cheapen redundant existence checks (this work is out of scope).

I think it would be worth it to implement both the GetRequests and ExistsRequests and benchmark them against ScanRequests and each other to get an idea for the performance differences.

Some useful benchmarks to see how the performance changes:

  • Small benchmarks in pkg/sql/opt/bench/bench_test.go
  • TPCC with foreign keys enabled.
@blathers-crl
Copy link

blathers-crl bot commented Jun 2, 2020

Hi @asubiotto, please add a C-ategory label to your issue. Check out the label system docs.

While you're here, please consider adding an A- label to help keep our repository tidy.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@asubiotto asubiotto added A-sql-execution Relating to SQL execution. C-performance Perf of queries or internals. Solution not expected to change functional behavior. labels Jun 2, 2020
@asubiotto
Copy link
Contributor Author

From @helenmhe in #53818:

CheckExistsRequest

CheckExistsRequest is currently implemented with a flag checking for types that don’t require right cols, namely left semi/anti joins. Currently in the kv_batch_fetcher, I set ba.Header.MaxSpanRequestKeys and ba.Header.TargetBytes to 0, (not limiting the number of rows returned since we aren’t really returning rows). When testing locally it seemed that this had a large impact on performance (parallel performed ~20% better than batch size limited), but I wasn’t able to recreate anything with a low p-value on gceworker.

Testing

Switching to the inner loop in https://github.com/cockroachdb/cockroach/pull/53669/files#diff-2793c01ca6dd699f8ad1e26f24f700dbR1182 seems to have increased variance, compared to before when I was using -count 100 or so to modify b.N, so despite running faster I don’t know if it’s as desirable for getting statistically significant results? Results with gceworker for some reason tended to have slightly more variance, leading to having very high p-values so unfortunately I didn’t get any worthwhile results on the gceworker after making this change.

The results I got prior to making the benchmark changes are shown below:

name                            old time/op    new time/op    delta
LeftSemiJoin/SingleRow/None-12    13.1ms ±13%    12.9ms ± 9%  -1.73%  (p=0.005 n=86+100)

name                            old alloc/op   new alloc/op   delta
LeftSemiJoin/SingleRow/None-12    4.61MB ± 2%    4.57MB ± 1%  -0.87%  (p=0.000 n=97+98)

name                            old allocs/op  new allocs/op  delta
LeftSemiJoin/SingleRow/None-12     28.4k ± 1%     30.3k ± 1%  +6.85%  (p=0.000 n=90+97)

Along with the corresponding profile call_trees:

Master

pprof_master

CheckExists

pprof_ce

Get

Screen Shot 2020-09-03 at 3 30 16 PM

To force the GetRequest code path in BenchmarkLeftSemiJoin I had to create the foo index as UNIQUE.

Overall from the testing, it was a bit difficult to isolate the difference between using Scan vs CheckExists even in a benchmark designed to stress the left semi join. I’m not entirely sure why fk_test.go wasn’t hitting this codepath as much, but it could be worthwhile also investigating that.

@asubiotto
Copy link
Contributor Author

The conclusion here is that it doesn't seem like it'll be a huge performance win, so closing this issue. We can revisit and use this as a starting point in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-execution Relating to SQL execution. C-performance Perf of queries or internals. Solution not expected to change functional behavior.
Projects
None yet
Development

No branches or pull requests

2 participants