colexecjoin: optimize merge/cross joins #68921

yuzefovich · 2021-08-13T22:16:25Z

colexecjoin: make cross/merge join streaming with regards to left input

This commit refactors the cross and merge joins to be streaming with
regard to the left input. Previously, we used two spilling queues to
fully consume both inputs before building the cross product (in the
merge join, this is needed when building from the buffered group).

That approach is suboptimal because buffering only one side is
sufficient, so this commit switches the cross join builder to operate in
a streaming fashion with regard to the left input. This is done by
building all result rows that correspond to the current left batch
before proceeding to the next left batch, which significantly reduces
the amount of copying and thus improves performance.
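
As a rough illustration of the streaming idea (not the actual operator
code), the sketch below buffers only the right side and emits every
output row for the current left batch before pulling the next one. The
`row` type, `streamingCrossJoin`, and the callback-based input are
hypothetical simplifications; the real joiner works on columnar batches
and a spilling queue.

```go
package main

// row is a hypothetical stand-in for a tuple; the real operators work on
// columnar batches rather than row slices.
type row []int

// streamingCrossJoin assumes the right input is already fully buffered.
// It pulls the left input one batch at a time and emits all result rows
// for that batch before requesting the next one, so the left side is
// never buffered as a whole.
func streamingCrossJoin(nextLeft func() []row, right []row, emit func(row)) {
	for batch := nextLeft(); len(batch) > 0; batch = nextLeft() {
		for _, l := range batch {
			for _, r := range right {
				out := make(row, 0, len(l)+len(r))
				out = append(out, l...)
				out = append(out, r...)
				emit(out)
			}
		}
	}
}
```

Here `nextLeft` returns an empty batch to signal that the left input is
exhausted, mirroring the zero-length batch convention in a simplified
form.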

Fixes: #67816.

Release note: None

colexecjoin: improve probing in the merge joiner with nulls

For non-set-operation joins, whenever we have nulls in both columns we
can advance both pointers since neither of the rows will have a match.
This commit takes advantage of this observation and also refactors the
probing mechanism a bit (hopefully making it cleaner).
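
A minimal sketch of that observation, assuming a single nullable integer
join key and an inner join. The `key` type and `mergeProbe` are
hypothetical names for illustration and do not reflect the joiner's
actual probing code.

```go
package main

// key is a hypothetical nullable join key; valid == false means NULL.
type key struct {
	val   int
	valid bool
}

// mergeProbe returns (leftIdx, rightIdx) pairs of matching rows for an
// inner join over two inputs sorted on the key.
func mergeProbe(left, right []key) [][2]int {
	var out [][2]int
	l, r := 0, 0
	for l < len(left) && r < len(right) {
		lk, rk := left[l], right[r]
		switch {
		case !lk.valid && !rk.valid:
			// NULL never equals anything in a non-set-operation join, so
			// neither row can match: advance both pointers at once.
			l++
			r++
		case !lk.valid:
			l++
		case !rk.valid:
			r++
		case lk.val < rk.val:
			l++
		case lk.val > rk.val:
			r++
		default:
			// Equal keys: find the end of the right group, then pair it
			// with every left row that carries the same key.
			end := r
			for end < len(right) && right[end].valid && right[end].val == lk.val {
				end++
			}
			for ; l < len(left) && left[l].valid && left[l].val == lk.val; l++ {
				for j := r; j < end; j++ {
					out = append(out, [2]int{l, j})
				}
			}
			r = end
		}
	}
	return out
}
```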

Release note: None

colexecjoin: avoid buffering tuples from the right in merge joiner

Depending on the join type, we don't need to fully buffer the tuples
from the right input in order to produce the output. Namely, for
set-operation joins we only need to know the number of right tuples
whereas for LEFT SEMI and RIGHT ANTI we know exactly the behavior of the
builder for the buffered group.
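
To illustrate why a count is enough for set-operation joins, the sketch
below computes how many copies of a left tuple are emitted for one group
of equal tuples under standard SQL bag semantics. The `setOp` type and
`setOpOutputCount` are hypothetical names, and applying the formula per
buffered group is an assumption about how the count would be used, not
a description of the joiner's code.

```go
package main

// setOp is a hypothetical enum for the two set-operation join types.
type setOp int

const (
	intersectAll setOp = iota
	exceptAll
)

// setOpOutputCount returns how many rows are emitted for a group that has
// numLeft equal tuples on the left and numRight equal tuples on the right.
// Only the size of the right group matters, not its contents.
func setOpOutputCount(op setOp, numLeft, numRight int) int {
	switch op {
	case intersectAll:
		// INTERSECT ALL keeps min(numLeft, numRight) copies.
		if numRight < numLeft {
			return numRight
		}
		return numLeft
	case exceptAll:
		// EXCEPT ALL keeps max(numLeft-numRight, 0) copies.
		if numLeft > numRight {
			return numLeft - numRight
		}
		return 0
	default:
		panic("unsupported set operation")
	}
}
```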

Release note: None

colexecjoin: remove a copy when buffering the right group

Previously, before enqueueing the tuples from the right buffered group
into the spilling queue we would perform a deep copy. This is overkill
because the spilling queue itself performs the deep copy. This commit
refactors the enqueueing code to modify the right batch directly to
include only the tuples from the group.
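
One way to picture this, as a sketch under stated assumptions: the batch
is narrowed in place with a selection vector and then handed to a queue
that copies on enqueue, so no separate copy is made beforehand. The
`batch`, `queue`, and `restrictToGroup` names are hypothetical, and
whether the real code uses a selection vector for this is an assumption.

```go
package main

// batch is a hypothetical single-column batch. A nil sel means that all
// rows in col are selected.
type batch struct {
	col []int
	sel []int
}

// queue is a hypothetical stand-in for a spilling queue that deep-copies
// the selected rows on enqueue.
type queue struct {
	rows [][]int
}

// enqueue copies the selected rows out of b, so the caller may reuse b.
func (q *queue) enqueue(b *batch) {
	if b.sel == nil {
		q.rows = append(q.rows, append([]int(nil), b.col...))
		return
	}
	copied := make([]int, 0, len(b.sel))
	for _, idx := range b.sel {
		copied = append(copied, b.col[idx])
	}
	q.rows = append(q.rows, copied)
}

// restrictToGroup modifies b in place so that only rows [start, end) are
// selected. No row data is copied here.
func restrictToGroup(b *batch, start, end int) {
	b.sel = b.sel[:0]
	for i := start; i < end; i++ {
		b.sel = append(b.sel, i)
	}
}
```

With this shape the only copy happens inside `enqueue`, which is the
point the commit message makes about the spilling queue.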

Release note: None

yuzefovich · 2021-08-17T04:54:59Z

The benchmarks are here.

yuzefovich · 2021-08-17T04:56:49Z

@michae2 tagging you as the main reviewer, but if anyone else is interested in diving into the vectorized merge joiner - please let me know :) (I think only Jordan had some context earlier and probably nobody but me still has it :/ )

yuzefovich · 2021-09-09T23:23:53Z

Rebased on top of master. PTAL.

yuzefovich · 2021-09-30T18:49:54Z

cc @rytaft @cucaroach

yuzefovich · 2021-10-06T03:19:42Z

Rebased on top of master just in case.

michae2

Still reading, just noticed one thing so far.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @yuzefovich)

pkg/sql/colexec/colexecjoin/crossjoiner.go, line 45 at r14 (raw file):

"set-op cross joins are invalid"

Are you sure this is true? 😈 For example:

CREATE TABLE t ();
CREATE TABLE u ();
INSERT INTO t DEFAULT VALUES;
INSERT INTO t DEFAULT VALUES;
INSERT INTO u DEFAULT VALUES;
SELECT * FROM t INTERSECT ALL SELECT * FROM u;

Looks like it uses cross join:

[email protected]:26257/defaultdb> EXPLAIN SELECT * FROM t INTERSECT ALL SELECT * FROM u;
                                        info
------------------------------------------------------------------------------------
  distribution: local
  vectorized: true

  • intersect all
  │
  ├── • scan
  │     estimated row count: 2 (100% of the table; stats collected 6 seconds ago)
  │     table: t@t_pkey
  │     spans: FULL SCAN
  │
  └── • scan
        missing stats
        table: u@u_pkey
        spans: FULL SCAN
(14 rows)


Time: 1ms total (execution 1ms / network 0ms)

[email protected]:26257/defaultdb> EXPLAIN (VEC) SELECT * FROM t INTERSECT ALL SELECT * FROM u;
               info
----------------------------------
  │
  └ Node 1
    └ *colexecjoin.crossJoiner
      ├ *colfetcher.ColBatchScan
      └ *colfetcher.ColBatchScan
(5 rows)

yuzefovich

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2)

pkg/sql/colexec/colexecjoin/crossjoiner.go, line 45 at r14 (raw file):

Previously, michae2 (Michael Erickson) wrote…

"set-op cross joins are invalid"

Indeed, nice catch! Fixed.

It's interesting that the set-op cross join occurs only when the tables don't have any visible columns and the query doesn't explicitly specify hidden columns to be returned.

yuzefovich

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @rytaft)

pkg/sql/colexec/colexecjoin/crossjoiner.go, line 199 at r34 (raw file):