Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: support locality optimized anti join #63044

Merged
merged 3 commits into from
Apr 5, 2021

Conversation

rytaft
Copy link
Collaborator

@rytaft rytaft commented Apr 2, 2021

opt: fix a few omissions for locality optimized search

This commit fixes a few omissions where locality optimized search was
not included when it should have been. These omissions only have a small
impact on the cost so they were not noticeable.

opt: refactor and move some functions for locality optimized search

This commit refactors some functions previously in xform/scan_funcs.go
for GenerateLocalityOptimizedScan and moves them to xform/general_funcs.go
so they can be used to generate locality optimized anti joins in a future
commit.

opt: support locality optimized anti join

Shout out to @RaduBerinde for this idea!

This commit adds a new transformation rule, GenerateLocalityOptimizedAntiJoin.
GenerateLocalityOptimizedAntiJoin converts an anti join into a locality
optimized anti join if possible. A locality optimized anti join is implemented
as a nested pair of anti lookup joins and is designed to avoid communicating
with remote nodes (relative to the gateway region) if at all possible.

A locality optimized anti join can be planned under the following conditions:

  • The anti join can be planned as a lookup join.
  • The lookup join scans multiple spans in the lookup index for each input
    row, with some spans targeting partitions on local nodes (relative to the
    gateway region), and some targeting partitions on remote nodes. It is not
    known which span(s) will contain the matching row(s).

The result of GenerateLocalityOptimizedAntiJoin will be a nested pair of anti
lookup joins in which the first lookup join is an anti join targeting the
local values from the original join, and the second lookup join is an anti
join targeting the remote values. Because of the way anti join is defined, a
row will only be returned by the first anti join if a match is not found
locally. If a match is found, no row will be returned and therefore the second
lookup join will not need to search the remote nodes. This nested pair of anti
joins is logically equivalent to the original, single anti join.

This is a useful optimization if there is locality of access in the workload,
such that rows tend to be accessed from the region where they are located. If
there is no locality of access, using a locality optimized anti join could be
a slight pessimization, since rows residing in remote regions will be found
slightly more slowly than they would be otherwise.

For example, suppose we have a multi-region database with regions 'us-east1',
'us-west1' and 'europe-west1', and we have the following tables and query,
issued from 'us-east1':

  CREATE TABLE parent (
    p_id INT PRIMARY KEY
  ) LOCALITY REGIONAL BY ROW;

  CREATE TABLE child (
    c_id INT PRIMARY KEY,
    c_p_id INT REFERENCES parent (p_id)
  ) LOCALITY REGIONAL BY ROW;

  SELECT * FROM child WHERE NOT EXISTS (
    SELECT * FROM parent WHERE p_id = c_p_id
  ) AND c_id = 10;

Normally, this would produce the following plan:

  anti-join (lookup parent)
   ├── lookup columns are key
   ├── lookup expr: (p_id = c_p_id) AND (crdb_region IN ('europe-west1', 'us-east1', 'us-west1'))
   ├── scan child
   │    └── constraint: /7/5
   │         ├── [/'europe-west1'/10 - /'europe-west1'/10]
   │         ├── [/'us-east1'/10 - /'us-east1'/10]
   │         └── [/'us-west1'/10 - /'us-west1'/10]
   └── filters (true)

but if the session setting locality_optimized_partitioned_index_scan is enabled,
the optimizer will produce this plan, using locality optimized search, both for
the scan of child and for the lookup join with parent. See the rule
GenerateLocalityOptimizedScan for details about how the optimization is applied
for scans.

  anti-join (lookup parent)
   ├── lookup columns are key
   ├── lookup expr: (p_id = c_p_id) AND (crdb_region IN ('europe-west1', 'us-west1'))
   ├── anti-join (lookup parent)
   │    ├── lookup columns are key
   │    ├── lookup expr: (p_id = c_p_id) AND (crdb_region = 'us-east1')
   │    ├── locality-optimized-search
   │    │    ├── scan child
   │    │    │    └── constraint: /13/11: [/'us-east1'/10 - /'us-east1'/10]
   │    │    └── scan child
   │    │         └── constraint: /18/16
   │    │              ├── [/'europe-west1'/10 - /'europe-west1'/10]
   │    │              └── [/'us-west1'/10 - /'us-west1'/10]
   │    └── filters (true)
   └── filters (true)

As long as child.c_id = 10 and the matching row in parent are both located in
'us-east1', the second plan will be much faster. But if they are located in
one of the other regions, the first plan would be slightly faster.

Informs #55185

Release note (performance improvement): The optimizer will now try
to plan anti lookup joins using "locality optimized search". This optimization
applies for anti lookup joins into REGIONAL BY ROW tables (i.e., the right side
of the join is a REGIONAL BY ROW table), and if enabled, it means that the
execution engine will first search locally for matching rows before searching
remote nodes. If a matching row is found in a local node, remote nodes will not
be searched. This optimization may improve the performance of foreign key
checks when rows are inserted or updated in a table that references a foreign
key in a REGIONAL BY ROW table.

@rytaft rytaft requested review from mgartner, RaduBerinde and a team April 2, 2021 22:09
@rytaft rytaft requested a review from a team as a code owner April 2, 2021 22:09
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rytaft rytaft changed the title Locality opt antijoin opt: support locality optimized anti join Apr 2, 2021
Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonderful @rytaft !
The release note make it sound as if locality_optimized_partitioned_index_scan is not on by default. You think there's good reason for this?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)

@rytaft rytaft force-pushed the locality-opt-antijoin branch from bdabec3 to 41f5605 Compare April 2, 2021 22:51
@rytaft
Copy link
Collaborator Author

rytaft commented Apr 2, 2021

locality_optimized_partitioned_index_scan is on by default -- I'll take it out of the release note to avoid confusion. Thanks!

rytaft added 2 commits April 2, 2021 18:24
This commit fixes a few omissions where locality optimized search was
not included when it should have been. These omissions only have a small
impact on the cost so they were not noticeable.

Release note: None
This commit refactors some functions previously in xform/scan_funcs.go
for GenerateLocalityOptimizedScan and moves them to xform/general_funcs.go
so they can be used to generate locality optimized anti joins in a future
commit.

Release note: None
@rytaft rytaft force-pushed the locality-opt-antijoin branch from 41f5605 to 6862953 Compare April 2, 2021 23:24
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!!

:lgtm:

Reviewed 1 of 4 files at r1, 4 of 4 files at r2, 14 of 14 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)


pkg/sql/opt/exec/factory.opt, line 227 at r3 (raw file):

# If LocalityOptimized is true, we are performing a locality optimized search.
# In order for this to work correctly, the execution engine must create a local
# DistSQL plan for the main query (subqueries and postqueries need not be local).

A bit unclear to me why it has to be a local plan.. Lookup joins are planned on whatever nodes their input is. If the table we are reading from is in our region, it would be planned in our region anyway.

In any case, this can be part of a postquery so this is a bit confusing, maybe just say more generically that it needs to be part of a local plan.

Also [nit] this would be better as a comment on the field itself.


pkg/sql/opt/xform/join_funcs.go, line 1132 at r3 (raw file):

// rule to hold two sets of filters: one targeting local partitions and one
// targeting remote partitions.
type LocalAndRemoteLookupExprs struct {

We can't get @mgartner's multi-var syntax soon enough :)


pkg/sql/opt/xform/join_funcs.go, line 1289 at r3 (raw file):

			continue
		}
		col, _ := props.OuterCols.Next(0)

[nit] you can use .SingleColumn()


pkg/sql/opt/xform/rules/join.opt, line 403 at r3 (raw file):

# one of the other regions, the first plan would be slightly faster.
[GenerateLocalityOptimizedAntiJoin, Explore]
(LookupJoin

This is a very clean rule, nicely done!

@rytaft rytaft force-pushed the locality-opt-antijoin branch 2 times, most recently from ee7d040 to 76025b8 Compare April 5, 2021 18:20
Copy link
Collaborator Author

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR! Updated.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @RaduBerinde)


pkg/sql/opt/exec/factory.opt, line 227 at r3 (raw file):

Previously, RaduBerinde wrote…

A bit unclear to me why it has to be a local plan.. Lookup joins are planned on whatever nodes their input is. If the table we are reading from is in our region, it would be planned in our region anyway.

In any case, this can be part of a postquery so this is a bit confusing, maybe just say more generically that it needs to be part of a local plan.

Also [nit] this would be better as a comment on the field itself.

Hmm I see what you're saying. I'm a bit torn since the optimization isn't useful if it's not planned in the local region, but if we have to fetch data from a remote region for a local scan that's also not ideal.... I guess it's better to minimize the number of changes needed on the execution side, so I've removed this.


pkg/sql/opt/xform/join_funcs.go, line 1132 at r3 (raw file):

Previously, RaduBerinde wrote…

We can't get @mgartner's multi-var syntax soon enough :)

+1!


pkg/sql/opt/xform/join_funcs.go, line 1289 at r3 (raw file):

Previously, RaduBerinde wrote…

[nit] you can use .SingleColumn()

Done.


pkg/sql/opt/xform/rules/join.opt, line 403 at r3 (raw file):

Previously, RaduBerinde wrote…

This is a very clean rule, nicely done!

Thanks!

This commit adds a new transformation rule, GenerateLocalityOptimizedAntiJoin.
GenerateLocalityOptimizedAntiJoin converts an anti join into a locality
optimized anti join if possible. A locality optimized anti join is implemented
as a nested pair of anti lookup joins and is designed to avoid communicating
with remote nodes (relative to the gateway region) if at all possible.

A locality optimized anti join can be planned under the following conditions:
 - The anti join can be planned as a lookup join.
 - The lookup join scans multiple spans in the lookup index for each input
   row, with some spans targeting partitions on local nodes (relative to the
   gateway region), and some targeting partitions on remote nodes. It is not
   known which span(s) will contain the matching row(s).

The result of GenerateLocalityOptimizedAntiJoin will be a nested pair of anti
lookup joins in which the first lookup join is an anti join targeting the
local values from the original join, and the second lookup join is an anti
join targeting the remote values. Because of the way anti join is defined, a
row will only be returned by the first anti join if a match is *not* found
locally. If a match is found, no row will be returned and therefore the second
lookup join will not need to search the remote nodes. This nested pair of anti
joins is logically equivalent to the original, single anti join.

This is a useful optimization if there is locality of access in the workload,
such that rows tend to be accessed from the region where they are located. If
there is no locality of access, using a locality optimized anti join could be
a slight pessimization, since rows residing in remote regions will be found
slightly more slowly than they would be otherwise.

For example, suppose we have a multi-region database with regions 'us-east1',
'us-west1' and 'europe-west1', and we have the following tables and query,
issued from 'us-east1':

  CREATE TABLE parent (
    p_id INT PRIMARY KEY
  ) LOCALITY REGIONAL BY ROW;

  CREATE TABLE child (
    c_id INT PRIMARY KEY,
    c_p_id INT REFERENCES parent (p_id)
  ) LOCALITY REGIONAL BY ROW;

  SELECT * FROM child WHERE NOT EXISTS (
    SELECT * FROM parent WHERE p_id = c_p_id
  ) AND c_id = 10;

Normally, this would produce the following plan:

  anti-join (lookup parent)
   ├── lookup columns are key
   ├── lookup expr: (p_id = c_p_id) AND (crdb_region IN ('europe-west1', 'us-east1', 'us-west1'))
   ├── scan child
   │    └── constraint: /7/5
   │         ├── [/'europe-west1'/10 - /'europe-west1'/10]
   │         ├── [/'us-east1'/10 - /'us-east1'/10]
   │         └── [/'us-west1'/10 - /'us-west1'/10]
   └── filters (true)

but if the session setting locality_optimized_partitioned_index_scan is enabled,
the optimizer will produce this plan, using locality optimized search, both for
the scan of child and for the lookup join with parent. See the rule
GenerateLocalityOptimizedScan for details about how the optimization is applied
for scans.

  anti-join (lookup parent)
   ├── lookup columns are key
   ├── lookup expr: (p_id = c_p_id) AND (crdb_region IN ('europe-west1', 'us-west1'))
   ├── anti-join (lookup parent)
   │    ├── lookup columns are key
   │    ├── lookup expr: (p_id = c_p_id) AND (crdb_region = 'us-east1')
   │    ├── locality-optimized-search
   │    │    ├── scan child
   │    │    │    └── constraint: /13/11: [/'us-east1'/10 - /'us-east1'/10]
   │    │    └── scan child
   │    │         └── constraint: /18/16
   │    │              ├── [/'europe-west1'/10 - /'europe-west1'/10]
   │    │              └── [/'us-west1'/10 - /'us-west1'/10]
   │    └── filters (true)
   └── filters (true)

As long as child.c_id = 10 and the matching row in parent are both located in
'us-east1', the second plan will be much faster. But if they are located in
one of the other regions, the first plan would be slightly faster.

Informs cockroachdb#55185

Release note (performance improvement): The optimizer will now try
to plan anti lookup joins using "locality optimized search". This optimization
applies for anti lookup joins into REGIONAL BY ROW tables (i.e., the right side
of the join is a REGIONAL BY ROW table), and if enabled, it means that the
execution engine will first search locally for matching rows before searching
remote nodes. If a matching row is found in a local node, remote nodes will not
be searched. This optimization may improve the performance of foreign key
checks when rows are inserted or updated in a table that references a foreign
key in a REGIONAL BY ROW table.
@rytaft rytaft force-pushed the locality-opt-antijoin branch from 76025b8 to 01ea5a2 Compare April 5, 2021 18:30
@rytaft
Copy link
Collaborator Author

rytaft commented Apr 5, 2021

bors r+

@craig
Copy link
Contributor

craig bot commented Apr 5, 2021

Build succeeded:

@craig craig bot merged commit c3cfb02 into cockroachdb:master Apr 5, 2021
Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 Very cool!

Reviewed 4 of 4 files at r1, 4 of 4 files at r2, 12 of 14 files at r3, 10 of 10 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/opt/xform/join_funcs.go, line 1132 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

+1!

👀 The catalyst for me starting that work was @rytaft's "watch me work" stream when she started working on LOS anti-joins :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants