Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: unnecessary columns are added to the input of update operations #57097

Open
mgartner opened this issue Nov 25, 2020 · 2 comments
Open

opt: unnecessary columns are added to the input of update operations #57097

mgartner opened this issue Nov 25, 2020 · 2 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team

Comments

@mgartner
Copy link
Collaborator

mgartner commented Nov 25, 2020

In some situations, exec builder is adding unnecessary columns to mutation inputs here.

To reproduce the issue (I've used print statements here for those who don't have a debugger setup):

  1. Add the following lines at the top of the updateNode.processSource function in pkg/sql/update.go:
fmt.Printf("num fetch cols: %d\n", len(u.run.tu.ru.FetchCols))
fmt.Printf("num update cols: %d\n", len(u.run.tu.ru.UpdateCols))
fmt.Printf("num source vals: %d\n", len(sourceVals))
fmt.Printf("sourceVals: %v\n\n", sourceVals)
  1. Run the following logic test with TESTFLAGS='-v':
# LogicTest: local

statement ok
CREATE TABLE groups (
    id INT PRIMARY KEY
);
CREATE TABLE users (
    id INT PRIMARY KEY,
    group_id INT REFERENCES groups ON UPDATE CASCADE,
    b BOOL,
    FAMILY (id, group_id, b)
);

statement ok
INSERT INTO groups VALUES (1);
INSERT INTO users VALUES (10, 1, true), (20, 1, false);
UPDATE groups SET id = 2 WHERE id = 1;
  1. Notice the following lines of output:
num fetch cols: 3
num update cols: 1
num source vals: 5
sourceVals: [10 1 true 2 1]

num fetch cols: 3
num update cols: 1
num source vals: 5
sourceVals: [20 1 false 2 1]

These are printed when the cascading FK update rows are processed. There are 3 fetch cols and 1 update col, but sourceVals includes an unnecessary fifth value. AFAICT this value (1 in this case) is never used.

Jira issue: CRDB-2865

@mgartner mgartner added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Nov 25, 2020
mgartner added a commit to mgartner/cockroach that referenced this issue Nov 25, 2020
This commit fixes several bugs that are present when using foreign key
cascading updates and partial indexes.

Previously, the optimizer was not synthesizing partial index DEL columns
for FK cascading updates. As a result, a partial index on a child table
could become inconsistent with the rows in the primary index,
ultimately resulting in incorrect query results. The optbuilder has
been refactored to project these columns and to reduce the complexity
of doing so. As a result, partial index PUT and DEL columns are now
projected in the same expression, rather than the DEL columns being
projected as far down in the expression tree as possible.

Additionally, the execution engine was unable to handle extraneous
columns that can be added as input to FK cascading updates. These
extraneous columns would be incorrectly interpreted as synthesized
partial index columns. This commit works around this issue by passing a
set of partial indexes that columns have been synthesized for from the
optimizer to the execution engine. The longer term fix is to not produce
these columns (see issue cockroachdb#57097).

Fixes cockroachdb#57085
Fixes cockroachdb#57084

Release justification: This is a critical bug fix to a new feature,
partial indexes.

Release note (bug fix): Previously, updating parent table of a foreign
key relationship with cascading updates could cause errors or
inconsistencies in partial indexes for child tables with partial
indexes. The inconsistency of partial indexes could result in incorrect
query results. This has been fixed.
@RaduBerinde
Copy link
Member

Below is the output from build-cascades. I think it's just the result of doing the join and we didn't bother projecting away what we don't need..

build-cascades
UPDATE groups SET id = 2 WHERE id = 1;
----
root
 ├── update groups
 │    ├── columns: <none>
 │    ├── fetch columns: id:3
 │    ├── update-mapping:
 │    │    └── id_new:5 => id:1
 │    ├── input binding: &1
 │    ├── cascades
 │    │    └── fk_group_id_ref_groups
 │    └── project
 │         ├── columns: id_new:5!null id:3!null crdb_internal_mvcc_timestamp:4
 │         ├── select
 │         │    ├── columns: id:3!null crdb_internal_mvcc_timestamp:4
 │         │    ├── scan groups
 │         │    │    └── columns: id:3!null crdb_internal_mvcc_timestamp:4
 │         │    └── filters
 │         │         └── id:3 = 1
 │         └── projections
 │              └── 2 [as=id_new:5]
 └── cascade
      └── update users
           ├── columns: <none>
           ├── fetch columns: users.id:10 group_id:11 b:12
           ├── update-mapping:
           │    └── id_new:15 => group_id:7
           ├── input binding: &2
           ├── inner-join (hash)
           │    ├── columns: users.id:10!null group_id:11!null b:12 id:14!null id_new:15!null
           │    ├── scan users
           │    │    └── columns: users.id:10!null group_id:11 b:12
           │    ├── select
           │    │    ├── columns: id:14!null id_new:15!null
           │    │    ├── with-scan &1
           │    │    │    ├── columns: id:14!null id_new:15!null
           │    │    │    └── mapping:
           │    │    │         ├──  groups.id:3 => id:14
           │    │    │         └──  id_new:5 => id_new:15
           │    │    └── filters
           │    │         └── id:14 IS DISTINCT FROM id_new:15
           │    └── filters
           │         └── group_id:11 = id:14
           └── f-k-checks
                └── f-k-checks-item: users(group_id) -> groups(id)
                     └── anti-join (hash)
                          ├── columns: id_new:16!null
                          ├── with-scan &2
                          │    ├── columns: id_new:16!null
                          │    └── mapping:
                          │         └──  id_new:15 => id_new:16
                          ├── scan groups
                          │    └── columns: groups.id:17!null
                          └── filters
                               └── id_new:16 = groups.id:17

mgartner added a commit to mgartner/cockroach that referenced this issue Nov 26, 2020
Previously, the optimizer was not synthesizing partial index DEL columns
for FK cascading updates and deletes. As a result, a cascading `UPDATE`
could corrupt a child table's partial index, ultimately resulting in
incorrect query results. A cascading `DELETE` would not corrupt
partial indexes, but unnecessary `DEL` operations would be issued on
the partial index.

The optbuilder has been refactored so that these columns are correctly
projected. Both PUT and DEL columns are now projected in the same
function, `mutationBuilder.projectPartialIndexCols`. This function is
called from principal functions in the optbuilder where CHECK constraint
columns are also projected, like `mutationBuilder.buildUpdate`. In
theory this should make it harder in the future to omit these necessary
projections.

Additionally, the execution engine was unable to handle extraneous
columns that can be added as input to FK cascading updates. These
extraneous columns would be incorrectly interpreted as synthesized
partial index columns. This commit works around this issue by slicing
the source values with an upper bound in `upateNode.processSourceRow`.
The longer term fix is to not produce these columns (see issue cockroachdb#57097).

Fixes cockroachdb#57085
Fixes cockroachdb#57084

Release justification: This is a critical bug fix to a new feature,
partial indexes.

Release note (bug fix): A bug has been fixed that caused errors or
corrupted partial indexes of child tables in foreign key relationships
with cascading `UPDATE`s and `DELETE`s. The corrupt partial indexes
could result in incorrect query results. Any partial indexes on child
tables of foreign key relationships with `ON DELETE CASCADE` or `ON
UPDATE CASCADE` actions may be corrupt and should be dropped and
re-created. This bug was introduce in version 20.2.
mgartner added a commit to mgartner/cockroach that referenced this issue Dec 1, 2020
Previously, the optimizer was not synthesizing partial index DEL columns
for FK cascading updates and deletes. As a result, a cascading `UPDATE`
could corrupt a child table's partial index, ultimately resulting in
incorrect query results. A cascading `DELETE` would not corrupt
partial indexes, but unnecessary `DEL` operations would be issued on
the partial index.

The optbuilder has been refactored so that these columns are correctly
projected. Both PUT and DEL columns are now projected in the same
function, `mutationBuilder.projectPartialIndexCols`. This function is
called from principal functions in the optbuilder where CHECK constraint
columns are also projected, like `mutationBuilder.buildUpdate`. In
theory this should make it harder in the future to omit these necessary
projections.

Additionally, the execution engine was unable to handle extraneous
columns that can be added as input to FK cascading updates. These
extraneous columns would be incorrectly interpreted as synthesized
partial index columns. This commit works around this issue by slicing
the source values with an upper bound in `upateNode.processSourceRow`.
The longer term fix is to not produce these columns (see issue cockroachdb#57097).

Fixes cockroachdb#57085
Fixes cockroachdb#57084

Release justification: This is a critical bug fix to a new feature,
partial indexes.

Release note (bug fix): A bug has been fixed that caused errors or
corrupted partial indexes of child tables in foreign key relationships
with cascading `UPDATE`s and `DELETE`s. The corrupt partial indexes
could result in incorrect query results. Any partial indexes on child
tables of foreign key relationships with `ON DELETE CASCADE` or `ON
UPDATE CASCADE` actions may be corrupt and should be dropped and
re-created. This bug was introduce in version 20.2.
mgartner added a commit to mgartner/cockroach that referenced this issue Dec 1, 2020
Previously, the optimizer was not synthesizing partial index DEL columns
for FK cascading updates and deletes. As a result, a cascading `UPDATE`
could corrupt a child table's partial index, ultimately resulting in
incorrect query results. A cascading `DELETE` would not corrupt
partial indexes, but unnecessary `DEL` operations would be issued on
the partial index.

The optbuilder has been refactored so that these columns are correctly
projected. There are now three functions for projecting PUT columns, DEL
columns, and both PUT and DEL columns, each ensuring that the input
scopes are non-nil. These three functions are called from principal
functions in the optbuilder where CHECK constraint columns are also
projected, like `mutationBuilder.buildUpdate`. In theory this should
make it harder in the future to omit these necessary projections.

Additionally, the execution engine was unable to handle extraneous
columns that can be added as input to FK cascading updates. These
extraneous columns would be incorrectly interpreted as synthesized
partial index columns. This commit works around this issue by slicing
the source values with an upper bound in `updateNode.processSourceRow`.
The longer term fix is to not produce these columns (see issue cockroachdb#57097).

Fixes cockroachdb#57085
Fixes cockroachdb#57084

Release justification: This is a critical bug fix to a new feature,
partial indexes.

Release note (bug fix): A bug has been fixed that caused errors or
corrupted partial indexes of child tables in foreign key relationships
with cascading `UPDATE`s and `DELETE`s. The corrupt partial indexes
could result in incorrect query results. Any partial indexes on child
tables of foreign key relationships with `ON DELETE CASCADE` or `ON
UPDATE CASCADE` actions may be corrupt and should be dropped and
re-created. This bug was introduce in version 20.2.
mgartner added a commit to mgartner/cockroach that referenced this issue Dec 1, 2020
Previously, the optimizer was not synthesizing partial index DEL columns
for FK cascading updates and deletes. As a result, a cascading `UPDATE`
could corrupt a child table's partial index, ultimately resulting in
incorrect query results. A cascading `DELETE` would not corrupt
partial indexes, but unnecessary `DEL` operations would be issued on
the partial index.

The optbuilder has been refactored so that these columns are correctly
projected. There are now three functions for projecting PUT columns, DEL
columns, and both PUT and DEL columns, each ensuring that the input
scopes are non-nil. These three functions are called from principal
functions in the optbuilder where CHECK constraint columns are also
projected, like `mutationBuilder.buildUpdate`. In theory this should
make it harder in the future to omit these necessary projections.

Additionally, the execution engine was unable to handle extraneous
columns that can be added as input to FK cascading updates. These
extraneous columns would be incorrectly interpreted as synthesized
partial index columns. This commit works around this issue by slicing
the source values with an upper bound in `updateNode.processSourceRow`.
The longer term fix is to not produce these columns (see issue cockroachdb#57097).

Fixes cockroachdb#57085
Fixes cockroachdb#57084

Release justification: This is a critical bug fix to a new feature,
partial indexes.

Release note (bug fix): A bug has been fixed that caused errors or
corrupted partial indexes of child tables in foreign key relationships
with cascading `UPDATE`s and `DELETE`s. The corrupt partial indexes
could result in incorrect query results. Any partial indexes on child
tables of foreign key relationships with `ON DELETE CASCADE` or `ON
UPDATE CASCADE` actions may be corrupt and should be dropped and
re-created. This bug was introduce in version 20.2.
craig bot pushed a commit that referenced this issue Dec 1, 2020
57100: opt: fix FK cascades to child tables with partial indexes r=mgartner a=mgartner

Previously, the optimizer was not synthesizing partial index DEL columns
for FK cascading updates and deletes. As a result, a cascading `UPDATE`
could corrupt a child table's partial index, ultimately resulting in
incorrect query results. A cascading `DELETE` would not corrupt
partial indexes, but unnecessary `DEL` operations would be issued on
the partial index.

The optbuilder has been refactored so that these columns are correctly
projected. There are now three functions for projecting PUT columns, DEL
columns, and both PUT and DEL columns, each ensuring that the input
scopes are non-nil. These three functions are called from principal
functions in the optbuilder where CHECK constraint columns are also
projected, like `mutationBuilder.buildUpdate`. In theory this should
make it harder in the future to omit these necessary projections.

Additionally, the execution engine was unable to handle extraneous
columns that can be added as input to FK cascading updates. These
extraneous columns would be incorrectly interpreted as synthesized
partial index columns. This commit works around this issue by slicing
the source values with an upper bound in `updateNode.processSourceRow`.
The longer term fix is to not produce these columns (see issue #57097).

Fixes #57085
Fixes #57084

Release justification: This is a critical bug fix to a new feature,
partial indexes.

Release note (bug fix): A bug has been fixed that caused errors or
corrupted partial indexes of child tables in foreign key relationships
with cascading `UPDATE`s and `DELETE`s. The corrupt partial indexes
could result in incorrect query results. Any partial indexes on child
tables of foreign key relationships with `ON DELETE CASCADE` or `ON
UPDATE CASCADE` actions may be corrupt and should be dropped and
re-created. This bug was introduce in version 20.2.


57323: authors: add Paul Kernfeld to authors r=kernfeld-cockroach a=kernfeld-cockroach

Release note: None

57324: authors: add ricky to authors r=rickystewart a=rickystewart

Release note: None

Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Paul Kernfeld <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
@mgartner mgartner moved this to Backlog (DO NOT ADD NEW ISSUES) in SQL Queries Jul 24, 2023
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@mgartner mgartner moved this from Backlog (DO NOT ADD NEW ISSUES) to New Backlog in SQL Queries Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team
Projects
Status: Backlog
Development

No branches or pull requests

3 participants