Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: use only required columns in provided ordering for project #86193

Merged
merged 1 commit into from
Aug 18, 2022

Conversation

DrewKimball
Copy link
Collaborator

@DrewKimball DrewKimball commented Aug 16, 2022

Project operators can only pass through their input ordering.
However, the provided ordering may have to be remapped in order to
ensure it only refers to output columns, since the Project can add
and remove columns. The Project uses its internalFDs field to
accomplish the remapping; these are constructed when the Project
is added to the memo by combining the functional dependencies of the
input and the projections.

The problem occurs when transformation rules cause the input of the
Project to "reveal" additional functional dependencies. For example,
one side of a union may be eliminated and the FDs of the remaining side
used in the result. This can cause the Project to output an ordering
that is equivalent to the required ordering according to its own FDs,
but which a parent operator cannot tell is equivalent because its FDs
were calculated before the tranformation rule fired. This can cause
panics later down the line when the provided ordering does not match
up with the required ordering.

In the following example, an exploration rule transforms the join into
two joins unioned together, one over each disjunct. After the
transformation, a normalization rule fires that removes the
t0.rowid IS NULL side because rowids are non-null. This reveals the
t1.rowid = t0.rowid FD, which later causes t0.rowid to be used in
a provided ordering rather than t1.rowid. For the reasons mentioned
above, this later causes a panic when a Project attempts to remap to
the required t1.rowid ordering.

CREATE TABLE t0 (c0 INT);
CREATE TABLE t1 (c0 INT);

SELECT * FROM t0 CROSS JOIN t1
WHERE (t0.rowid IS NULL) OR (t1.rowid IN (t0.rowid))
ORDER BY t1.rowid;

This commit prevents the panic by making Project operators remap the
input provided ordering to use columns from the required ordering
(which are a subset of the output columns). This prevents the disparity
between required and provided orderings that can cause panics down the
line. In the example given above, the t1.rowid column would be chosen
for the provided ordering because it is in the required ordering.

Fixes #85393

Release note (bug fix): fixed a vulnerability in the optimizer that could
cause a panic in rare cases when planning complex queries with ORDER BY.

Release justification: fixes release blocker

@DrewKimball DrewKimball requested a review from a team as a code owner August 16, 2022 00:39
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice fix! Thank you for the thorough explanation!

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)

@michae2
Copy link
Collaborator

michae2 commented Aug 17, 2022

(I vote we backport this to 22.1 and 21.2 as well.)

@DrewKimball DrewKimball requested a review from rytaft August 18, 2022 08:13
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice!

I agree that we should backport this -- can you add the backport labels, @DrewKimball? Thanks!

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @DrewKimball)


-- commits line 48 at r1:
Should this have a "bug fix" release note?

Project operators can only pass through their input ordering.
However, the provided ordering may have to be remapped in order to
ensure it only refers to output columns, since the `Project` can add
and remove columns. The `Project` uses its `internalFDs` field to
accomplish the remapping; these are constructed when the `Project`
is added to the memo by combining the functional dependencies of the
input and the projections.

The problem occurs when transformation rules cause the input of the
`Project` to "reveal" additional functional dependencies. For example,
one side of a union may be eliminated and the FDs of the remaining side
used in the result. This can cause the `Project` to output an ordering
that is equivalent to the required ordering according to its own FDs,
but which a parent operator cannot tell is equivalent because its FDs
were calculated before the tranformation rule fired. This can cause
panics later down the line when the provided ordering does not match
up with the required ordering.

In the following example, an exploration rule transforms the join into
two joins unioned together, one over each disjunct. After the
transformation, a normalization rule fires that removes the
`t0.rowid IS NULL` side because rowids are non-null. This reveals the
`t1.rowid = t0.rowid` FD, which later causes `t0.rowid` to be used in
a provided ordering rather than `t1.rowid`. For the reasons mentioned
above, this later causes a panic when a `Project` attempts to remap to
the required `t1.rowid` ordering.
```
CREATE TABLE t0 (c0 INT);
CREATE TABLE t1 (c0 INT);

SELECT * FROM t0 CROSS JOIN t1
WHERE (t0.rowid IS NULL) OR (t1.rowid IN (t0.rowid))
ORDER BY t1.rowid;
```

This commit prevents the panic by making `Project` operators remap the
input provided ordering to use columns from the required ordering
(which are a subset of the output columns). This prevents the disparity
between required and provided orderings that can cause panics down the
line. In the example given above, the `t1.rowid` column would be chosen
for the provided ordering because it is in the required ordering.

Fixes cockroachdb#85393

Release note (bug fix): fixed a vulnerability in the optimizer that could
cause a panic in rare cases when planning complex queries with `ORDER BY`.

Release justification: fixes a release-blocker
Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @rytaft)


-- commits line 48 at r1:

Previously, rytaft (Rebecca Taft) wrote…

Should this have a "bug fix" release note?

Done.

@DrewKimball
Copy link
Collaborator Author

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Aug 18, 2022

Build succeeded:

@craig craig bot merged commit 6e7e8e9 into cockroachdb:master Aug 18, 2022
@blathers-crl
Copy link

blathers-crl bot commented Aug 18, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 73f41a6 to blathers/backport-release-21.2-86193: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2.x failed. See errors above.


error creating merge commit from 73f41a6 to blathers/backport-release-22.1-86193: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

DrewKimball added a commit to DrewKimball/cockroach that referenced this pull request Sep 22, 2022
This commit fixes a bug that was recently introduced in cockroachdb#86193 that can
cause an internal panic when remapping an ordering from the input to the
output of a `Project` operator. cockroachdb#86193 modified the logic to ensure that
the remapping only refers to columns from the required ordering; however,
it failed to add the optional columns to the allowed set of columns. This
could cause a panic if the provided ordering contains columns from the
optional set. This is fixed by adding the optional columns when remapping.

Fixes cockroachdb#87806

Release note (bug fix): Fixes a bug introduced in 22.1.7 that could cause
an internal panic when a query ordering contained redundant ordering
columns.
DrewKimball added a commit to DrewKimball/cockroach that referenced this pull request Sep 22, 2022
This commit fixes a bug that was recently introduced in cockroachdb#86193 that can
cause an internal panic when remapping an ordering from the input to the
output of a `Project` operator. cockroachdb#86193 modified the logic to ensure that
the remapping only refers to columns from the required ordering; however,
it failed to add the optional columns to the allowed set of columns. This
could cause a panic if the provided ordering contains columns from the
optional set. This is fixed by adding the optional columns when remapping.

Fixes cockroachdb#87806

Release note: None (not in a release)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: internal error: no output column equivalent to 2
4 participants