Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt,ccl: eliminate Project inside GroupBy, add tests for validation of partial unique indexes #67263

Merged
merged 3 commits into from
Jul 6, 2021

Conversation

rytaft
Copy link
Collaborator

@rytaft rytaft commented Jul 6, 2021

opt: add a test-only check for NullsAreDistinct in GroupingPrivate

This commit adds a check that NullsAreDistinct is true iff the grouping
operator is UpsertDistinctOn or EnsureUpsertDistinctOn.

Release note: None

opt: add exploration rule to eliminate Project inside GroupBy

This commit updates the exploration rule EliminateIndexJoinInsideGroupBy
and renames it to EliminateIndexJoinOrProjectInsideGroupBy. The rule now
removes either an IndexJoin or Project operator if it can be proven that
the removal does not affect the output of the parent grouping operator.

Removal of a Project is needed in cases where the partial index predicate
constrains some columns to be constant, and therefore provides those
columns as constant projections. If the projected columns are not actually
needed by the GroupBy, however, the Project is not necessary and interferes
with other rules matching, such as SplitGroupByScanIntoUnionScans.

Informs #65473

Release note (performance improvement): Improved the efficiency of validation
for some partial unique indexes in REGIONAL BY ROW tables by improving the
query plan to use all streaming operations.

ccl: add tests for validation of partial unique indexes

This commit adds tests to ensure that the expected query plan is used
to validate new partial unique indexes in REGIONAL BY ROW tables.

Informs #65473

Release note: None

@rytaft rytaft requested review from mgartner, cucaroach and a team July 6, 2021 12:38
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rytaft
Copy link
Collaborator Author

rytaft commented Jul 6, 2021

Hold off on reviewing this -- I need to make a change.

This commit adds a check that NullsAreDistinct is true iff the grouping
operator is UpsertDistinctOn or EnsureUpsertDistinctOn.

Release note: None
@rytaft rytaft force-pushed the fix-validation branch 2 times, most recently from ec2b87b to fece174 Compare July 6, 2021 14:58
@rytaft rytaft changed the title ccl: add tests for validation of partial unique indexes opt,ccl: eliminate Project inside GroupBy, add tests for validation of partial unique indexes Jul 6, 2021
rytaft added 2 commits July 6, 2021 16:03
This commit updates the exploration rule EliminateIndexJoinInsideGroupBy
and renames it to EliminateIndexJoinOrProjectInsideGroupBy. The rule now
removes either an IndexJoin or Project operator if it can be proven that
the removal does not affect the output of the parent grouping operator.

Removal of a Project is needed in cases where the partial index predicate
constrains some columns to be constant, and therefore provides those
columns as constant projections. If the projected columns are not actually
needed by the GroupBy, however, the Project is not necessary and interferes
with other rules matching, such as SplitGroupByScanIntoUnionScans.

Informs cockroachdb#65473

Release note (performance improvement): Improved the efficiency of validation
for some partial unique indexes in REGIONAL BY ROW tables by improving the
query plan to use all streaming operations.
This commit adds tests to ensure that the expected query plan is used
to validate new partial unique indexes in REGIONAL BY ROW tables.

Release note: None
@rytaft
Copy link
Collaborator Author

rytaft commented Jul 6, 2021

Ok, RFAL

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! :lgtm:

Reviewed 2 of 2 files at r2, 6 of 6 files at r3, 1 of 1 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach and @rytaft)


pkg/ccl/logictestccl/testdata/logic_test/regional_by_row, line 1560 at r4 (raw file):

                    │ estimated row count: 10
                    │
                    ├── • union all

I think it would be helpful for users if EXPLAIN clearly showed that these union-all's are locality optimized. Definitely not necessary for this PR, but what do you think?

Copy link
Collaborator Author

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r+

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach and @mgartner)


pkg/ccl/logictestccl/testdata/logic_test/regional_by_row, line 1560 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I think it would be helpful for users if EXPLAIN clearly showed that these union-all's are locality optimized. Definitely not necessary for this PR, but what do you think?

These UNION ALLs aren't actually locality optimized -- they are streaming due to the ordering on +b. #66211 is already open to improve EXPLAIN for locality optimized UNION ALLs, but that wouldn't apply here (locality optimized UNION ALL must have a limit).

We could make it clearer that this is a streaming op, I guess -- is that what you had in mind? (In that case, we'd also want to make that change for group by...)

@craig
Copy link
Contributor

craig bot commented Jul 6, 2021

Build succeeded:

@craig craig bot merged commit 4c39df8 into cockroachdb:master Jul 6, 2021
@mgartner
Copy link
Collaborator

mgartner commented Jul 6, 2021


pkg/ccl/logictestccl/testdata/logic_test/regional_by_row, line 1560 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

These UNION ALLs aren't actually locality optimized -- they are streaming due to the ordering on +b. #66211 is already open to improve EXPLAIN for locality optimized UNION ALLs, but that wouldn't apply here (locality optimized UNION ALL must have a limit).

We could make it clearer that this is a streaming op, I guess -- is that what you had in mind? (In that case, we'd also want to make that change for group by...)

Heh, ya I was confused, sorry. Thanks for the link to #66211. But yes, I think our EXPLAIN and EXPLAIN (opt) are too cryptic for streaming set operations. IIRC segmented sorts are also cryptic. Maybe we should expand #66211 to include more EXPLAIN improvements beyond LOS.

cc'ing @vy-ton to help prioritize this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants