Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: prune RHS of a semi/anti join #39472

Merged
merged 1 commit into from
Aug 16, 2019

Conversation

ridwanmsharif
Copy link

@ridwanmsharif ridwanmsharif commented Aug 8, 2019

Fixes #38704.

This change adds a PruneSemiAntiJoinRightCols to prune
columns in the RHS of a semi/anti join. Alternatively,
we could just tag the PruneJoinRightCols rule as higher
priority to achieve the same effect (previously that rule
was never triggered because the EliminateProjects rule
would fire and remove the projections after
PruneJoinLeftCols rule is applied). I prefer this rule
because it avoids requiring an ordering of transformations.

Release note: None

@ridwanmsharif ridwanmsharif requested a review from justinj August 8, 2019 21:10
@ridwanmsharif ridwanmsharif requested a review from a team as a code owner August 8, 2019 21:10
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:LGTM:

[nit] in the commit message, avoids require -> avoids requiring

Reviewed 13 of 13 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @justinj)

Copy link
Contributor

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Do you know if there's any impact on TPCH Q21 performance?

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @justinj and @ridwanmsharif)


pkg/sql/opt/norm/rules/prune_cols.opt, line 255 at r1 (raw file):

    $on:*
    $private:* &
    (CanPruneCols

NIT: Indent (CanPruneCols another level like other rules (or maybe there's tab vs. space issue?).

Fixes cockroachdb#38704.

This change adds a PruneSemiAntiJoinRightCols to prune
columns in the RHS of a semi/anti join. Alternatively,
we could just tag the PruneJoinRightCols rule as higher
priority to achieve the same effect (previously that rule
was never triggered because the `EliminateProjects` rule
would fire and remove the projections after
`PruneJoinLeftCols` rule is applied). I prefer this rule
because it avoids requiring an ordering of transformations.

Release note: None
@ridwanmsharif ridwanmsharif force-pushed the prune-semi-anti-join-rhs branch from 8bc422a to 76411a6 Compare August 15, 2019 16:49
Copy link
Author

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting indeed. From what I'm seeing over multiple runs the average time for TPCH Q21 increases from ~27.16s to ~28.88s.

The only change I see in the Q21 plan as a result of this change (apart form some columns being pruned away) is that we're using the l_sk index when scanning the lineitem table (because after pruning all the columns returned are available in the index). This in turn, has the SemiJoin use the hash join instead of the merge join because the scan of lineitem doesn't order by orderkey. This sounds like an issue with costing merge joins and hash joins correctly (scanning using the pk and doing a merge join vs scanning using an index and doing a hash join) and so I don't think this blocks this PR. I'll merge if the team agrees.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @justinj and @rytaft)


pkg/sql/opt/norm/rules/prune_cols.opt, line 255 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

NIT: Indent (CanPruneCols another level like other rules (or maybe there's tab vs. space issue?).

Done.

@ridwanmsharif
Copy link
Author

bors r+

craig bot pushed a commit that referenced this pull request Aug 16, 2019
39464: exec: handle some mixed type comparison expressions r=rafiss a=rafiss

This adds support for int-int comparisons and float-float comparisons
(of different sizes), as well as various string comparisons. This
is necessary to support TPC-H queries 7, 16, 19, and 21 (though there
are still other issues to address before those are fully supported).

There is some refactoring here so that as a next step we can support
comparisons between different types entirely (e.g. int-decimal).

touches #39189

Release note: None

39472: opt: prune RHS of a semi/anti join r=ridwanmsharif a=ridwanmsharif

Fixes #38704.

This change adds a PruneSemiAntiJoinRightCols to prune
columns in the RHS of a semi/anti join. Alternatively,
we could just tag the PruneJoinRightCols rule as higher
priority to achieve the same effect (previously that rule
was never triggered because the `EliminateProjects` rule
would fire and remove the projections after
`PruneJoinLeftCols` rule is applied). I prefer this rule
because it avoids requiring an ordering of transformations.

Release note: None

Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Ridwan Sharif <[email protected]>
@craig
Copy link
Contributor

craig bot commented Aug 16, 2019

Build succeeded

@craig craig bot merged commit 76411a6 into cockroachdb:master Aug 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: Add pruning rule for columns on the right side of Semi/Anti joins
4 participants