-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Common Subexpression Elimination for PhysicalExpr
trees
#13046
base: main
Are you sure you want to change the base?
Conversation
ec0d407
to
9361ee6
Compare
cc @alamb, @andygrove |
PsysicalExpr
treesPhysicalExpr
trees
Thanks @peter-toth. I plan on testing this out with Comet. |
// The EliminateCommonPhysicalSubExprs rule extracts common physical | ||
// subexpression trees into a `ProjectionExec` node under the actual node to | ||
// calculate the common values only once. | ||
Arc::new(EliminateCommonPhysicalSubexprs::new()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any examples of where this rule does eliminations for a plan that has been optimized by the corresponding logical rule?
Because if not it not clear to me why it should be part of the default/recommended list of rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the usecase is for systems (like Comet) that don't use the LogicalPlanner.
That being said, I think the question of "should it be run by default" is a good one -- and I agree with your assertion if we can't find an example where this helps a plan we should probably not enable it by default.
FYI @andygrove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this rule is useful only for Comet and similar usecases. Let me remove it from the default optimizer: d2529ce
de98385
to
d2529ce
Compare
type CommonNodes<'n, N> = IndexMap<Identifier<'n, N>, (N, String)>; | ||
/// A list that contains the common [`TreeNode`]s and their alias, extracted during the | ||
/// second, rewriting traversal. | ||
type CommonNodes<'n, N> = Vec<(N, String)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for this change of type from IndexMap
to Vec
in CommonNodes
is that physical columns works with indexes rather than names. E.g. when we repace a common subexpression to a column during rewrite, we need both the name and the index of the common subexpression in the intermediate ProjectionExec
node. Storing the index in NodeStats
and using Vec
in CommonNodes
better fits this usecase.
984c6ee
to
63d5be4
Compare
63d5be4
to
01bf07b
Compare
Which issue does this PR close?
Part of #12599.
Rationale for this change
As described in #12599, there is a CSE rule for logical plans already, but some projects create physical plans directly that could benefit from physical CSE.
What changes are included in this PR?
This PR:
EliminateCommonPhysicalSubexprs
rule to eliminate common subtrees forArc<dyn PhysicalExpr>
trees. This initial implementation targetsProjectionExec
nodes only. Follow-up PR can add support for other nodes like aggregates and filters.DynHashNode
trait and implements it forPhysicalExpr
s.CommonSubexprEliminate
rule.Are these changes tested?
Added new UTs.
Are there any user-facing changes?
No.