-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop copying LogicalPlan and Exprs in CommonSubexprEliminate
(2-3% planning speed improvement)
#10835
Conversation
@@ -870,37 +870,7 @@ impl LogicalPlan { | |||
LogicalPlan::Filter { .. } => { | |||
assert_eq!(1, expr.len()); | |||
let predicate = expr.pop().unwrap(); | |||
|
|||
// filter predicates should not contain aliased expressions so we remove any aliases |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is extract into Filter::remove_aliases
as it is needed for CSE.
@@ -173,9 +179,8 @@ impl CommonSubexprEliminate { | |||
&mut common_exprs, | |||
)?; | |||
|
|||
let mut new_input = self | |||
.try_optimize(input, config)? | |||
.unwrap_or_else(|| input.clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole point of this PR is to remove clones -- I'll try and point out example places such as this where the plans and/or expressions were being cloned
// WindowAggr: windowExpr=[[sum(c9) ORDER BY [c3 + c4] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW]] | ||
// --WindowAggr: windowExpr=[[sum(c9) ORDER BY [c3 + c4] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW]] | ||
// where, it is referred once by each `WindowAggr` (total of 2) in the plan. | ||
let mut plan = LogicalPlan::Window(window.clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clone removed
let Window { | ||
input, window_expr, .. | ||
} = window; | ||
plan = input.as_ref().clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clone removed
.zip(arrays_list.iter()) | ||
.map(|(exprs, arrays)| { | ||
exprs | ||
.iter() | ||
.cloned() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another clone
.iter() | ||
.zip(aggr_expr.iter()) | ||
.map(|(new_expr, old_expr)| { | ||
new_expr.clone().alias_if_changed(old_expr.display_name()?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another
Update -- q1 gets significantly worse -- I'll try and profile it and figure out why |
UPDATE: I ran the wrong code. I am now rerunning... |
Ok, update here is that after some additional tracking of when plans are rewritten, this PR now improves planning performance by 2-3% so I think it is ready for review |
CommonSubexprEliminate
CommonSubexprEliminate
(2-3% planning speed improvement)
I believe this PR has conflicts with #10939 which I am now resolving Thank you @crepererum for your review |
@peter-toth would you like to review this PR as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pinging me @alamb, LGTM, only minor suggestions.
/// Decimal128(Some(69999999999999),30,15) | ||
/// AND lineitem.l_quantity < Decimal128(Some(2400),15,2) | ||
/// ``` | ||
pub fn remove_aliases(predicate: Expr) -> Result<Transformed<Expr>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it make sense to move this method into Expr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it does -- I will do so as a follow on PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
if !common_exprs.is_empty() { | ||
new_input = build_common_expr_project_plan(new_input, common_exprs)?; | ||
transformed = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we need this because if !common_exprs.is_empty()
is true then rewrite_exprs.transformed
must also be true a bit above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is a good invariant -- I did not know that. Changed to an assert in in 8dd6256
Ok(Transformed::new(expr, false, TreeNodeRecursion::Jump)) | ||
} | ||
Expr::Alias(_) => Ok(Transformed::new( | ||
expr.unalias(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we know that expr is an Expr::Alias
maybe we could just use Expr::Alias(alias) => *alias.expr
instead of calling .unalias()
, that matches the expr again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Done in 550e445
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @peter-toth
/// Decimal128(Some(69999999999999),30,15) | ||
/// AND lineitem.l_quantity < Decimal128(Some(2400),15,2) | ||
/// ``` | ||
pub fn remove_aliases(predicate: Expr) -> Result<Transformed<Expr>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it does -- I will do so as a follow on PR
|
||
if !common_exprs.is_empty() { | ||
new_input = build_common_expr_project_plan(new_input, common_exprs)?; | ||
transformed = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is a good invariant -- I did not know that. Changed to an assert in in 8dd6256
Ok(Transformed::new(expr, false, TreeNodeRecursion::Jump)) | ||
} | ||
Expr::Alias(_) => Ok(Transformed::new( | ||
expr.unalias(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Done in 550e445
Thanks again everyone for all the reviews |
…planning speed improvement) (apache#10835) * Stop copying LogicalPlan and Exprs in `CommonSubexprEliminate` * thread transformed * Update unary to report transformed correctly * Preserve through window transforms * track aggregate * Avoid re-computing Aggregate schema * Update datafusion/optimizer/src/common_subexpr_eliminate.rs * Avoid unecessary setting transform flat * Cleanup unaliasing
…planning speed improvement) (apache#10835) * Stop copying LogicalPlan and Exprs in `CommonSubexprEliminate` * thread transformed * Update unary to report transformed correctly * Preserve through window transforms * track aggregate * Avoid re-computing Aggregate schema * Update datafusion/optimizer/src/common_subexpr_eliminate.rs * Avoid unecessary setting transform flat * Cleanup unaliasing
…planning speed improvement) (apache#10835) * Stop copying LogicalPlan and Exprs in `CommonSubexprEliminate` * thread transformed * Update unary to report transformed correctly * Preserve through window transforms * track aggregate * Avoid re-computing Aggregate schema * Update datafusion/optimizer/src/common_subexpr_eliminate.rs * Avoid unecessary setting transform flat * Cleanup unaliasing
Draft: this is still a draft as the performance doesn't really change -- I need to investigate more
Which issue does this PR close?
Closes #9873
Closes #9637 -- the last one of my planned PRs to make DataFusion planning faster by not copying so much
Note that we have more plans for CSE here: #10426
Rationale for this change
For low latency use cases the overhead of planning can be substantial, especially for relatively complex plans
What changes are included in this PR?
CommonSubexprEliminate
to avoid deep copying plans using TreeNode APIAre these changes tested?
Functionally by existing CI
Performance tests: show a 2% improvement for
tpch_all
and a 3% improvement fortpcds_all
Details
Are there any user-facing changes?
Performance Improvements (see above)