-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EPIC] Stop copying LogicalPlan
during OptimizerPasses
#9637
Comments
I have two idea before deep diving to the code
enum SimpliedExpr {
Simplified(Expr)
Original
}
fn simply_expr(&self, expr: &Expr) -> SimpliedExpr if expr is Original, we can avoid clone. |
I think 2 sounds interesting Another think I was thinking was something like I think that along with Maybe we can prototype with impl OptimizerRule {
...
/// does this rule support rewriting owned plans?
fn supports_owned(&self) -> bool { return false }
/// if supports_owned returns true, calls try_optimize_owned
fn try_optimize_owned(
&self,
plan: LogicalPlan,
config: &dyn OptimizerConfig
) -> Result<Transformed<LogicalPlan>, DataFusionError> {
not_implemented_err!("try_optimized_owned is not implemented for this rule")
}
... And then play around with the code that calls If we could show a significant performance improvement for the |
@alamb I think being able to skip failed rules is the main challenge to rewriting the plan with ownership We need to restore the original plan if the rewrite fails at some point, however, the old plan is consumed and we lose ownership. Giving up ownership too early is not a good idea if we need to restore the original plan. Instead of Transformed, I think we need to preserve the old data and fail state to mimic Err in Result but with old data #[derive(Debug, PartialEq, Eq)]
pub enum OptimizedState {
Yes,
No,
Fail,
}
#[derive(Debug)]
pub struct Optimized<T, E = DataFusionError> {
pub optimzied_data: Option<T>,
// Used to store the original data if optimized successfully
pub original_data: T,
pub optimized_state: OptimizedState,
// Used to store the error if optimized failed, so we can early return but preserve the original data
pub error: Option<E>,
}
impl<T, E> Optimized<T, E> {
pub fn yes(optimzied_data: T, original_data: T) -> Self {
Self {
optimzied_data: Some(optimzied_data),
original_data,
optimized_state: OptimizedState::Yes,
error: None,
}
}
pub fn no(original_data: T) -> Self {
Self {
optimzied_data: None,
original_data,
optimized_state: OptimizedState::No,
error: None,
}
}
pub fn fail(original_data: T, e: E) -> Self {
Self {
optimzied_data: None,
original_data,
optimized_state: OptimizedState::Fail,
error: Some(e),
}
}
} Then, for every optimization, we return |
This is an excellent point
If we need to preserve the old data, there is no choice but to copy the Since Could we maybe only do the copy when if skip_failed_rules {
let original_plan = plan.clone();
let new_plan = optimizer.try_optimize_owned(plan)
.ok_or_else(|e| original_plan);
} else {
optimizer.try_optimize_owned(plan)?
} |
I think it is possible to preserve unnecessary clones even if we preserve the old plan, and only clone the data for the new expr and new plan, but I agree it is a little complicated, it needs the enum I show above. We can optimize for the default path first, and others later on. |
Let's give it a try! |
note: I profile my change and find out the time does not improve at all, I find that the cloned inside |
I analyze the so I think #9595 might be an important step for the optimization here. |
I played around with some ideas last night and it is looking promising (though not yet done). I put my draft here #9708. I hope to try and work on it more over the next few days, but I am pretty busy preparing for meetups / presentations / papers for next week. It may be a few more days |
BTW here is a video of how I profile DataFusion: #9577 (comment) |
I'm curious, It seems like most of the reasoning behind Arc<LogicalPlan> and |
I think area allocating is pretty common in various compiler/query optimizer systems because you know the lifetime of a plan doesn't outlive the the optimizer pass (as in you can bound the intermediate results) I think rust actually is ideally setup to avoid having to arena allocate (which still requires copying stuff to a new plan) (using ownership) -- we just need to regigger how DataFusion is setup to use this. I think we are close |
There is one issue that we are not able to call |
What I did in #9708 which seems to have worked pretty well is to copy the LogicalPlan only if Like this let prev_plan = if options.optimizer.skip_failed_rules {
Some(new_plan.clone())
} else {
None
}; |
Upd: I see the latest commit that uses mem::swap for updating Arc<T> in `rewrite_arc`. 👍
I think when you rewrite inputs, you still need to call Arc::into_inner to
get T: LogicalPlan without clone. And, it is where things are getting
tricky.
But I played around without mut before, maybe there are ways to solve this
with mut.
…On Thu, Mar 21, 2024, 6:04 PM Andrew Lamb ***@***.***> wrote:
There is one issue that we are not able to call Arc::into_inner for
skip-failed-rules path since we hold a clone for restoration.
What I did in #9708 <#9708>
which seems to have worked pretty well is to copy the LogicalPlan *only*
if skip_failed_rules is set
Like this
<https://github.com/apache/arrow-datafusion/pull/9708/files#diff-42d66905c3fa6b245c3493fb4df4816e0fde3036941315ab42198fb16f3907dfR319>
let prev_plan = if options.optimizer.skip_failed_rules {
Some(new_plan.clone())
} else {
None
};
—
Reply to this email directly, view it on GitHub
<#9637 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADZCLR2E6LQDXLPP7PW2ZJ3YZKWBJAVCNFSM6AAAAABEZIFWPOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMJRHAYDKNBUGM>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Yes, this is indeed the case. The code in #9708 is pretty terrible / hacky to deal with the |
With some somewhat hacky code in #9708 (comment) I saw a 25% performance improvement. Given the results I saw in #9708 here is my proposal:
I am not sure which of these APIs would be better (rewrite in place via impl OptimizerRule {
...
/// does this rule support rewriting plans rather than copying them?
fn supports_mut (&self) -> bool { return false }
/// if supports_mut, returns true, rewrites `plan` in place
fn optimize_mut(
&self,
plan: &mut LogicalPlan,
config: &dyn OptimizerConfig
) -> Result<Transformed<()>, DataFusionError> {
not_implemented_err!("try_optimized_owned is not implemented for this rule")
}
... impl OptimizerRule {
...
/// does this rule support rewriting owned plans?
fn supports_owned(&self) -> bool { return false }
/// if supports_owned returns true, rewrites the LogicalPlan in returning a newly rewritten plan
fn try_optimize_owned(
&self,
plan: LogicalPlan,
config: &dyn OptimizerConfig
) -> Result<Transformed<LogicalPlan>, DataFusionError> {
not_implemented_err!("try_optimized_owned is not implemented for this rule")
}
... |
Upd: I realize we need to do the std::mem::take for Box too. 😢
I think we can take ownership by default, and mutable inplace if better? For example, change to muable for rewriting new expressions and new plans. |
I was exploring keeping the
SO I was thinking we could start with the hacky solution that kept |
I filed tickets for the other optimizer passes and linked them to the description of this ticket |
Update here is that this issue is done other other than
|
Is your feature request related to a problem or challenge?
Broken out from #9577 where @mustafasrepo @comphead and @jayzhan211 and I were discussing optimizer performance
TLDR is that the datafusion optimizer is slow. When I did some profiling locally by running the following
My analysis is that almost 40% of the planning time is spent in SimplifyExprs and CommonSubexprEliminate and most of that time is related to copying expressions from what I can tell
While those passes themselves internally make a bunch of clones, which we are improving (e.g. @jayzhan211 on #9628) I think there is a more fundamental structural problem
I think a core challenge is that the OptimizerRule trait pretty much requires copying
Exprs
on each pass, as it gets a&LogicalPlan
input, but produces aLogicalPlan
outputThis mean any pass that works on
Exprs
mustclone
allExprs
(by callingLogicalPlan::expressions()
) rewrite them, and then then create a newLogicalPlan
with those new Exprs.Here is that pattern in the expression simplifier:
https://github.com/apache/arrow-datafusion/blob/0eec5f8e1d0f55e48f5cdc628fbb5ddd89b91512/datafusion/optimizer/src/simplify_expressions/simplify_exprs.rs#L112-L123
Describe the solution you'd like
Find some way to avoid clone'ing exprs during LogicalPlan rewrite
Update: here are the tasks:
Infrastructure Preparation
Optimizer
to use owned plans andTreeNode
API (10% faster planning) #9948LogicalPlan::clone()
inLogicalPlan::map_children
when possible #9999Update
OptimizerRule
s to avoid copyingSimplifyExpressions
: IntroduceOptimizerRule::rewrite
to rewrite in place, rewriteExprSimplifier
(20% faster planning) #9954CommonSubexprEliminate
: Stop copyingExpr
s and LogicalPlans so much during Common Subexpression Elimination #9873DecorrelatePredicateSubquery
: Stop copying LogicalPlan and Exprs inDecorrelatePredicateSubquery
#10289EliminateCrossJoin
: Stop copying LogicalPlan and Exprs inEliminateCrossJoin
#10287EliminateDuplicatedExpr
: refactorEliminateDuplicatedExpr
optimizer pass to avoid clone #10218EliminateFilter
: Stop copying LogicalPlan and Exprs inEliminateFilter
#10288EliminateJoin
: Implement rewrite for EliminateOneUnion and EliminateJoin #10184EliminateLimit
: Stop copying LogicalPlan and Exprs inEliminateLimit
#10212EliminateNestedUnion
: Stop copying LogicalPlan and Exprs inEliminateNestedUnion
#10296EliminateOneUnion
: Implement rewrite for EliminateOneUnion and EliminateJoin #10184EliminateOuterJoin
: RefactorEliminateOuterJoin
to implementOptimizerRule::rewrite()
#10081ExtractEquijoinPredicate
: implement rewrite for ExtractEquijoinPredicate and avoid clone in filter #10165FilterNullJoinKeys
implement rewrite for FilterNullJoinKeys #10166OptimizeProjections
: Stop copying LogicalPlan and Exprs inOptimizeProjections
#10209PropagateEmptyRelation
: Stop copying LogicalPlan and Exprs inPropagateEmptyRelation
#10290PushDownFilter
: Stop copying LogicalPlan and Exprs inPushDownFilter
#10291PushDownLimit
: Stop copying LogicalPlan and Exprs inPushDownLimit
#10292ReplaceDistinctWithAggregate
: Stop copying LogicalPlan and Exprs inReplaceDistinctWithAggregate
#10293RewriteDisjunctivePredicate
: Stop copying LogicalPlan and Exprs inRewriteDisjunctivePredicate
#10213ScalarSubqueryToJoin
: Stop copying LogicalPlan and Exprs inScalarSubqueryToJoin
#10294SingleDistinctToGroupBy
: Stop copying LogicalPlan and Exprs inSingleDistinctToGroupBy
#10295UnwrapCastInComparison
RefactorUnwrapCastInComparison
to implementOptimizerRule::rewrite()
#10087 / RefactorUnwrapCastInComparison
to removeExpr
clones #10115Update
AnalyzerRule
s to avoid copyingAnalyzerMisc
: Minor: Avoid copying all expressions inAnalzyer
/check_plan
#9974InlineTableScan
: Avoid copies inInlineTableScan
via TreeNode API #10038ApplyFunctionRewrites
: fixNamedStructField should be rewritten in OperatorToFunction
in subquery regression (changeApplyFunctionRewrites
to use TreeNode API #10032TypeCoercion
: Stop copying LogicalPlan and Exprs inTypeCoercion
#10210TypeCoercion
more: Onyl recompute schema inTypeCoercion
when necessary #10365CountWildcardRule
: (needs a little reworking to avoid clones) Avoid copies inCountWildcardRule
via TreeNode API #10066Update Other to avoid copying
LogicalPlan::with_param_values
#10016Describe alternatives you've considered
No response
Additional context
We have talked about various other ways to reduce copying of LogicalPlans as well as its challenges in other tickets:
Box
es withArc
in theExpr
enum
. #9577The text was updated successfully, but these errors were encountered: