-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid Expr
copies OptimizeProjection
, 12% faster planning, encapsulate indicies
#10216
Conversation
f120d80
to
9bd4636
Compare
/// See [Self::index_of_column] for a version that returns an error if the | ||
/// column is not found | ||
pub fn maybe_index_of_column(&self, col: &Column) -> Option<usize> { | ||
self.index_of_column_by_name(col.relation.as_ref(), &col.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hung me up for a while -- the original code uses flat_map
which discards the Err
if the column is not found, was not obvious to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much more clear, Thanks!
OptimizeProjection
Expr
copies, encapsulate the handling of child indicies in OptimizeProjection
2434484
to
4d4c2bd
Compare
Expr
copies, encapsulate the handling of child indicies in OptimizeProjection
Expr
copies OptimizeProjection
, 12% faster planning, encapsulate indicies
) -> Result<Option<LogicalPlan>> { | ||
// `child_required_indices` stores |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this documentation moved to RequiredIndices
@@ -339,31 +318,35 @@ fn optimize_projections( | |||
let left_len = join.left.schema().fields().len(); | |||
let (left_req_indices, right_req_indices) = | |||
split_join_requirements(left_len, indices, &join.join_type); | |||
let exprs = plan.expressions(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This call copies all expressions in the plan
@@ -123,12 +120,13 @@ fn optimize_projections( | |||
// that appear in this plan's expressions to its child. All these | |||
// operators benefit from "small" inputs, so the projection_beneficial | |||
// flag is `true`. | |||
let exprs = plan.expressions(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, this was also copying all expressions
@@ -137,13 +135,9 @@ fn optimize_projections( | |||
// that appear in this plan's expressions to its child. These operators | |||
// do not benefit from "small" inputs, so the projection_beneficial | |||
// flag is `false`. | |||
let exprs = plan.expressions(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And another copy removed
@@ -178,16 +170,12 @@ fn optimize_projections( | |||
Make sure `.necessary_children_exprs` implementation of the `UserDefinedLogicalNode` is \ | |||
consistent with actual children length for the node."); | |||
} | |||
// Expressions used by node. | |||
let exprs = plan.expressions(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another copy
@@ -408,26 +392,6 @@ fn optimize_projections( | |||
} | |||
} | |||
|
|||
/// This function applies the given function `f` to the projection indices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of those code was moved into RequiredIndices
as methods both to encapsulate the logic a bit more as well as to allow creation without copying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @alamb. This PR is LGTM!. As well as improving performance, it improves maintainability also.
Thanks for the reiew @mustafasrepo -- I am slowly working through how to refactor this pass to avoid copies. I think a few more PRs and I'll have it |
Which issue does this PR close?
Part of #10209
Rationale for this change
This is part 1 of improving the performance of
OptimizeProjection
(by less copying) in #10209.While trying to rework how
OptimizeProjection works
, I found that the populationof child indices pretty much required calling
LogicalPlan::expressions()
which results in an owed copy of all the plansExpr
sRemoving these copies themselves improves the situation (and I think may make the code easier to reasona about), and sets us up to stop copying the
LogicalPlan
nodes as a follow on PRWhat changes are included in this PR?
This PR encapsulates that logic for managing required indices into a new struct
RequiredIndicies
which avoids some Expr copies (and sets up the rewrite to avoidLogicalPlan
copies)Are these changes tested?
Covered by existing tests
Are there any user-facing changes?
no functional changes
performance benchmarks show an overall 14% improvements for tpch_all and tpcds_all, which is pretty neat
Details