Avoid Expr copies OptimizeProjection, 12% faster planning, encapsulate indicies #10216

Merged Apr 25, 2024 (1 commit)

Contributor

@alamb alamb commented Apr 24, 2024

Which issue does this PR close?

Part of #10209

Rationale for this change

This is part 1 of improving the performance of OptimizeProjection by copying less, as tracked in #10209.

While trying to rework how OptimizeProjection works, I found that populating the child indices pretty much required calling LogicalPlan::expressions(), which returns an owned copy of all of the plan's Exprs.

Removing these copies improves the situation on its own (and I think may make the code easier to reason about), and sets us up to stop copying the LogicalPlan nodes in a follow-on PR.

What changes are included in this PR?

This PR encapsulates the logic for managing required indices in a new struct, RequiredIndicies, which avoids some Expr copies (and sets up the rewrite to avoid LogicalPlan copies).
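
As a rough illustration of the idea (not the actual DataFusion code), the sketch below contrasts an accessor that clones every expression with one that only borrows them, and collects the required input indices through the borrowing path. The `Expr`, `Plan`, and `RequiredColumns` types and all of their methods are hypothetical stand-ins invented for this example.

```rust
use std::collections::BTreeSet;

/// Toy expression type (hypothetical); a real `Expr` is much larger,
/// which is why cloning whole expression trees is costly.
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

/// Toy plan node (hypothetical): its expressions plus its input column names.
struct Plan {
    exprs: Vec<Expr>,
    input_columns: Vec<String>,
}

impl Plan {
    /// Copying accessor: clones every expression tree on each call.
    fn expressions(&self) -> Vec<Expr> {
        self.exprs.clone()
    }

    /// Borrowing accessor: yields references, so nothing is cloned.
    fn expressions_ref(&self) -> impl Iterator<Item = &Expr> + '_ {
        self.exprs.iter()
    }
}

/// Hypothetical helper that owns the "which input columns are needed" logic.
#[derive(Debug, Default)]
struct RequiredColumns {
    indices: BTreeSet<usize>,
}

impl RequiredColumns {
    /// Adds the indices of all columns referenced by the plan's expressions,
    /// walking the expressions by reference.
    fn with_plan_exprs(mut self, plan: &Plan) -> Self {
        for expr in plan.expressions_ref() {
            self.add_expr(expr, &plan.input_columns);
        }
        self
    }

    /// Recursively records every column reference found in `expr`.
    /// Columns that are not part of `input` are skipped.
    fn add_expr(&mut self, expr: &Expr, input: &[String]) {
        match expr {
            Expr::Column(name) => {
                if let Some(idx) = input.iter().position(|c| c == name) {
                    self.indices.insert(idx);
                }
            }
            Expr::Literal(_) => {}
            Expr::Add(left, right) => {
                self.add_expr(left, input);
                self.add_expr(right, input);
            }
        }
    }

    /// Returns the required indices in ascending order, without duplicates.
    fn into_inner(self) -> Vec<usize> {
        self.indices.into_iter().collect()
    }
}

fn main() {
    let plan = Plan {
        exprs: vec![
            Expr::Add(
                Box::new(Expr::Column("c".to_string())),
                Box::new(Expr::Literal(1)),
            ),
            Expr::Column("a".to_string()),
        ],
        input_columns: vec!["a".to_string(), "b".to_string(), "c".to_string()],
    };

    // The copying accessor still works, but allocates a full clone.
    println!("cloned {} expressions", plan.expressions().len());

    // The borrowing path computes the same information without clones.
    let required = RequiredColumns::default().with_plan_exprs(&plan).into_inner();
    println!("required input indices: {required:?}");
}
```

Keeping the index bookkeeping behind one type means callers never need an owned `Vec` of expressions just to ask which columns are used.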

Are these changes tested?

Covered by existing tests

Are there any user-facing changes?

No functional changes.

Performance benchmarks show an overall 14% improvement for physical_plan_tpch_all and physical_plan_tpcds_all, which is pretty neat.

Details

++ critcmp main optimize_proj_pt1
group                                         main                                   optimize_proj_pt1
-----                                         ----                                   -----------------
logical_aggregate_with_join                   1.02  1210.2±63.66µs        ? ?/sec    1.00  1188.2±32.04µs        ? ?/sec
logical_plan_tpcds_all                        1.02    158.0±1.26ms        ? ?/sec    1.00    154.3±1.56ms        ? ?/sec
logical_plan_tpch_all                         1.05     17.0±0.20ms        ? ?/sec    1.00     16.3±0.18ms        ? ?/sec
logical_select_all_from_1000                  1.04     19.2±0.16ms        ? ?/sec    1.00     18.5±0.09ms        ? ?/sec
logical_select_one_from_700                   1.01   792.6±42.68µs        ? ?/sec    1.00    787.7±7.31µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.01   743.5±10.96µs        ? ?/sec    1.00   732.9±28.34µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.01    729.8±8.98µs        ? ?/sec    1.00    720.9±9.40µs        ? ?/sec
physical_plan_tpcds_all                       1.14  1500.3±10.49ms        ? ?/sec    1.00  1311.0±13.83ms        ? ?/sec
physical_plan_tpch_all                        1.14     99.9±0.72ms        ? ?/sec    1.00     87.9±0.39ms        ? ?/sec
physical_plan_tpch_q1                         1.14      5.4±0.02ms        ? ?/sec    1.00      4.7±0.02ms        ? ?/sec
physical_plan_tpch_q10                        1.10      4.6±0.02ms        ? ?/sec    1.00      4.2±0.02ms        ? ?/sec
physical_plan_tpch_q11                        1.08      4.0±0.02ms        ? ?/sec    1.00      3.7±0.02ms        ? ?/sec
physical_plan_tpch_q12                        1.11      3.3±0.01ms        ? ?/sec    1.00      3.0±0.01ms        ? ?/sec
physical_plan_tpch_q13                        1.08      2.2±0.01ms        ? ?/sec    1.00      2.1±0.01ms        ? ?/sec
physical_plan_tpch_q14                        1.11      2.9±0.02ms        ? ?/sec    1.00      2.6±0.03ms        ? ?/sec
physical_plan_tpch_q16                        1.12      4.0±0.03ms        ? ?/sec    1.00      3.6±0.02ms        ? ?/sec
physical_plan_tpch_q17                        1.12      3.8±0.03ms        ? ?/sec    1.00      3.4±0.02ms        ? ?/sec
physical_plan_tpch_q18                        1.11      4.2±0.04ms        ? ?/sec    1.00      3.8±0.02ms        ? ?/sec
physical_plan_tpch_q19                        1.26      8.0±0.09ms        ? ?/sec    1.00      6.3±0.04ms        ? ?/sec
physical_plan_tpch_q2                         1.11      8.5±0.07ms        ? ?/sec    1.00      7.6±0.03ms        ? ?/sec
physical_plan_tpch_q20                        1.12      4.9±0.04ms        ? ?/sec    1.00      4.4±0.02ms        ? ?/sec
physical_plan_tpch_q21                        1.14      6.8±0.07ms        ? ?/sec    1.00      6.0±0.03ms        ? ?/sec
physical_plan_tpch_q22                        1.11      3.6±0.04ms        ? ?/sec    1.00      3.3±0.03ms        ? ?/sec
physical_plan_tpch_q3                         1.09      3.3±0.01ms        ? ?/sec    1.00      3.0±0.02ms        ? ?/sec
physical_plan_tpch_q4                         1.08      2.4±0.01ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
physical_plan_tpch_q5                         1.11      4.8±0.02ms        ? ?/sec    1.00      4.3±0.02ms        ? ?/sec
physical_plan_tpch_q6                         1.11  1735.8±11.71µs        ? ?/sec    1.00   1557.8±9.94µs        ? ?/sec
physical_plan_tpch_q7                         1.13      6.2±0.04ms        ? ?/sec    1.00      5.5±0.02ms        ? ?/sec
physical_plan_tpch_q8                         1.12      8.0±0.07ms        ? ?/sec    1.00      7.1±0.02ms        ? ?/sec
physical_plan_tpch_q9                         1.12      6.0±0.04ms        ? ?/sec    1.00      5.4±0.02ms        ? ?/sec
physical_select_all_from_1000                 1.49     90.9±0.30ms        ? ?/sec    1.00     60.9±0.20ms        ? ?/sec
physical_select_one_from_700                  1.08      3.7±0.05ms        ? ?/sec    1.00      3.4±0.02ms        ? ?/sec
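
The 14% figure in the description and the 12% figure in the PR title can both be read off these rows. The ratio column is the baseline time divided by the fastest time, while "faster planning" is more naturally the fraction of time saved. The sketch below works through that arithmetic for the two `*_all` physical planning rows; the function names are made up for this example and the mean timings are copied from the table above.

```rust
/// Ratio in the style of the table above: baseline time / fastest time.
fn slowdown_ratio(main_ms: f64, branch_ms: f64) -> f64 {
    main_ms / branch_ms
}

/// Fraction of the baseline planning time that the branch saves.
fn time_saved(main_ms: f64, branch_ms: f64) -> f64 {
    1.0 - branch_ms / main_ms
}

fn main() {
    // (benchmark, main mean in ms, optimize_proj_pt1 mean in ms)
    let rows = [
        ("physical_plan_tpch_all", 99.9, 87.9),
        ("physical_plan_tpcds_all", 1500.3, 1311.0),
    ];

    for (name, main_ms, branch_ms) in rows {
        println!(
            "{name}: ratio {:.2}, time saved {:.1}%",
            slowdown_ratio(main_ms, branch_ms),
            100.0 * time_saved(main_ms, branch_ms)
        );
    }
}
```

Under this reading, a ratio of about 1.14 corresponds to roughly 12% less planning time, so the two numbers describe the same measurement.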

@github-actions github-actions bot added the optimizer Optimizer rules label Apr 24, 2024
@alamb alamb force-pushed the alamb/optimize_proj_pt1 branch 3 times, most recently from f120d80 to 9bd4636 Compare April 24, 2024 19:02
/// See [Self::index_of_column] for a version that returns an error if the
/// column is not found
pub fn maybe_index_of_column(&self, col: &Column) -> Option<usize> {
    self.index_of_column_by_name(col.relation.as_ref(), &col.name)
Contributor Author

This hung me up for a while: the original code uses flat_map, which silently discards the Err when the column is not found. That was not obvious to me.
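
For readers unfamiliar with this behavior: `Result` implements `IntoIterator`, yielding the `Ok` value once and nothing for `Err`, so `flat_map` over a fallible lookup drops failures without any visible error handling. The sketch below shows this with hypothetical helper functions over plain string slices (they are not the DataFusion schema API), next to an `Option`-returning variant that makes the "skip if missing" intent explicit.

```rust
/// Hypothetical fallible lookup: errors when the column is missing.
fn index_of(columns: &[&str], name: &str) -> Result<usize, String> {
    columns
        .iter()
        .position(|c| *c == name)
        .ok_or_else(|| format!("column {name} not found"))
}

/// Hypothetical `Option` variant: a missing column is simply `None`.
fn maybe_index_of(columns: &[&str], name: &str) -> Option<usize> {
    columns.iter().position(|c| *c == name)
}

/// `Result` is iterable, so `flat_map` silently skips every `Err`.
fn indices_via_flat_map(columns: &[&str], wanted: &[&str]) -> Vec<usize> {
    wanted
        .iter()
        .flat_map(|name| index_of(columns, name))
        .collect()
}

/// Same result, but the skipping is visible in the code.
fn indices_via_filter_map(columns: &[&str], wanted: &[&str]) -> Vec<usize> {
    wanted
        .iter()
        .filter_map(|name| maybe_index_of(columns, name))
        .collect()
}

fn main() {
    let columns = ["a", "b", "c"];
    let wanted = ["c", "missing", "a"];

    // Both calls return only the indices of "c" and "a"; the error for
    // "missing" never surfaces in the flat_map version.
    println!("{:?}", indices_via_flat_map(&columns, &wanted));
    println!("{:?}", indices_via_filter_map(&columns, &wanted));
}
```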

Contributor

This is much clearer, thanks!

@alamb alamb changed the title Avoid some copies, encapsulate the handling of child indicies in OptimizeProjection Avoid some Expr copies, encapsulate the handling of child indicies in OptimizeProjection Apr 24, 2024
@alamb alamb changed the title Avoid some Expr copies, encapsulate the handling of child indicies in OptimizeProjection Avoid Expr copies OptimizeProjection, 12% faster planning, encapsulate indicies Apr 24, 2024
@alamb alamb marked this pull request as ready for review April 24, 2024 21:25
) -> Result<Option<LogicalPlan>> {
// `child_required_indices` stores
Contributor Author

this documentation moved to RequiredIndices

@@ -339,31 +318,35 @@ fn optimize_projections(
let left_len = join.left.schema().fields().len();
let (left_req_indices, right_req_indices) =
    split_join_requirements(left_len, indices, &join.join_type);
let exprs = plan.expressions();
Contributor Author

This call copies all expressions in the plan

@@ -123,12 +120,13 @@ fn optimize_projections(
// that appear in this plan's expressions to its child. All these
// operators benefit from "small" inputs, so the projection_beneficial
// flag is `true`.
let exprs = plan.expressions();
Contributor Author

Likewise, this was also copying all expressions

@@ -137,13 +135,9 @@ fn optimize_projections(
// that appear in this plan's expressions to its child. These operators
// do not benefit from "small" inputs, so the projection_beneficial
// flag is `false`.
let exprs = plan.expressions();
Contributor Author

And another copy removed

@@ -178,16 +170,12 @@ fn optimize_projections(
Make sure `.necessary_children_exprs` implementation of the `UserDefinedLogicalNode` is \
consistent with actual children length for the node.");
}
// Expressions used by node.
let exprs = plan.expressions();
Contributor Author

Another copy

@@ -408,26 +392,6 @@ fn optimize_projections(
}
}

/// This function applies the given function `f` to the projection indices
Contributor Author

All of this code was moved into RequiredIndices as methods, both to encapsulate the logic a bit more and to allow creation without copying.

Contributor

@mustafasrepo mustafasrepo left a comment

Thanks @alamb. This PR LGTM! As well as improving performance, it also improves maintainability.

Contributor Author

alamb commented Apr 25, 2024

Thanks for the review @mustafasrepo -- I am slowly working through how to refactor this pass to avoid copies. I think a few more PRs and I'll have it.

@alamb alamb merged commit 5c86db0 into apache:main Apr 25, 2024
24 checks passed
@alamb alamb deleted the alamb/optimize_proj_pt1 branch April 25, 2024 19:28
ccciudatu pushed a commit to hstack/arrow-datafusion that referenced this pull request Apr 26, 2024
2 participants