Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop copying LogicalPlan and Exprs in EliminateCrossJoin (4% faster planning) #10431

Merged
merged 4 commits into from
May 13, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented May 8, 2024

Draft as it builds on

Which issue does this PR close?

Closes #10287

Rationale for this change

Make planning faster by avoiding copying

What changes are included in this PR?

  1. Change EliminateCrossJoin to use TreeNode API and stop copying plans
  2. Reduce Expr copies more
  3. Improve documentation

Are these changes tested?

Existing CI

Are there any user-facing changes?

No functional changes,

Performance change: 4% better planning

(note this includes the improvements from #10427 as well)

Details

group                                         eliminate_cross_join                   main
-----                                         --------------------                   ----
logical_aggregate_with_join                   1.00  1214.6±15.54µs        ? ?/sec    1.00  1218.8±14.06µs        ? ?/sec
logical_plan_tpcds_all                        1.01    161.6±1.93ms        ? ?/sec    1.00    159.4±1.84ms        ? ?/sec
logical_plan_tpch_all                         1.02     17.2±0.19ms        ? ?/sec    1.00     16.9±0.16ms        ? ?/sec
logical_select_all_from_1000                  1.00     18.0±0.08ms        ? ?/sec    1.05     18.8±0.09ms        ? ?/sec
logical_select_one_from_700                   1.00   822.4±15.44µs        ? ?/sec    1.00   820.4±44.09µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.01    765.5±9.28µs        ? ?/sec    1.00   759.0±13.89µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.01    751.8±9.19µs        ? ?/sec    1.00   741.7±23.29µs        ? ?/sec
physical_plan_tpcds_all                       1.00   1310.5±5.75ms        ? ?/sec    1.04   1357.6±7.25ms        ? ?/sec
physical_plan_tpch_all                        1.00     90.1±0.88ms        ? ?/sec    1.04     94.1±0.82ms        ? ?/sec
physical_plan_tpch_q1                         1.00      3.9±0.05ms        ? ?/sec    1.33      5.2±0.07ms        ? ?/sec
physical_plan_tpch_q10                        1.00      4.3±0.06ms        ? ?/sec    1.03      4.4±0.07ms        ? ?/sec
physical_plan_tpch_q11                        1.00      3.9±0.06ms        ? ?/sec    1.03      4.0±0.05ms        ? ?/sec
physical_plan_tpch_q12                        1.00      3.0±0.05ms        ? ?/sec    1.02      3.1±0.07ms        ? ?/sec
physical_plan_tpch_q13                        1.00      2.1±0.03ms        ? ?/sec    1.00      2.2±0.04ms        ? ?/sec
physical_plan_tpch_q14                        1.00      2.5±0.04ms        ? ?/sec    1.10      2.8±0.04ms        ? ?/sec
physical_plan_tpch_q16                        1.00      3.6±0.04ms        ? ?/sec    1.07      3.9±0.04ms        ? ?/sec
physical_plan_tpch_q17                        1.00      3.5±0.05ms        ? ?/sec    1.04      3.7±0.05ms        ? ?/sec
physical_plan_tpch_q18                        1.00      3.9±0.07ms        ? ?/sec    1.03      4.1±0.04ms        ? ?/sec
physical_plan_tpch_q19                        1.00      5.9±0.05ms        ? ?/sec    1.08      6.4±0.05ms        ? ?/sec
physical_plan_tpch_q2                         1.00      7.7±0.06ms        ? ?/sec    1.03      7.9±0.05ms        ? ?/sec
physical_plan_tpch_q20                        1.00      4.6±0.06ms        ? ?/sec    1.02      4.7±0.06ms        ? ?/sec
physical_plan_tpch_q21                        1.00      6.2±0.08ms        ? ?/sec    1.02      6.3±0.05ms        ? ?/sec
physical_plan_tpch_q22                        1.00      3.4±0.06ms        ? ?/sec    1.04      3.5±0.05ms        ? ?/sec
physical_plan_tpch_q3                         1.00      3.1±0.04ms        ? ?/sec    1.07      3.3±0.04ms        ? ?/sec
physical_plan_tpch_q4                         1.00      2.3±0.03ms        ? ?/sec    1.04      2.4±0.06ms        ? ?/sec
physical_plan_tpch_q5                         1.00      4.4±0.06ms        ? ?/sec    1.05      4.7±0.05ms        ? ?/sec
physical_plan_tpch_q6                         1.00  1507.9±41.94µs        ? ?/sec    1.06  1598.3±24.25µs        ? ?/sec
physical_plan_tpch_q7                         1.00      5.7±0.06ms        ? ?/sec    1.03      5.8±0.07ms        ? ?/sec
physical_plan_tpch_q8                         1.00      7.3±0.05ms        ? ?/sec    1.03      7.5±0.07ms        ? ?/sec
physical_plan_tpch_q9                         1.00      5.6±0.07ms        ? ?/sec    1.02      5.7±0.07ms        ? ?/sec
physical_select_all_from_1000                 1.00     58.7±0.25ms        ? ?/sec    1.05     61.5±1.01ms        ? ?/sec
physical_select_one_from_700                  1.00      3.7±0.03ms        ? ?/sec    1.02      3.7±0.05ms        ? ?/sec

@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules labels May 8, 2024
@alamb alamb force-pushed the alamb/eliminate_cross_join branch 3 times, most recently from 6ec02a4 to 1fcc6d1 Compare May 9, 2024 13:58
@@ -237,7 +324,7 @@ fn find_inner_join(
)?);

return Ok(LogicalPlan::Join(Join {
left: Arc::new(left_input.clone()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is one example where the plans were cloned

}
}
_ => all_inputs.push(child.clone()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is an example of the clones that are removed

@@ -259,7 +346,7 @@ fn find_inner_join(
)?);

Ok(LogicalPlan::CrossJoin(CrossJoin {
left: Arc::new(left_input.clone()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise here is another removed clone

};

// If there are no join keys then do nothing:
if all_join_keys.is_empty() {
Filter::try_new(predicate.clone(), Arc::new(left))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also removed several expr clones

@alamb alamb changed the title Stop copying LogicalPlan and Exprs in EliminateCrossJoin Stop copying LogicalPlan and Exprs in EliminateCrossJoin (4% faster planning) May 9, 2024
@alamb alamb force-pushed the alamb/eliminate_cross_join branch from 1fcc6d1 to 41be324 Compare May 13, 2024 12:08
@github-actions github-actions bot removed the logical-expr Logical plan and expressions label May 13, 2024
@alamb alamb force-pushed the alamb/eliminate_cross_join branch from 41be324 to 05d0946 Compare May 13, 2024 13:04
}
}

let can_flatten_inputs = can_flatten_join_inputs(&plan);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this took a fair amount of finagling to split up the cases where the rewrite will happen and thus we should destructure the Filter and when not. I believe the logic is all the same, but the code needed to be reorganized

/// 'select ... from a, b, c where (a.x = b.y and b.xx = 100 and a.z = c.z)
/// or (a.x = b.y and b.xx = 200 and a.z=c.z);'
/// 'select ... from a, b where a.x > b.y'
/// Eliminate cross joins by rewriting them to inner joins when possible.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I was in here and had it all in my head, I updated the documentation to explain how things worked

@alamb alamb marked this pull request as ready for review May 13, 2024 13:17
}

if !can_flatten_join_inputs(&filter.input) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this condition can go earlier to save some cpu?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the call to rewrite_children needs to happen to apply the rule recursively before proceeding

I am not quite sure what else we could save.

I tried to make this clearer in 030444d with comments and moving another definition of can_flatten_join_inputs closer to where it was used.

// The filter of inner join will lost, skip this rule.
// issue: https://github.com/apache/datafusion/issues/4844
return Ok(false);
return internal_err!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Filter::try_new(predicate.clone(), Arc::new(left))
.map(|f| Some(LogicalPlan::Filter(f)))
Filter::try_new(predicate, Arc::new(left))
.map(LogicalPlan::Filter)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do it in 1 map iteration ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in cb8c1f1

Some(filter_expr) => Filter::try_new(filter_expr, Arc::new(left))
.map(|f| Some(LogicalPlan::Filter(f))),
_ => Ok(Some(left)),
.map(LogicalPlan::Filter)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to have 1 map iteration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in cb8c1f1

///
/// After the rule is applied, the plan will look like this:
/// ```text
/// Filter(b.xx = 100)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope the pushdown filter predicate rule comes after this rule

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes absolutely

I double checked and PushDownFilter does indeed run after this pass:

Arc::new(EliminateCrossJoin::new()),
Arc::new(CommonSubexprEliminate::new()),
Arc::new(EliminateLimit::new()),
Arc::new(PropagateEmptyRelation::new()),
// Must be after PropagateEmptyRelation
Arc::new(EliminateOneUnion::new()),
Arc::new(FilterNullJoinKeys::default()),
Arc::new(EliminateOuterJoin::new()),
// Filters can't be pushed down past Limits, we should do PushDownFilter after PushDownLimit
Arc::new(PushDownLimit::new()),
Arc::new(PushDownFilter::new()),
Arc::new(SingleDistinctToGroupBy::new()),

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alamb I'd say it is great

@alamb
Copy link
Contributor Author

alamb commented May 13, 2024

Thank you for the review @comphead -- I think I addressed all your comments

@alamb alamb merged commit adf0bfc into apache:main May 13, 2024
23 checks passed
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
… planning) (apache#10431)

* Stop copying LogicalPlan and Exprs in `EliminateCrossJoin`

* Clarify when can_flatten_join_inputs runs

* Use a single `map`
@alamb alamb deleted the alamb/eliminate_cross_join branch October 18, 2024 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stop copying LogicalPlan and Exprs in EliminateCrossJoin
2 participants