[SPARK-38034][SQL] Optimize TransposeWindow rule #35334

constzhou · 2022-01-26T11:39:05Z

What changes were proposed in this pull request?

Optimize the TransposeWindow rule so that it applies to more cases and runs with lower time complexity.
The TransposeWindow rule tries to eliminate unnecessary shuffles. However, the function compatiblePartitions only considers the first n elements of the window2 partition sequence (n being the length of the window1 partition sequence), so in some cases the rule does not take effect, such as the case below:

val df = spark.range(10).selectExpr("id AS a", "id AS b", "id AS c", "id AS d")
df.selectExpr(
  "sum(d) OVER(PARTITION BY b,a) as e",
  "sum(c) OVER(PARTITION BY a) as f"
).explain

Current plan:

== Physical Plan ==
*(5) Project [e#10L, f#11L]
+- Window [sum(c#4L) windowspecdefinition(a#2L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS f#11L], [a#2L]
   +- *(4) Sort [a#2L ASC NULLS FIRST], false, 0
      +- Exchange hashpartitioning(a#2L, 200), true, [id=#41]
         +- *(3) Project [a#2L, c#4L, e#10L]
            +- Window [sum(d#5L) windowspecdefinition(b#3L, a#2L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS e#10L], [b#3L, a#2L]
               +- *(2) Sort [b#3L ASC NULLS FIRST, a#2L ASC NULLS FIRST], false, 0
                  +- Exchange hashpartitioning(b#3L, a#2L, 200), true, [id=#33]
                     +- *(1) Project [id#0L AS d#5L, id#0L AS b#3L, id#0L AS a#2L, id#0L AS c#4L]
                        +- *(1) Range (0, 10, step=1, splits=10)

Expected plan:

== Physical Plan ==
*(4) Project [e#924L, f#925L]
+- Window [sum(d#43L) windowspecdefinition(b#41L, a#40L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS e#924L], [b#41L, a#40L]
   +- *(3) Sort [b#41L ASC NULLS FIRST, a#40L ASC NULLS FIRST], false, 0
      +- *(3) Project [d#43L, b#41L, a#40L, f#925L]
         +- Window [sum(c#42L) windowspecdefinition(a#40L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS f#925L], [a#40L]
            +- *(2) Sort [a#40L ASC NULLS FIRST], false, 0
               +- Exchange hashpartitioning(a#40L, 200), true, [id=#282]
                  +- *(1) Project [id#38L AS d#43L, id#38L AS b#41L, id#38L AS a#40L, id#38L AS c#42L]
                     +- *(1) Range (0, 10, step=1, splits=10)

Also, the permutations method has O(n!) time complexity, which is very expensive when there are many partition columns, so we should optimize it.
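
The difference between the prefix-based check and a subset-style check can be sketched outside Spark. This is a minimal illustration under stated assumptions, not necessarily the merged code: plain strings stand in for Catalyst Expression objects, == stands in for semanticEquals, and the names CompatiblePartitionsSketch, prefixPermutationCheck and subsetCheck are hypothetical.

```scala
object CompatiblePartitionsSketch {
  // Approximates the old check: only the first ps1.length elements of ps2
  // are considered, and every ordering of that prefix is tried.
  def prefixPermutationCheck(ps1: Seq[String], ps2: Seq[String]): Boolean =
    ps1.length < ps2.length &&
      ps2.take(ps1.length).permutations.exists { perm =>
        ps1.zip(perm).forall { case (l, r) => l == r }
      }

  // Assumed alternative: every element of ps1 must appear somewhere in ps2.
  // At most ps1.length * ps2.length comparisons, i.e. O(n^2).
  def subsetCheck(ps1: Seq[String], ps2: Seq[String]): Boolean =
    ps1.length < ps2.length && ps1.forall(e => ps2.exists(_ == e))

  def main(args: Array[String]): Unit = {
    val outer = Seq("a")      // PARTITION BY a
    val inner = Seq("b", "a") // PARTITION BY b,a
    println(prefixPermutationCheck(outer, inner))
    println(subsetCheck(outer, inner))
  }
}
```

For the example above (PARTITION BY a against PARTITION BY b,a), the first check returns false because the prefix it inspects is [b], while the second returns true because a appears somewhere in [b, a].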

Why are the changes needed?

The rule can be applied in more cases, which improves execution performance by eliminating unnecessary shuffles. Reducing the time complexity from O(n!) to O(n^2) also improves the performance of the rule itself.
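
To give a feel for the gap between the two bounds, the following sketch prints n! next to n^2 for a few partition-column counts. It is plain arithmetic and does not measure Spark; the object and method names are hypothetical.

```scala
object ComplexityGrowthSketch {
  // Number of orderings of n distinct partition expressions: n!
  def factorial(n: Int): BigInt =
    (1 to n).foldLeft(BigInt(1))((acc, i) => acc * BigInt(i))

  // Upper bound on pairwise comparisons for two specs of length at most n.
  def quadratic(n: Int): BigInt = BigInt(n) * BigInt(n)

  def main(args: Array[String]): Unit =
    for (n <- Seq(5, 10, 15))
      println(s"n=$n  n!=${factorial(n)}  n^2=${quadratic(n)}")
}
```

With 10 partition columns that is 3628800 candidate orderings against 100 pairwise comparisons.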

Does this PR introduce any user-facing change?

no

How was this patch tested?

Unit tests.

constzhou · 2022-01-30T04:01:47Z

Thanks for correcting the title, @HyukjinKwon, and thanks for reviewing it, @tanelk. Should we find more people to help review it?

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/TransposeWindowSuite.scala

HyukjinKwon · 2022-02-08T10:28:48Z

cc @hvanhovell FYI

constzhou · 2022-02-17T02:12:09Z

Gentle ping @hvanhovell :)
Also cc @wangyum @cloud-fan: this is an improvement to the optimizer rule TransposeWindow. Could anyone help take another look at it? Thanks!

github-actions · 2022-05-29T00:19:49Z

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

peter-toth · 2022-08-03T09:06:14Z

@cloud-fan, @hvanhovell, I think this is a pretty good optimization.
The old compatiblePartitions() was very compute-intensive when Window nodes have wide partitionSpecs.

We have a customer who upgraded from Spark 2 to Spark 3 and suffered severe performance degradation due to the new TransposeWindow rule in Spark 3...

cloud-fan · 2022-08-04T02:51:45Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala

@@ -1148,9 +1148,9 @@ object CollapseWindow extends Rule[LogicalPlan] {
 */
 object TransposeWindow extends Rule[LogicalPlan] {
  private def compatiblePartitions(ps1 : Seq[Expression], ps2: Seq[Expression]): Boolean = {
-    ps1.length < ps2.length && ps2.take(ps1.length).permutations.exists(ps1.zip(_).forall {


I'd consider this as a perf bug...

You mean we should only optimize the performance without changing the logic? Maybe we could find a way to avoid the permutation.

I mean the previous code is kind of buggy. This PR is more of a bug fix than a perf improvement, and we should backport it.
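
For comparison, the option raised in this thread of keeping the prefix logic while avoiding permutations could look like the sketch below. It is an illustration only and, as far as this thread shows, not what the PR implements: strings stand in for Expression, == stands in for semanticEquals, and PrefixMultisetSketch and samePrefixMultiset are hypothetical names. It tests whether ps1 and the first ps1.length elements of ps2 contain the same elements regardless of order, which is the question the permutation enumeration answers, using at most O(n^2) comparisons.

```scala
import scala.collection.mutable.ArrayBuffer

object PrefixMultisetSketch {
  def samePrefixMultiset(ps1: Seq[String], ps2: Seq[String]): Boolean = {
    if (ps1.length >= ps2.length) {
      false
    } else {
      // Elements of the prefix of ps2 that have not been matched yet.
      val remaining = ArrayBuffer.empty[String]
      remaining ++= ps2.take(ps1.length)
      // Each element of ps1 must consume one distinct element of the prefix.
      ps1.forall { e =>
        val i = remaining.indexWhere(_ == e)
        if (i >= 0) { remaining.remove(i); true } else false
      }
    }
  }
}
```

This keeps the old behaviour, so it would still reject the PARTITION BY a against PARTITION BY b,a case from the description; it only removes the factorial cost.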

cloud-fan · 2022-08-04T02:54:05Z

@constzhou sorry for the late review. Can you rebase this PR to retrigger the tests?

constzhou · 2022-08-04T03:33:33Z

> @constzhou sorry for the late review. Can you rebase this PR to retrigger the tests?

It's OK :) triggered now

cloud-fan · 2022-08-05T10:41:26Z

I think the previous O(n!) time complexity code is unexpected and buggy. Let's backport this PR.

Closes #35334 from constzhou/SPARK-38034_optimize_transpose_window_rule.
Authored-by: xzhou
Signed-off-by: Wenchen Fan
(cherry picked from commit 0cc331d)
Signed-off-by: Wenchen Fan

cloud-fan · 2022-08-05T10:44:57Z

thanks, merging to master/3.3/3.2! (3.1 has conflicts and I didn't backport)

AmplabJenkins · 2022-08-05T20:11:57Z

Can one of the admins verify this patch?

Closes apache#35334 from constzhou/SPARK-38034_optimize_transpose_window_rule.
Authored-by: xzhou
Signed-off-by: Wenchen Fan
(cherry picked from commit 0cc331d)
Signed-off-by: Wenchen Fan
(cherry picked from commit 0f609ff)

HyukjinKwon changed the title ~~[SPARK-34807][SQL] Optimize TransposeWindow rule~~ [SPARK-38034][SQL] Optimize TransposeWindow rule Jan 27, 2022

tanelk approved these changes Jan 28, 2022

HyukjinKwon reviewed Feb 3, 2022

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/TransposeWindowSuite.scala (outdated)

HyukjinKwon reviewed Feb 8, 2022

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/TransposeWindowSuite.scala (outdated)

github-actions bot added the SQL label Feb 8, 2022

github-actions bot added the Stale label May 29, 2022

github-actions bot closed this May 30, 2022

cloud-fan removed the Stale label Aug 4, 2022

cloud-fan reopened this Aug 4, 2022

cloud-fan reviewed Aug 4, 2022

xzhou added 2 commits August 4, 2022 11:15

optimize TransposeWindow rule

2c1c4dd

modify UT description

0d8e472

constzhou force-pushed the SPARK-38034_optimize_transpose_window_rule branch from 75bc299 to 0d8e472 Compare August 4, 2022 03:22

cloud-fan closed this in 0cc331d Aug 5, 2022
