-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-16844][SQL] Support codegen for sort-based aggreagate #17164
Conversation
Test build #73907 has finished for PR 17164 at commit
|
A benchmark result:
|
|
|
@hvanhovell I reworked #14481 though, I'm not sure it is still worth trying this codegen. Could you give me insight first? Thanks! |
Test build #73908 has finished for PR 17164 at commit
|
@maropu I think this is pretty exciting. This is very useful in situations where we have a lot of groups, in that case I will happily take a 2x performance improvement any day. This is still pretty decent if you consider that this aggregate is dominate by sorting. |
@hvanhovell okay! I'll brush up code, then if I finished, I'll let you know for code review. Thanks. |
a45048a
to
b29c22d
Compare
Test build #73924 has finished for PR 17164 at commit
|
Test build #73925 has finished for PR 17164 at commit
|
Test build #73926 has finished for PR 17164 at commit
|
Test build #73935 has finished for PR 17164 at commit
|
8413dd7
to
29c713b
Compare
Test build #73948 has finished for PR 17164 at commit
|
Test build #73949 has finished for PR 17164 at commit
|
Test build #73952 has finished for PR 17164 at commit
|
Test build #73974 has started for PR 17164 at commit |
Jenkins, retest this please. |
Test build #73986 has finished for PR 17164 at commit
|
Jenkins, retest this please. |
Test build #73993 has finished for PR 17164 at commit
|
Test build #74066 has finished for PR 17164 at commit
|
This pr added an new SQL option |
@hvanhovell ping |
val aggTime = metricTerm(ctx, "aggTime") | ||
val beforeAgg = ctx.freshName("beforeAgg") | ||
s""" | ||
| while (!$initAgg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to use while
? Can we use if
instead of while
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC we can't because continue
may exist in ${consume(ctx, resultVars).trim}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks
| $currentGroupingKeyTerm = null; | ||
| | ||
| if (shouldStop()) return; | ||
| } while (false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this do { } while (false);
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
@hvanhovell ping |
Test build #74569 has finished for PR 17164 at commit
|
@hvanhovell ping |
@maropu I do think this is useful. However we really need to refactor the planner, if we want to get the most value from this. |
okay, it'd be better to close this? |
@maropu I am not sure. I like to keep interesting PRs open. Would you be interested in doing some work on the planner? |
okay, I keep this open. Yea, sure and I'm interested in. If there are sub-tasks for that, I'd be grad if you ping me. Thanks! |
@maropu Maybe we can close this PR at first? |
ok |
What changes were proposed in this pull request?
This pr supported codegen for
SortAggregate
.This is the rework of #14481.
Close #14481
How was this patch tested?
Checked tests in
DataFrameAggregateSuite
.