Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batching support for ROW-based FIRST() window function #9489

Merged
merged 12 commits into from
Oct 27, 2023

Conversation

mythrocks
Copy link
Collaborator

@mythrocks mythrocks commented Oct 19, 2023

Partially addresses #9299.

This commit adds support for batched execution of FIRST() aggregations row-based window functions. This helps relax the erstwhile requirement that the window aggregation be executed over a single in-memory batch, thereby alleviating memory pressure and out-of-memory failures.

This commit adds support for `FIRST()` window functions, running in a ROWS
context.

This does not currently support ignore/include nulls.

Signed-off-by: MithunR <[email protected]>
Plus, tests for partitioned/unpartitioned, and keep/ignore nulls.
Also removed unused import.
@mythrocks
Copy link
Collaborator Author

Build

@mythrocks mythrocks self-assigned this Oct 19, 2023
@mythrocks mythrocks added improve reliability Features to improve reliability or bugs that severly impact the reliability of the plugin labels Oct 19, 2023
revans2
revans2 previously approved these changes Oct 23, 2023
Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks okay to me, but It would be nice to not have First as a special case in GpuWindowExec. Also because you are not doing last could you file a follow on issue for it?

@mythrocks
Copy link
Collaborator Author

It would be nice to not have First as a special case in GpuWindowExec.

Agreed. The latest commit addresses that. The snag was that GpuNthValue is only selectively capable of running-window optimization, i.e. when n == 1. I've made the fixers' returns Optional, to handle this case.

because you are not doing last could you file a follow on issue for it?

I've filed #9520. I'd be happy to tackle it right after the FIRST() for RANGE queries.

@mythrocks
Copy link
Collaborator Author

Build

@mythrocks
Copy link
Collaborator Author

Build

@mythrocks
Copy link
Collaborator Author

The failure last time around was a Parquet test for timestamp push down.

@revans2
Copy link
Collaborator

revans2 commented Oct 25, 2023

The failure last time around was a Parquet test for timestamp push down.

That is a known issue and the test has been disabled while CUDF tries to fix it. Please upmerge and then rerun your tests.

revans2
revans2 previously approved these changes Oct 25, 2023
@mythrocks
Copy link
Collaborator Author

Build

@mythrocks
Copy link
Collaborator Author

This test appears to be failing because the IGNORE NULLS syntax seems to have changed between Spark versions. I'm investigating.

@mythrocks
Copy link
Collaborator Author

Build

@pxLi
Copy link
Collaborator

pxLi commented Oct 27, 2023

build

@mythrocks mythrocks merged commit ddb8f6b into NVIDIA:branch-23.12 Oct 27, 2023
29 of 30 checks passed
@mythrocks
Copy link
Collaborator Author

I've merged this change. Thank you for the reviews.

Focusing on #9544 now, for RANGE support.

mythrocks added a commit to mythrocks/spark-rapids that referenced this pull request Oct 30, 2023
Fixes NVIDIA#9569.

NVIDIA#9489 added `NTH_VALUE()` tests with option to `IGNORE NULLS`, but mistakenly
enabled `IGNORE NULLS` for Spark versions prior to `3.2.1`.

This commit restricts tests for `IGNORE NULLS` to only Spark versions
exceeding `3.1.x`, where the feature is available.
mythrocks added a commit to mythrocks/spark-rapids that referenced this pull request Oct 30, 2023
Fixes NVIDIA#9569.

NVIDIA#9489 added `NTH_VALUE()` tests with option to `IGNORE NULLS`, but mistakenly
enabled `IGNORE NULLS` for Spark versions prior to `3.2.1`.

This commit restricts tests for `IGNORE NULLS` to only Spark versions
exceeding `3.1.x`, where the feature is available.

Signed-off-by: MithunR <[email protected]>
pxLi pushed a commit that referenced this pull request Oct 31, 2023
Fixes #9569.

#9489 added `NTH_VALUE()` tests with option to `IGNORE NULLS`, but mistakenly
enabled `IGNORE NULLS` for Spark versions prior to `3.2.1`.

This commit restricts tests for `IGNORE NULLS` to only Spark versions
exceeding `3.1.x`, where the feature is available.

Signed-off-by: MithunR <[email protected]>
mythrocks added a commit to mythrocks/spark-rapids that referenced this pull request Nov 3, 2023
Fixes NVIDIA#9520.
Fixes NVIDIA#9299.

This is a followup to the changes in NVIDIA#9489, which adds the running
window optimization to the `FIRST()` window fuction.

This commit adds running window support for the `LAST()` window
function, with support for both `ROWS` and `RANGE` based window
specifications.
This change should allow the `LAST()` aggregation to run in multiple
batches if the window is `[UNBOUNDED PRECEDING, CURRENT ROW`]. This
will allow for much larger window and group sizes, because the group
no longer needs to fit in GPU memory. This should help mitigate
out-of-memory errors with `LAST()` window aggregations.

Signed-off-by: MithunR <[email protected]>
mythrocks added a commit that referenced this pull request Nov 6, 2023
* Running window optimization for LAST()

Fixes #9520.
Fixes #9299.

This is a followup to the changes in #9489, which adds the running
window optimization to the `FIRST()` window fuction.

This commit adds running window support for the `LAST()` window
function, with support for both `ROWS` and `RANGE` based window
specifications.
This change should allow the `LAST()` aggregation to run in multiple
batches if the window is `[UNBOUNDED PRECEDING, CURRENT ROW`]. This
will allow for much larger window and group sizes, because the group
no longer needs to fit in GPU memory. This should help mitigate
out-of-memory errors with `LAST()` window aggregations.

Signed-off-by: MithunR <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improve reliability Features to improve reliability or bugs that severly impact the reliability of the plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants