Support "WindowGroupLimit" optimization on GPU [databricks] #10500
Merged
1. Rough idea of algo. 2. Not bothering with the 3.5 shim now. 3. NPE when running BasicWindowCalc.
Without this, the runtime assumes that the ranking function is a CPU function, even though the GpuWindowGroupLimitExec is on the GPU. Signed-off-by: MithunR <[email protected]>
mythrocks force-pushed the window-group-limit branch from 83cf223 to 6b1842d on February 26, 2024 06:57
revans2 reviewed Feb 26, 2024
sql-plugin/src/main/spark350/scala/com/nvidia/spark/rapids/shims/GpuWindowGroupLimitExec.scala (resolved)
sql-plugin/src/main/spark350/scala/com/nvidia/spark/rapids/shims/GpuWindowGroupLimitExec.scala (resolved)
sql-plugin/src/main/spark350/scala/com/nvidia/spark/rapids/shims/SparkShims.scala (outdated, resolved)
revans2 approved these changes Feb 28, 2024
Could we test on Databricks? They tend to pull things back, and this feels like one of the things that they would pull back.
mythrocks changed the title from "Support "WindowGroupLimit" optimization on GPU" to "Support "WindowGroupLimit" optimization on GPU [databricks]" on Feb 28, 2024
I just confirmed that this change works on Databricks as well.
I've merged this change. Thank you for the reviews and advice, @revans2.
mythrocks added a commit that referenced this pull request on Mar 11, 2024
WindowGroupLimit support for [databricks]. Fixes #10531.
This is a followup to #10500, which added support to push down window-group-limit filters before the shuffle phase. #10500 inadvertently neglected to ensure that the optimization works on Databricks. (It turns out that window-group-limit was cherry-picked into Databricks 13.3, despite the nominal Spark version being `3.4.1`.) This change ensures that the same optimization is available on Databricks 13.3 (and beyond).
Signed-off-by: MithunR <[email protected]>
Fixes #8208.
This commit adds support for `WindowGroupLimitExec` to run on GPU. This optimization was added in Apache Spark 3.5 to reduce the number of rows that participate in shuffles, for queries that filter on the result of a ranking function. For example, consider a query that computes `RANK()` over each window group as `rnk` and then keeps only the rows where `rnk < 10`. Such a query requires a shuffle so that all rows in a window group are available in the same task.
In Spark 3.5, an optimization was added in SPARK-37099 to take advantage of the `rnk < 10` predicate to reduce shuffle load. Specifically, since only 9 (i.e. 10-1) ranks can satisfy the predicate, only the rows holding those ranks need be shuffled into the task, per input batch. By pre-filtering rows that can't possibly satisfy the condition, the number of shuffled records can be reduced.
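The pre-filter is safe because a row's rank within a single batch can never be higher than its rank within the whole window group: a batch holds only a subset of the rows that sort ahead of it. The following is a minimal plain-Python sketch of that argument, not Spark or plugin code. The function names, the `(key, value)` row shape, and the `bound` parameter (the exclusive upper bound in `rnk < bound`) are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_rows(rows):
    """Attach a RANK()-style rank to (key, value) rows, partitioned by key, ordered by value."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    ranked = []
    for key, values in groups.items():
        values.sort()
        for i, value in enumerate(values):
            if i > 0 and value == values[i - 1]:
                rank = ranked[-1][2]  # ties share the rank of the previous row
            else:
                rank = i + 1  # 1 + number of rows that sort strictly before this one
            ranked.append((key, value, rank))
    return ranked


def filter_by_rank(rows, bound):
    """Keep the (key, value) rows whose rank satisfies `rnk < bound`."""
    return [(k, v) for k, v, rnk in rank_rows(rows) if rnk < bound]


def query_without_prefilter(batches, bound):
    """Shuffle every row, then rank and filter once."""
    all_rows = [row for batch in batches for row in batch]
    return sorted(filter_by_rank(all_rows, bound)), len(all_rows)


def query_with_prefilter(batches, bound):
    """Rank and filter each batch first, shuffle the survivors, then rank and filter again."""
    shuffled = [row for batch in batches for row in filter_by_rank(batch, bound)]
    return sorted(filter_by_rank(shuffled, bound)), len(shuffled)
```

Both functions return the final rows together with the number of rows that had to be shuffled, so the two plans can be compared on the same input batches. With ties, more than `bound - 1` rows per group can survive the pre-filter, since every row sharing a qualifying rank is kept.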
The GPU implementation (i.e. `GpuWindowGroupLimitExec`) differs slightly from the CPU implementation, because it needs to execute on the entire input column batch. As a result, `GpuWindowGroupLimitExec` runs the rank scan on each input batch, and then filters out ranks that exceed the limit specified in the predicate (`rnk < 10`). After the shuffle, the `RANK()` is calculated again by `GpuRunningWindowExec`, to produce the final result.

The current implementation addresses `RANK()` and `DENSE_RANK()` window functions. Other ranking functions (like `ROW_NUMBER()`) can be added at a later date.
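The batch-at-a-time step can be pictured as a running scan over columns that are already sorted by partition key and order value, followed by a boolean mask. The sketch below uses plain Python lists as stand-ins for GPU columns and is not the plugin's implementation. `rank_scan`, `group_limit_mask`, and the `bound` parameter are hypothetical names chosen for this illustration.

```python
def rank_scan(part_keys, order_vals, dense=False):
    """Running RANK() (or DENSE_RANK() when dense=True) over one sorted batch.

    Assumes the batch is already sorted by (part_key, order_val).
    """
    ranks = []
    row_in_group = 0
    rank = 0
    for i in range(len(part_keys)):
        if i == 0 or part_keys[i] != part_keys[i - 1]:
            # First row of a new window group.
            row_in_group = 1
            rank = 1
        else:
            row_in_group += 1
            if order_vals[i] != order_vals[i - 1]:
                # RANK() jumps to the row position; DENSE_RANK() advances by one.
                rank = rank + 1 if dense else row_in_group
        ranks.append(rank)
    return ranks


def group_limit_mask(part_keys, order_vals, bound, dense=False):
    """True for rows that may still satisfy `rnk < bound`, False for rows to drop."""
    return [r < bound for r in rank_scan(part_keys, order_vals, dense)]
```

The mask is applied to the batch before the shuffle. The ranks computed here are discarded, since the final rank is recomputed after the shuffle over the surviving rows.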