[FEA] Support window operations on Decimal #1333
Comments
CUDF Decimal support in window functions:
I have no idea what "collect list" refers to.
I ran some of the cuDF Java tests and found that all window operations required by this issue can run on decimal columns except
Thanks, then we should split this up into at least 2 issues. The first is for what we can support today. The second should be a follow-on to implement sum, and then we can make sure that there is an issue in CUDF to track it. I asked @razajafri to take the lead on making sure that all of the issues we need are in cudf, so please coordinate on that. The final issue is for collect_list, for which @mythrocks is working on a general cudf implementation (we do not currently support it for any window operations).
rapidsai/cudf#7061 and rapidsai/cudf#7037 are both merged in and can be supported by the plugin. I will create an issue for
I have opened a cudf issue to track
This pull request verifies window operations on decimal columns in the Java package, as required by spark-rapids [issue 1333](NVIDIA/spark-rapids#1333). Authors: - sperlingxx <[email protected]> Approvers: - Robert (Bobby) Evans (@revans2) URL: #7120
Lead, Lag, Max, and Min were merged in as a part of #1512. Both sum and row_number are merged into cudf, so we should be able to get those merged in shortly.
Hi @revans2, I ran into a problem while wrapping the sum support for the decimal type. Spark Catalyst applies a hard-coded precision promotion to the output decimal type of the sum expression (code link), which leads to precision overflow for any input decimal whose precision is greater than 8.
@sperlingxx We want to match what Spark does. We are just not going to be able to support sums on inputs with a precision greater than 8. This is the same limitation that we have for doing a sum with a group-by or a reduction in Spark.
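For reference, here is a minimal sketch of the arithmetic behind the limit of 8 discussed above. It assumes that Spark widens the result of a sum over `Decimal(p, s)` by 10 digits of precision (capped at 38) and that the GPU side stores decimals in 64 bits, which hold at most 18 digits. The constant and function names are hypothetical and are not taken from the plugin or from cuDF.

```python
# Assumption: sum() over Decimal(p, s) yields Decimal(min(p + 10, 38), s).
SUM_PRECISION_BUMP = 10
SPARK_MAX_PRECISION = 38
# Assumption: a 64-bit decimal can represent at most 18 decimal digits.
DECIMAL64_MAX_PRECISION = 18


def sum_result_precision(input_precision: int) -> int:
    """Precision of the sum output type under the promotion assumed above."""
    return min(input_precision + SUM_PRECISION_BUMP, SPARK_MAX_PRECISION)


def sum_fits_in_decimal64(input_precision: int) -> bool:
    """True if the promoted sum type still fits in a 64-bit decimal."""
    return sum_result_precision(input_precision) <= DECIMAL64_MAX_PRECISION


if __name__ == "__main__":
    for p in (7, 8, 9, 18):
        print(p, sum_result_precision(p), sum_fits_in_decimal64(p))
```

Under these assumptions an input precision of 8 promotes to exactly 18 and still fits, while 9 promotes to 19 and does not, which matches the cutoff described in the comments above.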
@sperlingxx you are right, row_number is done. I missed it because it really was a test-only change.
@sameerz at this point we have everything done for this feature except collect_list. Do we want to keep this open for it, or should we split it off into a separate issue and close this?
I am hoping we will see a
collect_list was merged in rapidsai/cudf#7189
The follow-up issue for supporting rank is #1584.
…IDIA#1333) Signed-off-by: spark-rapids automation <[email protected]>
Is your feature request related to a problem? Please describe.
This is still blocked on CUDF adding in support for decimal window operations. The window operations currently supported include: