-
Notifications
You must be signed in to change notification settings - Fork 907
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor rolling.cu to reduce compile time #6512
Refactor rolling.cu to reduce compile time #6512
Conversation
Please update the changelog in order to start CI tests. View the gpuCI docs here. |
Codecov Report
@@ Coverage Diff @@
## branch-0.18 #6512 +/- ##
===============================================
+ Coverage 82.04% 82.06% +0.02%
===============================================
Files 96 96
Lines 16388 16391 +3
===============================================
+ Hits 13446 13452 +6
+ Misses 2942 2939 -3
Continue to review full report at Codecov.
|
Is there any performance impact of supporting only int64_t reps? |
There is the additional cost of casting the |
Aside: please reserve [FEA] and [BUG], etc. for issues, not PRs. |
@mythrocks resolve conflicts. (I tried locally, few tests failed) |
@karthikeyann, #6557 borked this PR. This will be harder to resolve piecemeal than to just nuke and redo. With your permission, I'll |
There are not code review comments on this PR yet, so it won't mess up reviewers if you force push now. If you force push after reviews, the reviews become disassociated from the code (so you can't see the comments in the code changes view). If anyone has pulled your branch locally their local pull will be out of sync after you force push, so that's the second reason to be careful. But I think it's fine here if you can't resolve the conflicts by merging (usually you can). |
@mythrocks should we push this to 0.18? |
Sorry I missed this message, @harrism, @karthikeyann, et al. Yes, we should move this to 0.18. |
50ced8e
to
91ebd8e
Compare
I'm moving this back to
|
6b0a3b0
to
352269d
Compare
I've moved this to
|
rerun tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a minor formatting suggestion.
Requesting changes because auto-merger is relentless :)
Ah, the checks pass, at last. :] @vuule, I've also addressed the |
Closes #6472.
rolling.cu
is taking inordinately long to compile, slowing down thelibcudf
build. The following changes were made to mitigate this:grouped_rolling_window()
andgrouped_time_based_rolling_window()
togrouped_rolling.cu
. Common functions were moved torolling_detail.cuh
.time_based_grouped_rolling_window()
.grouped_*_rolling_window()
functions used to pass around fancy iterators, causing massive template instantiations. This has been changed to materialize the window offsets as separate columns, and use those with existingrolling_window()
functions to produce the final result.These changes have been tested by running a window function test from SparkSQL, over a 2.4GB ORC file with 155M records (1.5M groups of about 97 records each on average):
nsys
profile seems to indicate that the total time spent in thegpu_rolling
kernel has reduced. This is still being examined, to confirm.)rolling.cu
andgrouped_rolling.cu
in parallel now takes 60s as opposed to about 300s before.