-
Notifications
You must be signed in to change notification settings - Fork 902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix result column types for empty inputs to rolling window #8274
Fix result column types for empty inputs to rolling window #8274
Conversation
I've branched this off of |
9be3c1d
to
c4645f4
Compare
Rebased for the latest changes on I have run into an odd thing. Perhaps someone sharper than myself might spot it. |
Ok, found it. It's the struct dispatch_source {
#pragma nv_exec_check_disable
template <typename Element, typename F, typename... Ts>
CUDA_HOST_DEVICE_CALLABLE decltype(auto) operator()(aggregation::Kind k,
F&& f,
Ts&&... args) const noexcept
{
return aggregation_dispatcher(
k, dispatch_aggregation<Element>{}, std::forward<F>(f), std::forward<Ts>(args)...);
}
}; Wouldn't this mean that we can't have |
c4645f4
to
889b670
Compare
Based on the feedback from the team, I have changed |
889b670
to
00002f7
Compare
00002f7
to
07ffa1a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only suggestion I have is that it might make sense to make this function name a little more rolling-specific:
cudf::detail::empty_output(input, agg);
@gpucibot merge |
Rerun tests. |
1 similar comment
Rerun tests. |
It seems to be building, albeit slowly. :/ |
Failed multiple times 😢 |
Again, fail with error "cpu4 is offline". |
@ttnghia can you please let us see the failed output of CI next time so we can engage devops? If you always re-run t.e.s.t.s. then we can't see the output. |
Sure, below is some output from CI tests that failed:
|
I observed that many failed tests are due to time out: the CI machines are overloaded by too many tests and randomly stop responding/very slowly respond to new test requests. So, the classic solution to this classic problem is still applicable here: throw more machines into it please! |
Since code freeze is about to start and this is the only PR left in branch 21.06, I'm going to trigger rerun tests again anyway. |
The 2 failures now look more legit: |
|
Full error message:
|
Please pardon the delay. I've now added special handling for UDFs. Thanks, @shwina, @vyasr. Could I please bother you guys to take a look? Note: I'm not currently honouring the return-type for the UDF, as reported by Numba. I'm returning |
I'll open a new issue to discuss future steps on that. |
@mythrocks I made a minor suggestion to clarify your comment, but otherwise this looks good to me. Your fix looks like it should address the immediate problem. |
Codecov Report
@@ Coverage Diff @@
## branch-21.06 #8274 +/- ##
===============================================
Coverage ? 82.83%
===============================================
Files ? 109
Lines ? 17896
Branches ? 0
===============================================
Hits ? 14824
Misses ? 3072
Partials ? 0 Continue to review full report at Codecov.
|
Argh. Finally! |
Thanks for the reviews, all! |
Fixes the rolling-window part of #7611.
All the rolling window functions return empty results when the input aggregation column is empty, just as they should. But the column types are incorrectly set to match the input type. While this is alright for
[MIN(), MAX(), LEAD(), LAG()]
, it is incorrect for some aggregations:This mapping is congruent with
cudf::target_type_t
from<cudf/detail/aggregation/aggregation.hpp>
.This commit corrects the type of the output column that results from an empty input. It adds test for all the combinations listed above.
Note: This is dependent on #8158, and should be merged after that is committed.