Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor of rolling_window implementation. #8158

Merged
merged 27 commits into from
May 24, 2021

Conversation

nvdbaranec
Copy link
Contributor

@nvdbaranec nvdbaranec commented May 4, 2021

This is an attempt to significantly reduce the complexity of the logic of the SFINAE and various functors/functions inside of rolling_detail.cuh. There are 2 major components:

  • It introduces the idea of device "rolling operators". These operators are essentially just the implementations of what were formerly the process_rolling_window() functtions. However, they provide they key mechanism for removing the complex SFINAE out of the core logic. They do this by providing their own logic that can throw for invalid aggregation/type pairs at construction time, internally.

  • It refactors the type and aggregation-dispatched functors to use the collector/finalize paradigm used by groupby. Specifically, the rolling operation is broken down into three parts. 1.) Preprocess incoming aggregation/type pairs, potentially transforming them into different operations. 2.) Perform the rolling window operation on the transformed inputs. 3.) Postprocess the output from the rolling rolling window operation to obtain the final result.

Combined, these two changes dramatically reduce the amount of dispatch and gpu rolling implementation code one has to read through.

The implementation of the collect list rolling operation has been moved into rolling_collect_list.cuh

There are a couple of other things worth mentioning:

  • Each device rolling operator implements an is_supported() constexpr function which are stripped down, type-specific versions of the old is_rolling_supported() global function. It might be possible to eliminate this with further fundamental type traits. Looking for opinions here.

  • is_rolling_supported() has been removed from the code, however the various tests relied on it pretty heavily. So for now I just transplanted it into the test code in a common place. It's definitely not an ideal solution, but maybe ok for now.

  • It might be worth moving the device rolling operators into their own module to further shrink rolling_detail.cuh. Also looking for opinions here.

@nvdbaranec nvdbaranec added libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels May 4, 2021
@nvdbaranec nvdbaranec requested review from a team as code owners May 4, 2021 19:58
@github-actions github-actions bot added the Python Affects Python cuDF API. label May 4, 2021
@nvdbaranec nvdbaranec removed the Python Affects Python cuDF API. label May 4, 2021
Copy link
Contributor

@mythrocks mythrocks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've gotten my head around this change now. Sorry, it took a little time.

I had a couple of nitpicks, around DeviceRolling*::is_supported() and if constexpr.
I haven't fully grokked where the dictionary specific code in cudf::detail::rolling_window() might be moved. I'm looking forward to reviewing that change.

@nvdbaranec
Copy link
Contributor Author

rerun tests

…ded a note about removing

some code once is_valid_aggregation<> gets cleaned up a bit.
@nvdbaranec
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 691dd11 into rapidsai:branch-21.06 May 24, 2021
rapids-bot bot pushed a commit that referenced this pull request May 28, 2021
Fixes the rolling-window part of #7611.

All the rolling window functions return empty results when the input aggregation column is empty, just as they should. But the column types are incorrectly set to match the input type. While this is alright for `[MIN(), MAX(), LEAD(), LAG()]`, it is incorrect for some aggregations:
Aggregation   |     Input Types      |           Output Type             |
--------------|----------------------|-----------------------------------|
COUNT_VALID   | All types            | INT32                             |
COUNT_ALL     | All types            | INT32                             |
ROW_NUMBER    | All types            | INT32                             |
SUM           | Numerics (e.g. INT8) | 64-bit promoted type (e.g. INT64) |
SUM           | Chrono               | Same as input type                |
SUM           | All else             | Unsupported                       |
MEAN          | Numerics             | FLOAT64                           |
MEAN          | Chrono               | FLOAT64                           |
MEAN          | All else             | Unsupported                       |
COLLECT_LIST  | All types T          | LIST with child of type T         |

This mapping is congruent with `cudf::target_type_t` from `<cudf/detail/aggregation/aggregation.hpp>`.

This commit corrects the type of the output column that results from an empty input. It adds test for all the combinations listed above.

Note: This is dependent on #8158, and should be merged after that is committed.

Authors:
  - MithunR (https://github.com/mythrocks)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - https://github.com/nvdbaranec
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #8274
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants