Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Add libcudf any() and all() reductions #1874

Closed
harrism opened this issue May 29, 2019 · 8 comments · Fixed by #3094
Closed

[FEA] Add libcudf any() and all() reductions #1874

harrism opened this issue May 29, 2019 · 8 comments · Fixed by #3094
Assignees
Labels
feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.

Comments

@harrism
Copy link
Member

harrism commented May 29, 2019

Is your feature request related to a problem? Please describe.

Currently cuDF Python implements any() and all() for a column by calling libcudf's min() and max() (see NVIDIA/thrust#1863). libcudf should provide any() and all() instead. They can be made faster and also give us better control over semantics.

Describe the solution you'd like
It would be faster to provide an optimized any() and all() that first check that the values are non-zero to get a boolean per element and then use bitwise reductions in CUDA for a faster reduction.

Describe alternatives you've considered
It currently works, so this is just a potential optimization.

@harrism harrism added feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. labels May 29, 2019
@harrism
Copy link
Member Author

harrism commented May 29, 2019

@kovaltan FYI in case you have thoughts on this.

@kovaltan
Copy link
Contributor

I agree to have any and all function in cudf for better control over semantics.
But I suspect about the potential optimization.

It would be faster to provide an optimized any() and all() that first check that the values are non-zero to get a boolean per element and then use bitwise reductions in CUDA for a faster reduction.

The current implementation of reduction has already output dtype option.
For non boolean column, if we set output_dtype = bool8 and do reduction min (for all), or max (for any),
the reducntion kernel cast a element into cudf::bool and then do reduction for rows.
The current implementation uses 1 byte atomicMin for cudf::bool reduction, there would be a room for optimization.
But in future, I'm thinking to re-implement reduction using null supported iterator given by PR NVIDIA/thrust#1833, then the implementation will use cub and it would be well optimised.

Btw, I've found a bug about all and any of integer column, and fixed it, I will file it later.

@harrism
Copy link
Member Author

harrism commented May 30, 2019

In my experience even CUB doesn't have an optimized 1-bit-per-boolean reduction (i.e. using __popc() or __syncthreads_and()/__syncthreads_or().

@rgsl888prabhu rgsl888prabhu self-assigned this Oct 7, 2019
@rgsl888prabhu
Copy link
Contributor

rgsl888prabhu commented Oct 14, 2019

In my experience even CUB doesn't have an optimized 1-bit-per-boolean reduction (i.e. using __popc() or __syncthreads_and()/__syncthreads_or().

@harrism Why not use thrust::all_of and thrust::any_of ?

@jrhemstad
Copy link
Contributor

In my experience even CUB doesn't have an optimized 1-bit-per-boolean reduction (i.e. using __popc() or __syncthreads_and()/__syncthreads_or().

@harrism Why not use thrust::all_of and thrust::any_of ?

thrust::all_of/any_of are slow and should not be used. See: https://github.com/thrust/thrust/issues/1016

@karthikeyann
Copy link
Contributor

faster early out for any_of and all_of
https://github.com/thrust/thrust/issues/1016#issuecomment-542867313

@karthikeyann karthikeyann reopened this Oct 16, 2019
@karthikeyann karthikeyann self-assigned this Oct 16, 2019
@jrhemstad
Copy link
Contributor

faster early out for any_of and all_of
thrust/thrust#1016 (comment)

@karthikeyann Your results in that issue are fishy/impossible.

@karthikeyann
Copy link
Contributor

Yes. looked into it. it had a bug. updating it now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants