Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Define and implement new stream compaction APIs copy_if, drop_nulls, apply_boolean_mask, drop_duplicate and unique_count. #3303

Merged
merged 49 commits into from
Nov 21, 2019

Conversation

rgsl888prabhu
Copy link
Contributor

@rgsl888prabhu rgsl888prabhu commented Nov 5, 2019

Define and implement new stream compaction APIs copy_if, drop_nulls, apply_boolean_mask, drop_duplicate and unique_count.

NOTE: All the APIs supports string

close #2948
close #3415

@rgsl888prabhu rgsl888prabhu requested review from a team as code owners November 5, 2019 21:05
@codecov
Copy link

codecov bot commented Nov 5, 2019

Codecov Report

Merging #3303 into branch-0.11 will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##           branch-0.11    #3303   +/-   ##
============================================
  Coverage        87.36%   87.36%           
============================================
  Files               49       49           
  Lines             9295     9295           
============================================
  Hits              8121     8121           
  Misses            1174     1174
Impacted Files Coverage Δ
python/cudf/cudf/core/column/column.py 87.06% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 55596e3...5e64ad3. Read the comment docs.

@rgsl888prabhu rgsl888prabhu changed the title [WIP] Define and implement new stream_compaction APIs copy_if and drop_null [REVIEW] Define and implement new stream_compaction APIs copy_if and drop_null Nov 5, 2019
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, except for the mystery of the missing apply_boolean_mask!

cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/table/table_view.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/wrappers/bool.hpp Show resolved Hide resolved
cpp/include/cudf/wrappers/timestamps.hpp Outdated Show resolved Hide resolved
cpp/src/stream_compaction/drop_nulls.cu Outdated Show resolved Hide resolved
cpp/src/stream_compaction/drop_nulls.cu Outdated Show resolved Hide resolved
@rgsl888prabhu
Copy link
Contributor Author

Looks good, except for the mystery of the missing apply_boolean_mask!

@harrism Ouch, I thought it had different kernel and was planning a PR along with drop_duplicate. I will update this PR with that one as well.

@harrism
Copy link
Member

harrism commented Nov 6, 2019

SHouldn't be too hard -- it's a wrapper around copy_if like drop_nulls is.

cpp/include/cudf/detail/sorting.hpp Outdated Show resolved Hide resolved
*/
std::unique_ptr<experimental::table>
drop_nulls(table_view const& input,
table_view const& keys,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Jake asked you to changed this to a vector of indices, and you agreed, but the change hasn't been made and the conversation is resolved. Unresolving to confirm.

cpp/include/cudf/detail/utilities/cuda.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/utilities/cuda.cuh Outdated Show resolved Hide resolved
cpp/src/stream_compaction/drop_duplicates.cu Outdated Show resolved Hide resolved
cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/copy_if.cuh Show resolved Hide resolved
cpp/include/cudf/detail/gather.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/table/table_device_view.cuh Outdated Show resolved Hide resolved
@harrism
Copy link
Member

harrism commented Nov 20, 2019

rerun tests

Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work, Ram!

@rgsl888prabhu
Copy link
Contributor Author

@jrhemstad

cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/copy_if.cuh Outdated Show resolved Hide resolved
@rgsl888prabhu rgsl888prabhu added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Nov 21, 2019
@harrism harrism merged commit bcb7bf1 into rapidsai:branch-0.11 Nov 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Add strings support to cudf::gather [FEA] Port stream_compaction to cudf::column types
5 participants