Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor cudf::contains #10997

Merged
merged 346 commits into from
May 31, 2022
Merged

Conversation

ttnghia
Copy link
Contributor

@ttnghia ttnghia commented May 27, 2022

This is just a simple refactor to cudf::contains:

  • Remove cudf/structs/detail/contains.hpp and its corrresponding source file src/structs/search/contains.cu, moving its (modified) implementation into src/search/contains_nested.cu.
  • Adopt experimental::row::equality::two_table_comparator for struct equality comparison.
  • Add const qualifier for the operator() functions.
  • Rename some variables, and reorganize code to make it cleaner.

No new feature is added in this PR, just modifying the existing functions and moving things around.

This PR is extracted from the bigger PR for easier review. The original PR is #10656 for fully supporting nested type in cudf::contains. As such, this blocks it.

Improve doc of decompose structs to account for lists
- avoid copying shared_ptrs for linked_col
Fixes the problem of having multiple copies of children
@ttnghia ttnghia removed the improvement Improvement / enhancement to an existing function label May 27, 2022
@ttnghia ttnghia added improvement Improvement / enhancement to an existing function and removed feature request New feature or request labels May 27, 2022
cpp/src/search/contains.cu Show resolved Hide resolved
cpp/src/search/contains.cu Outdated Show resolved Hide resolved
cpp/src/search/contains_nested.cu Outdated Show resolved Hide resolved
@codecov

This comment was marked as off-topic.

@ttnghia
Copy link
Contributor Author

ttnghia commented May 29, 2022

Rerun tests.

cpp/include/cudf/detail/search.hpp Outdated Show resolved Hide resolved
cpp/src/search/contains.cu Show resolved Hide resolved
cpp/src/search/search_ordered.cu Show resolved Hide resolved
Copy link
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMake changes LGTM

@ttnghia ttnghia requested a review from PointKernel May 31, 2022 15:35
@ttnghia
Copy link
Contributor Author

ttnghia commented May 31, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 164e28e into rapidsai:branch-22.08 May 31, 2022
@ttnghia ttnghia deleted the refactor_cudfcontains branch May 31, 2022 15:54
rapids-bot bot pushed a commit that referenced this pull request Aug 17, 2022
This extends the `cudf::contains` API to support nested types (lists + structs) with arbitrarily nested levels. As such, `cudf::contains` will work with literally any type of input data.

In addition, this fixes null handling of `cudf::contains` with structs column + struct scalar input when the structs column contains null rows at the top level while the scalar key is valid but all nulls at children levels.

Closes: #8965
Depends on:
 * #10730
 * #10883
 * #10802
 * #10997
 * NVIDIA/cuCollections#172
 * NVIDIA/cuCollections#173
 * #11037
 * #11356

Authors:
  - Nghia Truong (https://github.com/ttnghia)
  - Devavret Makkar (https://github.com/devavret)
  - Bradley Dice (https://github.com/bdice)
  - Karthikeyan (https://github.com/karthikeyann)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Bradley Dice (https://github.com/bdice)
  - Yunsong Wang (https://github.com/PointKernel)

URL: #10656
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Spark Functionality that helps Spark RAPIDS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants