-
Notifications
You must be signed in to change notification settings - Fork 902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor cudf::contains
, renaming and switching parameters role
#10802
Refactor cudf::contains
, renaming and switching parameters role
#10802
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, with a few docs nitpicks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor const nits, but otherwise Java changes lgtm.
This comment was marked as off-topic.
This comment was marked as off-topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For completeness, we should also update the parameter names for the Python bindings for lower and upper bound in this file. Also the second parameter for contains
looks like it should be needles
, not needle
. These bindings are like a C++ header in that they only export declarations, so the parameter names don't actually need to match the definition for the code to function correctly, but it will still be confusing to readers if they don't.
@gpucibot merge |
Compile warning introduced with merge of PR #10802 ``` 10 warnings like this: 12:43:23 $SRC_DIR/cpp/src/search/search.cu(108): warning #177-D: parameter "args" was declared but never referenced 12:43:23 detected during instantiation of function "lambda [](auto &&...)->auto [with <auto-1>=<rmm::exec_policy, const thrust::counting_iterator<cudf::size_type, thrust::use_default, thrust::use_default, thrust::use_default> &, thrust::counting_iterator<cudf::size_type, thrust::use_default, thrust::use_default, thrust::use_default>, const thrust::counting_iterator<cudf::size_type, thrust::use_default, thrust::use_default, thrust::use_default> &, thrust::counting_iterator<cudf::size_type, thrust::use_default, thrust::use_default, thrust::use_default>, cudf::size_type *const &, const cudf::row_lexicographic_comparator<cudf::nullate::DYNAMIC> &>]" 12:43:23 (121): here ``` Line 108 has a lambda refactoring that seems to confuse the compiler. ``` auto const do_search = [find_first](auto&&... args) { if (find_first) { thrust::lower_bound(std::forward<decltype(args)>(args)...); } else { thrust::upper_bound(std::forward<decltype(args)>(args)...); } }; do_search(rmm::exec_policy(stream), count_it, count_it + haystack.num_rows(), count_it, count_it + needles.num_rows(), out_it, comp); ``` The warning is wrong and the compiler generates the correct code so this is likely a compiler bug. This PR fixes the warning by replacing the lambda with an if statement. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Nghia Truong (https://github.com/ttnghia) - Yunsong Wang (https://github.com/PointKernel) URL: #10827
This extends the `cudf::contains` API to support nested types (lists + structs) with arbitrarily nested levels. As such, `cudf::contains` will work with literally any type of input data. In addition, this fixes null handling of `cudf::contains` with structs column + struct scalar input when the structs column contains null rows at the top level while the scalar key is valid but all nulls at children levels. Closes: #8965 Depends on: * #10730 * #10883 * #10802 * #10997 * NVIDIA/cuCollections#172 * NVIDIA/cuCollections#173 * #11037 * #11356 Authors: - Nghia Truong (https://github.com/ttnghia) - Devavret Makkar (https://github.com/devavret) - Bradley Dice (https://github.com/bdice) - Karthikeyan (https://github.com/karthikeyann) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) - Bradley Dice (https://github.com/bdice) - Yunsong Wang (https://github.com/PointKernel) URL: #10656
This PR does the following:
cudf::contains
, changing fromt
/values
,col
/value
, etc... intohaystack
/needle
in a consistent way.haystack
andneedles
parameters of the overloadcudf::contains(column_view, column_view)
, which incorrectly searches forhaystack
insideneedles
.cudf::contains
doxygen.Since the role of parameters is switched, this causes breaking changes.
Closes #10725. In addition, this is also a foundation for more changes in
search.cu
to support nested types in #10656.