Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add column type tests #8505

Merged
merged 20 commits into from
Jun 29, 2021
Merged

Add column type tests #8505

merged 20 commits into from
Jun 29, 2021

Conversation

isVoid
Copy link
Contributor

@isVoid isVoid commented Jun 12, 2021

Addresses column requests for #8357

This PR adds nested type checks for cudf::column.

@github-actions github-actions bot added CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. labels Jun 12, 2021
@isVoid isVoid changed the title Add column and scalar type tests Add column type tests Jun 22, 2021
@isVoid isVoid marked this pull request as ready for review June 22, 2021 20:56
@isVoid isVoid requested review from a team as code owners June 22, 2021 20:56
@isVoid isVoid self-assigned this Jun 22, 2021
@isVoid isVoid added 3 - Ready for Review Ready for review by team feature request New feature or request labels Jun 22, 2021
cpp/src/utilities/type_checks.cpp Outdated Show resolved Hide resolved
cpp/src/utilities/type_checks.cpp Show resolved Hide resolved
Comment on lines 41 to 43
return lhs.num_children() > 0 and rhs.num_children() > 0
? lhs.child(kidx).type() == rhs.child(kidx).type()
: true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would return true if either lhs or rhs were empty but the other was not.

Suggested change
return lhs.num_children() > 0 and rhs.num_children() > 0
? lhs.child(kidx).type() == rhs.child(kidx).type()
: true;
return lhs.num_children() > 0 and rhs.num_children() > 0
? lhs.child(kidx).type() == rhs.child(kidx).type()
: lhs.is_empty() == rhs.is_empty();

I see there is a test for this case, so was that intended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. From arrow:

In [16]: empty = pa.array(pd.Series([], dtype="category"), from_pandas=True)

In [18]: arr = pa.array(pd.Series([1, 1, 2], dtype="category"), from_pandas=True)

In [19]: empty.type == empty.type
Out[19]: True

In [20]: arr.type == empty.type
Out[20]: False

So what you suggested above is correct.

* @return false `T` is not structs-type
*/
template <typename T>
constexpr inline bool is_structs()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the plural "s" here and with is_lists. All the other traits are singular like is_fixed_width or is_chrono.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it probably make more sense align with other namings in traits.hpp.

I got the plural form idea from lists_column_view and structs_column_view and was under the impression they were named that way since.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why notis_list_view

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The below reveals the fact that we use list_view as an alias for LIST type during type-dispatching, which is too detailed IMO.

template <typename T, CUDF_ENABLE_IF(is_list_view<T>())>

while

template <typename T, CUDF_ENABLE_IF(is_list<T>())>

seems just clear enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But LIST is not a C++ type, and is_X() is a type trait. After all, is_list<T>() is just syntactic sugar for std::is_same<T, list_view>, so it's barely needed at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously the code was doing rhs.type().id()==type_id::LIST and rhs.type().id()==type_id::STRUCT so the thought was to make is_list(data_type) and is_struct(data_type) functions. Since the code is now using the type-dispatcher, these are no longer required since we can just do std::is_same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might be able to get rid of is_same in all if we use explicit specialization:

template <>
bool columns_equal_fn::operator()<list_view>(column_view const& lhs, column_view const& rhs)

@isVoid isVoid added the non-breaking Non-breaking change label Jun 23, 2021
@isVoid isVoid requested a review from a team as a code owner June 24, 2021 23:02
@github-actions github-actions bot added the conda label Jun 24, 2021
@codecov
Copy link

codecov bot commented Jun 25, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.08@cefc266). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head 2de2006 differs from pull request most recent head 1777355. Consider uploading reports for the commit 1777355 to get more accurate results
Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.08    #8505   +/-   ##
===============================================
  Coverage                ?   83.01%           
===============================================
  Files                   ?      109           
  Lines                   ?    18225           
  Branches                ?        0           
===============================================
  Hits                    ?    15129           
  Misses                  ?     3096           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cefc266...1777355. Read the comment docs.

@isVoid
Copy link
Contributor Author

isVoid commented Jun 25, 2021

rerun tests

cpp/src/utilities/type_checks.cpp Show resolved Hide resolved
cpp/src/utilities/type_checks.cpp Outdated Show resolved Hide resolved
cpp/src/utilities/type_checks.cpp Outdated Show resolved Hide resolved
@davidwendt
Copy link
Contributor

Just wondering if this function is something that will be used outside of libcudf? Like from Python cudf or Spark Java? the #8357 issue only mentions internal libcudf.

Copy link
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work.

@isVoid isVoid added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Jun 28, 2021
Copy link
Member

@ajschmidt8 ajschmidt8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving ops-codeowner file changes

Copy link
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMake changes LGTM

@isVoid
Copy link
Contributor Author

isVoid commented Jun 29, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 0206fc9 into rapidsai:branch-21.08 Jun 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge CMake CMake build issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants