Add get_element for struct column #8578

Merged on Jun 25, 2021 (6 commits)
Changes from 3 commits

17 changes: 13 additions & 4 deletions cpp/src/copying/get_element.cu
@@ -16,6 +16,7 @@

#include <cudf/column/column_device_view.cuh>
#include <cudf/copying.hpp>
#include <cudf/detail/copy.hpp>
#include <cudf/detail/indexalator.cuh>
#include <cudf/detail/is_element_valid.hpp>
#include <cudf/detail/utilities/cuda.cuh>
@@ -25,6 +26,7 @@
#include <cudf/scalar/scalar_device_view.cuh>
#include <cudf/scalar/scalar_factories.hpp>

#include <memory>
#include <rmm/cuda_stream_view.hpp>

namespace cudf {
@@ -174,11 +176,18 @@ struct get_element_functor {
mr);
}

- template <typename T, typename... Args>
- std::enable_if_t<std::is_same<T, struct_view>::value, std::unique_ptr<scalar>> operator()(
- Args &&...)
+ template <typename T, std::enable_if_t<std::is_same<T, struct_view>::value> *p = nullptr>
+ std::unique_ptr<scalar> operator()(
+ column_view const &input,
+ size_type index,
+ rmm::cuda_stream_view stream,
+ rmm::mr::device_memory_resource *mr = rmm::mr::get_current_device_resource())
{
- CUDF_FAIL("get_element_functor not supported for struct_view");
+ bool valid = is_element_valid_sync(input, index, stream);
+ auto row_contents =
+ std::make_unique<column>(slice(input, index, index + 1), stream, mr)->release();
+ auto scalar_contents = std::make_unique<table>(std::move(row_contents.children));
+ return std::make_unique<struct_scalar>(std::move(*scalar_contents), valid, stream, mr);
Contributor

I think std::move(*scalar_contents) is wrong as the unique_ptr hasn't been informed that it doesn't have ownership anymore. I think you want:

auto scalar_contents = table(std::move(row_contents.children));
std::make_unique<struct_scalar>(std::move(scalar_contents), valid, stream, mr);

Contributor

@mythrocks, Jun 23, 2021

> std::make_unique<struct_scalar>(std::move(scalar_contents), valid, stream, mr);

Not quite. The struct_scalar constructor needs a table&&.

> I think std::move(*scalar_contents) is wrong as the unique_ptr hasn't been informed that it doesn't have ownership anymore.

The unique_ptr retains ownership of the "moved-from" object, which should be safe to destroy when the unique_ptr goes out of scope. (Please correct me if I'm wrong.)

Contributor

I believe you are correct. To clarify, the object (the table) pointed to by the std::unique_ptr<> has had its internal buffers emptied out, but the table held by the pointer still exists, so when it gets deleted, its destructor has nothing left to release.

It is a little bit subtle though. @robertmaynard's suggestion makes the sequence of operations much clearer.

As a side note, it's a little surprising the PR that added these move constructors didn't actually have any tests for them.
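
The ownership argument in this thread can be reproduced outside cudf. The following is a minimal sketch, not the library's code: `Table` and `StructScalar` are hypothetical stand-ins for cudf::table and cudf::struct_scalar, and `move_out_of_pointee` is an illustrative helper. It relies only on the standard guarantee that a std::vector used as the source of a move construction is left empty.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for cudf::table: owns a vector of "columns".
struct Table {
  std::vector<std::vector<int>> columns;
};

// Hypothetical stand-in for cudf::struct_scalar: takes its data by rvalue reference.
struct StructScalar {
  explicit StructScalar(Table&& data) : data_(std::move(data)) {}
  Table data_;
};

struct MoveResult {
  bool pointer_still_owns;     // the unique_ptr is non-null after the move
  std::size_t source_columns;  // columns left in the moved-from pointee
  std::size_t scalar_columns;  // columns now held by the scalar
};

// Moves the pointee out of a unique_ptr without releasing the pointer itself.
inline MoveResult move_out_of_pointee() {
  auto owned = std::make_unique<Table>();
  owned->columns.push_back(std::vector<int>{1, 2});
  owned->columns.push_back(std::vector<int>{3});

  // Only the contents move; `owned` keeps owning a (now empty) Table object.
  StructScalar scalar(std::move(*owned));
  assert(owned != nullptr);

  return {owned != nullptr, owned->columns.size(), scalar.data_.columns.size()};
  // On return, ~unique_ptr destroys the empty Table exactly once.
}
```

Under these assumptions the moved-from pointee reports zero columns while the scalar holds both, which matches the reasoning above: the destruction is safe, but the intent is easy to misread.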

Contributor

@mythrocks, Jun 23, 2021

Ah, I misread @robertmaynard's suggestion: scalar_contents is a table, in his case.

I agree. @robertmaynard's way reads better.
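
For comparison, here is a rough sketch of the value-based form that the thread settles on. Again, `Table`, `StructScalar`, and `build_scalar_from_value` are hypothetical stand-ins rather than cudf types. The static_asserts illustrate the point that a constructor taking `Table&&` rejects a named lvalue, so the local must be passed through std::move.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for cudf::table, constructible from a vector of children.
struct Table {
  Table() = default;
  explicit Table(std::vector<std::vector<int>>&& cols) : columns(std::move(cols)) {}
  std::vector<std::vector<int>> columns;
};

// Hypothetical stand-in for cudf::struct_scalar: accepts only an rvalue Table.
struct StructScalar {
  explicit StructScalar(Table&& data) : data_(std::move(data)) {}
  Table data_;
};

// A Table&& parameter cannot bind to an lvalue, so std::move is required at the call site.
static_assert(!std::is_constructible<StructScalar, Table&>::value, "lvalue Table is rejected");
static_assert(std::is_constructible<StructScalar, Table&&>::value, "rvalue Table is accepted");

// Builds the table as a local value and moves it into the scalar; no heap-allocated Table.
inline std::size_t build_scalar_from_value() {
  std::vector<std::vector<int>> children;
  children.push_back(std::vector<int>{10});
  children.push_back(std::vector<int>{20, 30});

  Table scalar_contents(std::move(children));
  StructScalar scalar(std::move(scalar_contents));
  return scalar.data_.columns.size();
}
```

The design benefit in this sketch is that the only object being moved from is a named local, so there is no separate owner whose state a reader has to reason about.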

Contributor Author

Thanks for the comments. Moving to @robertmaynard's suggestion.

From a lower-level perspective, the previous line

std::make_unique<struct_scalar>(std::move(*scalar_contents), valid, stream, mr);

moves the contents of the table pointed to by the unique_ptr into the struct_scalar. After this step, the unique_ptr still points to a table object, but as a result of the default move constructor that object now contains 0 columns and is effectively empty. It is safe for the unique_ptr to deallocate this object because the table object is still alive, and so is its internal _columns field; it is just empty.

Agreed, it's subtle compared to the new version.

}
};
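
The struct_view overload in this hunk is selected through std::enable_if_t on a defaulted pointer template parameter instead of on the return type. A small stand-alone sketch of that dispatch pattern follows; `struct_tag`, `describe_functor`, and the two helper functions are hypothetical names introduced for illustration and are not part of cudf.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct struct_tag {};  // hypothetical stand-in for cudf::struct_view

struct describe_functor {
  // Participates in overload resolution only when T is struct_tag.
  template <typename T, std::enable_if_t<std::is_same<T, struct_tag>::value>* = nullptr>
  std::string operator()(int index) const {
    return "struct element " + std::to_string(index);
  }

  // Participates in overload resolution for every other T.
  template <typename T, std::enable_if_t<!std::is_same<T, struct_tag>::value>* = nullptr>
  std::string operator()(int index) const {
    return "plain element " + std::to_string(index);
  }
};

// Explicitly instantiate the call operator, as a type dispatcher would.
inline std::string describe_struct(int index) {
  describe_functor f;
  return f.operator()<struct_tag>(index);
}

inline std::string describe_int(int index) {
  describe_functor f;
  return f.operator()<int>(index);
}
```

Keeping the constraint in the template parameter list leaves the return type and the argument list readable, which is presumably why the concrete parameters (input, index, stream, mr) can be spelled out in the new overload.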

131 changes: 131 additions & 0 deletions cpp/tests/copying/get_value_tests.cpp
@@ -19,6 +19,7 @@
#include <cudf/copying.hpp>
#include <cudf/detail/iterator.cuh>
#include <cudf/dictionary/dictionary_factories.hpp>
#include <cudf/dictionary/update_keys.hpp>
#include <cudf/scalar/scalar.hpp>
#include <cudf/types.hpp>
#include <cudf/utilities/type_dispatcher.hpp>
@@ -27,6 +28,7 @@
#include <cudf_test/column_wrapper.hpp>
#include <cudf_test/cudf_gtest.hpp>
#include <cudf_test/iterator_utilities.hpp>
#include <cudf_test/table_utilities.hpp>
#include <cudf_test/type_list_utilities.hpp>
#include <cudf_test/type_lists.hpp>

@@ -792,5 +794,134 @@ TYPED_TEST(ListGetStructValueTest, NestedGetNull)
CUDF_TEST_EXPECT_COLUMNS_EQUIVALENT(*expected_data, typed_s->view());
}

struct StructGetValueTest : public BaseFixture {
};
template <typename T>
struct StructGetValueTestTyped : public BaseFixture {
};

TYPED_TEST_CASE(StructGetValueTestTyped, FixedWidthTypes);

TYPED_TEST(StructGetValueTestTyped, mixed_types_valid)
{
using LCW = lists_column_wrapper<TypeParam, int32_t>;

// col fields
fixed_width_column_wrapper<TypeParam> f1{1, 2, 3};
strings_column_wrapper f2{"aa", "bbb", "c"};
dictionary_column_wrapper<TypeParam, uint32_t> f3{42, 42, 24};
LCW f4{LCW{8, 8, 8}, LCW{9, 9}, LCW{10}};

structs_column_wrapper col{f1, f2, f3, f4};

size_type index = 2;
auto s = get_element(col, index);
auto typed_s = static_cast<struct_scalar const *>(s.get());

// expect fields
fixed_width_column_wrapper<TypeParam> ef1{3};
strings_column_wrapper ef2{"c"};
dictionary_column_wrapper<int32_t, TypeParam> ef3{24};
LCW ef4{LCW{10}};

table_view expect_data{{ef1, ef2, ef3, ef4}};

EXPECT_TRUE(typed_s->is_valid());
CUDF_TEST_EXPECT_TABLES_EQUIVALENT(expect_data, typed_s->view());
}

TYPED_TEST(StructGetValueTestTyped, mixed_types_valid_with_nulls)
{
using LCW = lists_column_wrapper<TypeParam, int32_t>;
using validity_mask_t = std::vector<valid_type>;

// col fields
fixed_width_column_wrapper<TypeParam> f1({1, 2, 3}, {true, false, true});
strings_column_wrapper f2({"aa", "bbb", "c"}, {false, false, true});
dictionary_column_wrapper<TypeParam, uint32_t> f3({42, 42, 24},
validity_mask_t{true, true, true}.begin());
LCW f4({LCW{8, 8, 8}, LCW{9, 9}, LCW{10}}, validity_mask_t{false, false, false}.begin());

structs_column_wrapper col{f1, f2, f3, f4};

size_type index = 1;
auto s = get_element(col, index);
auto typed_s = static_cast<struct_scalar const *>(s.get());

// expect fields
fixed_width_column_wrapper<TypeParam> ef1({-1}, {false});
strings_column_wrapper ef2({""}, {false});

dictionary_column_wrapper<TypeParam, uint32_t> x({42}, {true});
dictionary_column_view dict_col(x);
fixed_width_column_wrapper<TypeParam> new_key{24};
auto ef3 = cudf::dictionary::add_keys(dict_col, new_key);

LCW ef4({LCW{10}}, validity_mask_t{false}.begin());

table_view expect_data{{ef1, ef2, *ef3, ef4}};

EXPECT_TRUE(typed_s->is_valid());
CUDF_TEST_EXPECT_TABLES_EQUIVALENT(expect_data, typed_s->view());
}

TYPED_TEST(StructGetValueTestTyped, mixed_types_invalid)
{
using LCW = lists_column_wrapper<TypeParam, int32_t>;
using validity_mask_t = std::vector<valid_type>;

// col fields
fixed_width_column_wrapper<TypeParam> f1{1, 2, 3};
strings_column_wrapper f2{"aa", "bbb", "c"};
dictionary_column_wrapper<TypeParam, uint32_t> f3{42, 42, 24};
LCW f4{LCW{8, 8, 8}, LCW{9, 9}, LCW{10}};

structs_column_wrapper col({f1, f2, f3, f4}, validity_mask_t{false, true, true}.begin());

size_type index = 0;
auto s = get_element(col, index);
auto typed_s = static_cast<struct_scalar const *>(s.get());

EXPECT_FALSE(typed_s->is_valid());

// expect to preserve types along column hierarchy.
// TODO: use `column_types_equal` after GH 8505 merged
EXPECT_EQ(typed_s->view().column(0).type().id(), type_to_id<TypeParam>());
EXPECT_EQ(typed_s->view().column(1).type().id(), type_id::STRING);
EXPECT_EQ(typed_s->view().column(2).type().id(), type_id::DICTIONARY32);
EXPECT_EQ(typed_s->view().column(2).child(1).type().id(), type_to_id<TypeParam>());
EXPECT_EQ(typed_s->view().column(3).type().id(), type_id::LIST);
EXPECT_EQ(typed_s->view().column(3).child(1).type().id(), type_to_id<TypeParam>());
}

TEST_F(StructGetValueTest, multi_level_nested)
{
using LCW = lists_column_wrapper<int32_t, int32_t>;
using validity_mask_t = std::vector<valid_type>;

// col fields
LCW l3({LCW{1, 1, 1}, LCW{2, 2}, LCW{3}}, validity_mask_t{false, true, true}.begin());
structs_column_wrapper l2{l3};
auto l1 = make_lists_column(1,
fixed_width_column_wrapper<offset_type>{0, 3}.release(),
l2.release(),
0,
create_null_mask(1, mask_state::UNALLOCATED));
std::vector<std::unique_ptr<column>> l0_fields;
l0_fields.emplace_back(std::move(l1));
structs_column_wrapper l0(std::move(l0_fields));

size_type index = 0;
auto s = get_element(l0, index);
auto typed_s = static_cast<struct_scalar const *>(s.get());

// Expect fields
column_view cv = column_view(l0);
table_view fields(std::vector<column_view>(cv.child_begin(), cv.child_end()));

EXPECT_TRUE(typed_s->is_valid());
CUDF_TEST_EXPECT_TABLES_EQUIVALENT(fields, typed_s->view());
}

} // namespace test
} // namespace cudf