Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make inclusive scan safe for cases with leading nulls #7432

Merged
merged 9 commits into from
Feb 26, 2021
3 changes: 2 additions & 1 deletion cpp/src/bitmask/null_mask.cu
Original file line number Diff line number Diff line change
Expand Up @@ -135,7 +135,8 @@ void set_null_mask(bitmask_type *bitmask,
{
CUDF_FUNC_RANGE();
CUDF_EXPECTS(begin_bit >= 0, "Invalid range.");
CUDF_EXPECTS(begin_bit < end_bit, "Invalid bit range.");
CUDF_EXPECTS(begin_bit <= end_bit, "Invalid bit range.");
Comment on lines 137 to +138
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like these checks should be either merged or we should have more detailed error messages so that the separate checks are useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like these assertions have a similar issue:

CUDF_EXPECTS(begin_bit >= 0, "Invalid range.");
CUDF_EXPECTS(begin_bit <= end_bit, "Invalid bit range.");

CUDF_EXPECTS(start >= 0, "Invalid range.");
CUDF_EXPECTS(start <= stop, "Invalid bit range.");

I didn't bother to see if there are more, but I feel improving such checks deserves a separate PR.

if (begin_bit == end_bit) return;
if (bitmask != nullptr) {
auto number_of_mask_words =
num_bitmask_words(end_bit) - begin_bit / detail::size_in_bits<bitmask_type>();
Expand Down
8 changes: 4 additions & 4 deletions cpp/tests/bitmask/set_nullmask_tests.cu
Original file line number Diff line number Diff line change
Expand Up @@ -105,8 +105,7 @@ TEST_F(SetBitmaskTest, error_range)
std::vector<size_pair> begin_end_fail{
{-1, size}, // begin>=0
{-2, -1}, // begin>=0
{8, 8}, // begin<end
{9, 8}, // begin<end
{9, 8}, // begin<=end
};
for (auto begin_end : begin_end_fail) {
auto begin = begin_end.first, end = begin_end.second;
Expand All @@ -116,8 +115,9 @@ TEST_F(SetBitmaskTest, error_range)
std::vector<size_pair> begin_end_pass{
{0, size}, // begin>=0
{0, 1}, // begin>=0
{8, 9}, // begin<end
{size - 1, size}, // begin<end
{8, 8}, // begin==end
{8, 9}, // begin<=end
magnatelee marked this conversation as resolved.
Show resolved Hide resolved
{size - 1, size}, // begin<=end
};
for (auto begin_end : begin_end_pass) {
auto begin = begin_end.first, end = begin_end.second;
Expand Down
21 changes: 21 additions & 0 deletions cpp/tests/reductions/scan_tests.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -470,6 +470,27 @@ TYPED_TEST(ScanTest, EmptyColumnskip_nulls)
CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected_col_out2, col_out->view());
}

TYPED_TEST(ScanTest, LeadingNulls)
{
auto const v = cudf::test::make_type_param_vector<TypeParam>({100, 200, 300});
auto const b = std::vector<bool>{0, 1, 1};
cudf::test::fixed_width_column_wrapper<TypeParam> const col_in(v.begin(), v.end(), b.begin());
std::unique_ptr<cudf::column> col_out;

// expected outputs
std::vector<TypeParam> out_v(v.size());
std::vector<bool> out_b(v.size(), 0);

// skipna=false
CUDF_EXPECT_NO_THROW(
col_out =
cudf::scan(col_in, cudf::make_sum_aggregation(), scan_type::INCLUSIVE, null_policy::INCLUDE));
cudf::test::fixed_width_column_wrapper<TypeParam> expected_col_out(
out_v.begin(), out_v.end(), out_b.begin());
CUDF_TEST_EXPECT_COLUMN_PROPERTIES_EQUAL(expected_col_out, col_out->view());
CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected_col_out, col_out->view());
}

template <typename T>
struct FixedPointTestBothReps : public cudf::test::BaseFixture {
};
Expand Down