[REVIEW] Define and implement new stream compaction APIs copy_if, drop_nulls, apply_boolean_mask, drop_duplicate and unique_count. #3303

Merged
merged 49 commits into from
Nov 21, 2019
86b05ec
changes
rgsl888prabhu Oct 31, 2019
470d7a2
Merge branch 'branch-0.11' of https://github.com/rapidsai/cudf into 2…
rgsl888prabhu Oct 31, 2019
0c6ad10
primary changes
rgsl888prabhu Nov 1, 2019
8331791
files are compiling
rgsl888prabhu Nov 4, 2019
f46ae31
changes for test cases
rgsl888prabhu Nov 4, 2019
d055987
merge 0.11
rgsl888prabhu Nov 4, 2019
216b367
changes to filter
rgsl888prabhu Nov 5, 2019
26b1b32
changes to return column
rgsl888prabhu Nov 5, 2019
8e755fb
ahh working
rgsl888prabhu Nov 5, 2019
b743175
All set
rgsl888prabhu Nov 5, 2019
7cfccc1
documentation
rgsl888prabhu Nov 5, 2019
a0d0e61
code changes and test cases
rgsl888prabhu Nov 5, 2019
a72ebe7
CHANGELOG
rgsl888prabhu Nov 5, 2019
ca97b2e
adding apply_boolean_mask
rgsl888prabhu Nov 7, 2019
8a1cea6
Adding apply_boolean_mask with test case
rgsl888prabhu Nov 7, 2019
8dd75d8
documentation
rgsl888prabhu Nov 7, 2019
664afea
review changes
rgsl888prabhu Nov 7, 2019
8c4f556
changes
rgsl888prabhu Nov 7, 2019
eb457ed
Merge branch 'branch-0.11' into 2948_adding_copy_if
harrism Nov 8, 2019
f15905a
unique_count and test cases
rgsl888prabhu Nov 11, 2019
4f710c7
added string test for unique count
rgsl888prabhu Nov 11, 2019
07d48c5
Added drop_duplicate test cases
rgsl888prabhu Nov 11, 2019
f75ad3a
documentation
rgsl888prabhu Nov 11, 2019
fa74c22
merge
rgsl888prabhu Nov 11, 2019
2ae0095
documentation
rgsl888prabhu Nov 11, 2019
3801ddd
cosmetic changes
rgsl888prabhu Nov 12, 2019
a544d09
doc changes
rgsl888prabhu Nov 12, 2019
00a6c6f
review changes including wrapper
rgsl888prabhu Nov 12, 2019
b20edc9
drop_duplicate to use vector of index values as keys
rgsl888prabhu Nov 13, 2019
0ffbf41
Changes apart from scatter in copy_if
rgsl888prabhu Nov 13, 2019
257b07d
documentation
rgsl888prabhu Nov 13, 2019
e076c09
addressed final set of review comments
rgsl888prabhu Nov 14, 2019
4551a99
review changes
rgsl888prabhu Nov 15, 2019
1fad669
code changes to support string in copy_if
rgsl888prabhu Nov 15, 2019
eb0641f
merge with 0.11
rgsl888prabhu Nov 15, 2019
3bb2666
missed changes
rgsl888prabhu Nov 15, 2019
c5d2f9f
merge with 0.11
rgsl888prabhu Nov 18, 2019
e9d3298
review changes
rgsl888prabhu Nov 18, 2019
653799f
specialization for gather
rgsl888prabhu Nov 18, 2019
703221a
removing the factory method
rgsl888prabhu Nov 18, 2019
856794d
review changes
rgsl888prabhu Nov 18, 2019
df0d9aa
review changes
rgsl888prabhu Nov 19, 2019
8a77cfb
review changes
rgsl888prabhu Nov 19, 2019
a1e1259
Merge branch 'branch-0.11' into 2948_adding_copy_if
harrism Nov 20, 2019
f5687b2
string test for drop_duplicates
rgsl888prabhu Nov 20, 2019
4129ed5
adding string test for cudf::gather
rgsl888prabhu Nov 21, 2019
91ef036
review changes
rgsl888prabhu Nov 21, 2019
ae85f57
Merge branch 'branch-0.11' of https://github.com/rapidsai/cudf into 2…
rgsl888prabhu Nov 21, 2019
5e64ad3
review changes
rgsl888prabhu Nov 21, 2019
24 changes: 12 additions & 12 deletions cpp/include/cudf/detail/copy_if.cuh
@@ -110,14 +110,20 @@ __global__ void scatter_kernel(cudf::mutable_column_device_view output_view,
  __shared__ bool temp_valids[has_validity ? block_size+cudf::experimental::detail::warp_size : 1];
  __shared__ T temp_data[block_size];

+ // each warp shares its total valid count to shared memory to ease
+ // computing the total number of valid / non-null elements written out.
+ // note maximum block size is limited to 1024 by this, but that's OK
+ cudf::size_type warp_valid_counts{0};
+ cudf::size_type block_sum = 0;
+
  // Note that since the maximum gridDim.x on all supported GPUs is as big as
  // cudf::size_type, this loop is sufficient to cover our maximum column size
  // regardless of the value of block_size and per_thread.
  for (int i = 0; i < per_thread; i++) {
  bool mask_true = (tid < size) && filter(tid);

+ block_sum = 0;
  // get output location using a scan of the mask result
- cudf::size_type block_sum = 0;
  const cudf::size_type local_index = block_scan_mask<block_size>(mask_true,
  block_sum);

@@ -138,10 +144,6 @@ __global__ void scatter_kernel(cudf::mutable_column_device_view output_view,
  }
  }

- // each warp shares its total valid count to shared memory to ease
- // computing the total number of valid / non-null elements written out.
- // note maximum block size is limited to 1024 by this, but that's OK
- cudf::size_type warp_valid_counts{0};

  __syncthreads(); // wait for shared data and validity mask to be complete

@@ -192,18 +194,16 @@ __global__ void scatter_kernel(cudf::mutable_column_device_view output_view,
  }
  }

- __syncthreads(); // wait for warp_valid_counts to be ready
-
- // Compute total null_count for this block and add it to global count
- cudf::size_type block_valid_count = cudf::experimental::detail::single_lane_block_sum_reduce<block_size, leader_lane>(warp_valid_counts);
- if (threadIdx.x == 0) { // one thread computes and adds to null count
- atomicAdd(output_null_count, block_sum - block_valid_count);
- }
  }

  block_offset += block_sum;
  tid += block_size;
  }
+ // Compute total null_count for this block and add it to global count
+ cudf::size_type block_valid_count = cudf::experimental::detail::single_lane_block_sum_reduce<block_size, leader_lane>(warp_valid_counts);
+ if (threadIdx.x == 0) { // one thread computes and adds to null count
+ atomicAdd(output_null_count, block_sum-block_valid_count);
+ }
  }

  // Dispatch functor which performs the scatter for fixed column types and gather for other
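
For readers following the kernel comments in this diff ("get output location using a scan of the mask result", then "Compute total null_count"), here is a minimal host-side sketch of the same scan-and-scatter idea. It is sequential C++, not the CUDA kernel from this PR: `compact_if` and `CompactResult` are hypothetical names made up for illustration, and the single running counter only stands in for what `block_scan_mask` and `block_sum` do per block.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical result type for this sketch; it is not a cudf type.
struct CompactResult {
  std::vector<int>  data;        // surviving elements, in input order
  std::vector<bool> valid;       // validity flag carried along with each survivor
  std::size_t       null_count;  // survivors whose validity flag is false
};

// Sequential stand-in for the scan-and-scatter pattern: an exclusive prefix
// sum over the mask gives every selected element its output slot.
CompactResult compact_if(const std::vector<int>& data,
                         const std::vector<bool>& valid,
                         const std::function<bool(std::size_t)>& filter) {
  assert(data.size() == valid.size());
  const std::size_t n = data.size();

  // Step 1: evaluate the predicate and exclusive-scan the result.
  std::vector<bool> mask(n);
  std::vector<std::size_t> offset(n);
  std::size_t selected = 0;  // running total, analogous to block_sum
  for (std::size_t i = 0; i < n; ++i) {
    mask[i]   = filter(i);
    offset[i] = selected;    // number of selected elements before i
    if (mask[i]) ++selected;
  }

  // Step 2: scatter selected elements and their validity to the scanned slots.
  CompactResult out{std::vector<int>(selected), std::vector<bool>(selected), 0};
  std::size_t valid_count = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (!mask[i]) continue;
    out.data[offset[i]]  = data[i];
    out.valid[offset[i]] = valid[i];
    if (valid[i]) ++valid_count;
  }

  // Step 3: null count = elements written out minus valid elements written out.
  out.null_count = selected - valid_count;
  return out;
}
```

Calling `compact_if(data, valid, pred)` with a predicate over row indices returns the kept rows in their original order, together with the number of nulls among them. The subtraction in step 3 corresponds to the `block_sum - block_valid_count` expression in the diff, computed here over the whole input rather than per block.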