Fix memory read/write error in concatenate_lists_ignore_null #8978
Reference #8883

Running `cuda-memcheck` on `LISTS_TEST` found a read error in the `cudf::lists::detail::concatenate_lists_ignore_null()` utility. This function uses `thrust::transform` to build the output offset values and executes from 0 to `num_rows+1`, so `idx` reaches `num_rows`. The device lambda also included logic to update a temporary `uint8` vector with a validity value (0 or 1) for each row. Unfortunately, this required reading offset values at `idx` and `idx+1`, which fails when `idx==num_rows` because `idx+1` is out of bounds for the input offsets in that case. The `validities[idx]` statement also fails on write, since `idx==num_rows` is past the end of that vector as well. Finally, the temporary `validities` vector was passed to the `cudf::detail::valid_if` utility to turn it into a bitmask.

Since 2 kernels are used to create the output lists column, the temporary `validities` vector is not required: the `valid_if` utility can take a device predicate and build the bitmask from it instead. The logic for determining validity was therefore moved from the `transform` call to the `valid_if` predicate. This keeps the same number of kernels, removes the need for the temporary buffer, and fixes the out-of-bounds memory access.
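The index ranges involved can be illustrated with a small host-side sketch. This is not the cuDF implementation: plain loops stand in for `thrust::transform` and `cudf::detail::valid_if`, the names `lists_result` and `build_offsets_and_validity` are hypothetical, and the validity rule used here (a row is valid when it is non-empty) is only a placeholder for the real null-handling logic. What it aims to show is that the offsets pass runs over `num_rows+1` indices and reads only `in_offsets[idx]`, while the validity pass runs over `num_rows` indices, so reading `idx+1` stays in bounds.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch (not a cuDF type).
struct lists_result {
  std::vector<int32_t> offsets;  // num_rows + 1 entries
  std::vector<bool> validity;    // num_rows entries
};

// Assumes in_offsets holds num_rows + 1 non-decreasing values.
lists_result build_offsets_and_validity(std::vector<int32_t> const& in_offsets)
{
  lists_result out;
  if (in_offsets.empty()) { return out; }

  std::size_t const num_rows = in_offsets.size() - 1;
  out.offsets.resize(num_rows + 1);
  out.validity.resize(num_rows);

  // Pass 1 (stand-in for the transform call): idx covers [0, num_rows].
  // Only in_offsets[idx] is read, so idx == num_rows is safe.
  for (std::size_t idx = 0; idx <= num_rows; ++idx) {
    out.offsets[idx] = in_offsets[idx] - in_offsets[0];
  }

  // Pass 2 (stand-in for the valid_if predicate): idx covers [0, num_rows).
  // Reading in_offsets[idx + 1] is in bounds because idx < num_rows.
  // Placeholder rule: a row counts as valid when it has at least one element.
  for (std::size_t idx = 0; idx < num_rows; ++idx) {
    out.validity[idx] = (in_offsets[idx + 1] - in_offsets[idx]) > 0;
  }

  return out;
}
```

Calling `build_offsets_and_validity({5, 7, 7, 10})` under these assumptions produces four offsets and three validity entries; folding the second loop into the first would reproduce the kind of out-of-bounds access described above.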