-
Notifications
You must be signed in to change notification settings - Fork 902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix out-of-bounds memory read in orc gpuEncodeOrcColumnData #9196
Fix out-of-bounds memory read in orc gpuEncodeOrcColumnData #9196
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the fix!
My struct PR includes some refactoring of this code, will make sure to merge with this PR correctly.
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #9196 +/- ##
===============================================
Coverage ? 10.77%
===============================================
Files ? 115
Lines ? 19138
Branches ? 0
===============================================
Hits ? 2062
Misses ? 17076
Partials ? 0 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there is a unit test to catch the already fixed bug.
I don't think so, since the part that was read out-of-bounds is never used. |
@gpucibot merge |
Device memory read error found in
gpuEncodeOrcColumnData
when runningORC_TEST
withcompute-sanitizer
.The was in the
cudf::detail::get_mask_offset_word
utility which may need to read multiplebitmask_type
values (4-bytes == 32-bits) to satisfy the begin/end bit parameters. Thesource_end_bit
is intended to be exclusive but the logic inadvertently reads the nextbytemask_type
from the inputsource
null-mask on boundary cases like the one found in the gtest above. Here thesource_begin_bit==480
and thesource_end_bit==512
and becauseword_index(512) > word_index(480)
the next read access is out of bounds. This PR fixed the logic in the utility by ensuring only the inclusive bits are verified to require and extra read fromsource
.The logic in
cudf::io::orc::gpu::gpuEncodeOrcColumnData
that calls this utility also required a fix where it always requested at least 32-bits regardless if it was out of bounds forsource
. This PR fixes the math logic to specify the correct end-bit value.