-
-
Notifications
You must be signed in to change notification settings - Fork 255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bounds-checking debug assertions for array access #1514
Changes from 4 commits
d45dca7
02da0c1
c1201e1
7e5eddf
87d50c8
e323363
4d4cfe4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -189,6 +189,15 @@ impl<'mcx, T: UnboxDatum> Array<'mcx, T> { | |
#[allow(clippy::option_option)] | ||
#[inline] | ||
pub fn get<'arr>(&'arr self, index: usize) -> Option<Option<T::As<'arr>>> { | ||
// Technically this should be covered by the null_slice check, | ||
// but that assumes the null bitmap is well-formed (i.e. equal in | ||
// length to the array), which might be worth double-checking in | ||
// debug builds. | ||
#[cfg(debug_assertions)] | ||
if index >= self.raw.len() { | ||
return None; | ||
}; | ||
|
||
let Some(is_null) = self.null_slice.get(index) else { return None }; | ||
if is_null { | ||
return Some(None); | ||
|
@@ -231,7 +240,18 @@ impl<'mcx, T: UnboxDatum> Array<'mcx, T> { | |
) -> Option<T::As<'arr>> { | ||
match is_null { | ||
true => None, | ||
false => unsafe { self.slide_impl.bring_it_back_now(self, ptr) }, | ||
false => { | ||
// Ensure we are not attempting to dereference a pointer | ||
// outside of the array. | ||
debug_assert!(self.is_within_bounds(ptr)); | ||
// Prevent a datum that begins inside the array but would end | ||
// outside the array from being dereferenced. | ||
debug_assert!(self.is_within_bounds_inclusive( | ||
ptr.wrapping_add(unsafe { self.slide_impl.hop_size(ptr) }) | ||
)); | ||
|
||
unsafe { self.slide_impl.bring_it_back_now(self, ptr) } | ||
} | ||
} | ||
} | ||
|
||
|
@@ -256,6 +276,21 @@ impl<'mcx, T: UnboxDatum> Array<'mcx, T> { | |
ptr.add(offset) | ||
} | ||
} | ||
|
||
/// Returns true if the pointer provided is within the bounds of the array. | ||
/// Primarily intended for use with debug_assert!()s. | ||
/// Note that this will return false for the 1-past-end, which is a useful | ||
/// position for a pointer to be in, but not valid to dereference. | ||
#[inline] | ||
pub(crate) fn is_within_bounds(&self, ptr: *const u8) -> bool { | ||
(self.raw.data_ptr() <= ptr) && (ptr < self.raw.end_ptr()) | ||
} | ||
/// Similar to [`Self::is_within_bounds()`], but also returns true for the | ||
/// 1-past-end position. | ||
#[inline] | ||
pub(crate) fn is_within_bounds_inclusive(&self, ptr: *const u8) -> bool { | ||
(self.raw.data_ptr() <= ptr) && (ptr <= self.raw.end_ptr()) | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Something I am currently thinking through is whether these are resilient in the face of UB, LLVM optimizations, and arithmetic offset wrapping values around the integer addressing space. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For /// "one past the end" pointer for the entire array's bytes
pub(crate) fn end_ptr(&self) -> *const u8 {
let ptr = self.ptr.as_ptr().cast::<u8>();
ptr.wrapping_add(unsafe { crate::varlena::varsize_any(ptr.cast()) })
} Notably, we should not trust There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay! I think we have to trust the implementation of After recaching my memory from the original issue and PR, I now remember: What we really need is to defend against LLVM deciding pointers with the same address are unequal, as that means we can't rely on comparison semantics being coherent:
This means that we need to cast the pointer addresses to usize, and do the comparisons purely on integral values. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Terse version:Casting to Verbose version:Forgive me for intruding, but I'm worried you might be misremembering some small details of the bug you're worried about. Chief amongst them is the fact that casting the pointer addresses to That said, I'm not sure how much of a problem this is for your use-case. There are two reasons for this, one incidental and one fundamental. The incidental one: In every single reproduction of this issue that I've managed to find, the pointers only miscompare when they're dangling. This pull-request appears to only compare non-dangling pointers, so it ought to dodge this specific issue. The fundamental one: The main problem is that LLVM (and GCC, for that matter) think that two pointers can never be equal if they “come from” different allocations. If the pointers you're comparing are both “coming from” the same allocation –as appears to be the case as best as I can tell– then comparison will never misbehave. The solution I propose: Rust has a specific function whose entire purpose is to tell the compiler “please pessimise this as best as you can”; that function is One thing to note: I can't help but notice, due to your title, that this entire discussion happens for some debug assertions. Thus far, this issue has only ever been encountered in release mode. If you manage to reproduce it with debug assertions enabled, that would be important information that should be noted in the issue that tracks it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @workingjubilee …was the above useful, after all? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
} | ||
|
||
#[deny(unsafe_op_in_unsafe_fn)] | ||
|
@@ -718,6 +753,7 @@ where | |
let Self { array, curr, ptr } = self; | ||
let Some(is_null) = array.null_slice.get(*curr) else { return None }; | ||
*curr += 1; | ||
debug_assert!(array.is_within_bounds(*ptr)); | ||
let element = unsafe { array.bring_it_back_now(*ptr, is_null) }; | ||
if !is_null { | ||
// SAFETY: This has to not move for nulls, as they occupy 0 data bytes, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we used a fn from the
.offset
family instead of.wrapping_offset
then it's simply UB and any such check is elided, buthop_size
does not do this inherently...