-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
transmute
crashed with SIGTRAP
#128409
Comments
transmuting an integer to a pointer makes it so that the pointer does not have any provenance, meaning it is not valid to dereference.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=90c384e2cf026c7431e961fc8cd93888 |
On the page you linked it specifically says:
|
@Noratrieb @RossSmyth Thanks for your quick response. But when did that become a UB? The |
as usual, Rusts unsafe semantics have been underspecified for ages, and are slowly getting more clear, which does sadly break code that relied on them being a particular thing before. it's an inevitable consequences of having an underspecified model and tightening that up while keeping semantics that can allow for many optimizations, which are vital to making Rust fast. while the ptr documentation says it's non-normative and part of the strict provenance experiment, this isn't related to strict provenance. strict provenance is nothing other than a style of writing unsafe code (to never cast integers to pointers) and does not imply any language rules. the reason why it's undefined behavior to do what you're doing is because otherwise every read of memory would have to be considered a side effect by the optimizer, as it could do an implicit transmutation which would have to expose the provenance as described in Ralf's blog. |
The transmute is lawful. Rust says that a pointer does indeed have all the bytes of either a u16, u32, u64 (or possibly u128...?), depending on the platform, and all bits of an integer's bitpattern are valid pointer bytes. The pointer deref is not. Rust says that dereferencing a pointer that does not point to within an allocation (including an "allocation" in global memory or on the stack) is not defined behavior, and Rust must somehow decide when a pointer points within an object. @ldm0 Have you also read the
Please consider that we have chosen to accept RFC 3559 as an explanation for why memory is sometimes dereferenceable, and that in days of yore, "undefined behavior" was simply "whatever the language didn't nail down". The refinement into today's "unconstrained behavior" came later. |
...and why, anyways, is it the case that an integer-sized struct would differ from an integer here? Especially considering you applied |
I am fine with it, even if it wasn't mentioned anywhere several months ago(#128409 (comment)) :-/
I am still confused. Is dereference after
|
you should read Ralf's linked blog posts, they should contain some of the reasoning behind it all |
This comment was marked as outdated.
This comment was marked as outdated.
I did a detailed crater analysis in the PR that made this change, and submitted a handful of PRs and issue reports to the affected codebases we could locate. We make changes like this carefully, and we are not ignorant of how Rust is used. We would not completely break backtrace-rs, because it is part of the standard library. Such a PR would not merge. Function pointers are different from data pointers, but that's a bit besides the point. I'm sorry we didn't find your project, but we take steps to not randomly break a large fraction of the ecosystem with a codegen change like this. |
@ldm0 re: Is dereference after as casting lawful?
There are two reasons this currently works, and should continue to work, especially if you aren't also doing something that I will briefly say is "sus". You and I both know you shouldn't make up random addresses and dereference them, yes? This isn't about whether or not rustc should compile blatantly unsound code, it is "how do I write some code that should work". So, the
Thus this: let x = gen_usize();
let x = x as *mut Whatever;
unsafe { *x } is in fact the gesture that invokes "rustc attempts to make something up that would justify your program". The mechanism for this "rustc attempts to justify your program to itself" is unspecified, but it is known that we will try to assign it a provenance. Somehow. A provenance other than the null provenance. The reason
Reading a random integer will read bytes without provenance, because integers don't have provenance, because assigning provenance to integers disables various optimizations. For instance, if every instance of "5" has a potential provenance, then we can't simply unify instances of calculating or equating the number "5". Thus we would rather keep all these "bitcast" operations as low-overhead and low-cost to programs that are using unsafe code in a sound way. As the current maintainer of the backtrace code: yes, that crate has a lot of work to do. |
Anyway, no action seems necessary here. |
Code sample:
Expected to generate asm like this(for target
aarch64-apple-darwin
):However, the generated asm looks like this:
Godbolt: https://rust.godbolt.org/z/s5PY4TPME
I did some research, and found that Rust emits such LLVM IR for the
transmute
:However,
load (gep ptr null)
is assumed asunreachable
in LLVM'sInstCombine
pass, thusbrk
is generated.The
transmute
was orignally lowered asinttoptr
rather thangetelementptr i8 ptr null..
, which is legal and the result assembly won't crash.I performed a bisection and discovered that the codegen change was introduced by#121282. I suspect that this PR was intended to address integer-to-pointer conversions, but it inadvertently affects the transmute of integer-sized-struct-to-pointer conversions.
As far as I can tell, integer-sized-struct-to-pointer conversions are not UB 🤔 : https://doc.rust-lang.org/nightly/std/intrinsics/fn.transmute.html.
The text was updated successfully, but these errors were encountered: