
What about: zero-sized dangling accesses/inbounds-offsets? #93

Closed
RalfJung opened this issue Feb 23, 2019 · 64 comments
Labels
A-memory Topic: Related to memory accesses C-open-question Category: An open question that we should revisit S-pending-design Status: Resolving this issue requires addressing some open design questions

Comments

@RalfJung
Member

RalfJung commented Feb 23, 2019

Is the following code UB or not?

fn main() {
    let mut b = Box::new((((),), 4));
    let x: *mut ((),) = &mut b.0;
    drop(b);
    unsafe {
        // getelementptr inbounds with offset 0 of a dangling pointer
        let x_inner: *mut () = &mut (*x).0;
        // 0-sized access of a dangling pointer
        let _val = *x;
    }
}

On the one hand, x is dangling. On the other hand, doing the same thing with let x: *mut ((),) = 1usize as *mut ((),) would definitely be allowed. Does it make sense to treat dangling pointers as "more dangerous" than integer pointers?

AFAIK the actual accesses do not get translated to LLVM IR, so we are not constrained by LLVM. But we do emit a getelementptr inbounds, and the rules for that with offset 0 are rather unclear, I would say. @rkruppe do you know anything about this?
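For concreteness, here is a minimal sketch that puts the two cases side by side (a restatement of the question, not a verdict on which of these operations are defined): the pointer that dangles because its allocation was freed, and the pointer conjured from an integer.

fn main() {
    // Case 1: dangling because the backing allocation was freed.
    let mut b = Box::new((((),), 4));
    let dangling: *mut ((),) = &mut b.0;
    drop(b);

    // Case 2: conjured from an integer, never backed by any allocation.
    let from_int = 1usize as *mut ((),);

    unsafe {
        // Zero-offset field projections (the `getelementptr inbounds` case);
        // `addr_of_mut!` is used to avoid creating intermediate references.
        let _p1: *mut () = core::ptr::addr_of_mut!((*dangling).0);
        let _p2: *mut () = core::ptr::addr_of_mut!((*from_int).0);
        // Zero-sized reads.
        let _val1 = *dangling;
        let _val2 = *from_int;
    }
}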


@strega-nil

strega-nil commented Feb 24, 2019

I personally prefer that accesses of a zero-sized type always be valid through raw pointers, and that references only need alignment.
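For illustration, a sketch of that distinction using a hypothetical over-aligned ZST (the contentious operations are left commented out; this only illustrates the proposal, not current semantics):

#[repr(align(4))]
struct AlignedZst; // zero-sized, but with alignment 4

fn demo() {
    let p = 1usize as *mut AlignedZst; // non-null, but misaligned for AlignedZst
    // unsafe { *p };  // under the proposal: always valid through a raw pointer
    // unsafe { &*p }; // under the proposal: invalid, the reference is misaligned
    let _ = p;
}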

@RalfJung
Member Author

We don't distinguish between raw-pointer and reference accesses in terms of alignment for non-zero-sized accesses, so why would we for zero-sized accesses?

@hanna-kruppe

From the LLVM side, it seems rather unambiguous to me that the current GEPi specification does not make an exception for offset 0, and so any GEPi on a dangling pointer results in poison. I don't really see a way to change that without hampering analyses based on GEPi: for sanity, such a rule should apply to runtime offsets as well as compile-time offsets, but then any analysis that wants to draw conclusions from the presence of a GEPi would have to prove that the offset is nonzero at runtime.

One could possibly argue that any pointer is inbounds for ZSTs since it's one past the end of a zero-sized allocation, but I do not believe such a concept exists in LLVM today (and it's unclear to me whether that would be desirable).

How about a different solution? For ZSTs, projections don't do any address calculation anyway, so we could just emit a bitcast to change the pointee type. That shouldn't lose any AA precision and avoids the implications of the GEPi.
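A hypothetical Rust-level picture of that suggestion (the names Wrapper and project_tag are made up for illustration; this is not compiler code): a projection to a zero-sized field never changes the address, so a backend could in principle lower it to a plain change of pointee type instead of a getelementptr inbounds.

#[repr(C)]
struct Wrapper {
    tag: (),    // zero-sized field at offset 0
    value: u32,
}

fn project_tag(p: *mut Wrapper) -> *mut () {
    // The field sits at offset 0, so no address arithmetic is needed:
    // a pointer cast to the field type suffices.
    p as *mut ()
}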

@RalfJung
Member Author

I just realized we recently had a related discussion in rust-lang/rust#54857.

so any GEPi on a dangling pointer results in poison

Just to be clear, does "dangling pointer" include 4usize as *mut _, or just pointers that were actually generated by allocation functions known to LLVM and then deallocated?

Cc @eddyb who argued previously that we can rely on such "zero-sized allocations".

For ZSTs, projections don't do any address calculation anyway, so we could just emit a bitcast to change the pointee type. That shouldn't lose any AA precision and avoids the implications of the GEPi.

This still leaves empty slices, where the offset is only known at run-time.
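A sketch of that remaining case (the helper suffix is made up for illustration): the offset i * size_of::<T>() is only known at run time, so it cannot be replaced by a static change of pointee type the way a zero-offset struct-field projection can.

fn suffix<T>(v: &[T], i: usize) -> &[T] {
    // For i == v.len() (in particular, for an empty slice and i == 0), the
    // resulting base pointer can equal the original, possibly dangling,
    // base pointer of v, yet the offset is not statically zero.
    &v[i..]
}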

@hanna-kruppe

so any GEPi on a dangling pointer results in poison

Just to be clear, does "dangling pointer" include 4usize as *mut _, or just pointers that were actually generated by allocation functions known to LLVM and then deallocated?

Interesting question; I don't know. More precisely, the LangRef says the result is poison when "the base pointer is not an in bounds address of an allocated object", so this too runs into the question of zero-sized allocations. At the same time, I would be wary of spec-lawyering too much in this context.

For ZSTs, projections don't do any address calculation anyway, so we could just emit a bitcast to change the pointee type. That shouldn't lose any AA precision and avoids the implications of the GEPi.

This still leaves empty slices, where the offset is only known at run-time.

Ugh, yeah, what a pain.

@RalfJung
Member Author

I brought this up on the LLVM-dev list and had a bit of an exchange with one developer over there. They concluded that

I'd argue, after all this discussion at least, use non-inbounds if you
do not know you have a valid object (and want to avoid undef and all
what it entails).

@eddyb
Member

eddyb commented Apr 17, 2019

How about a different solution? For ZSTs, projections don't do any address calculation anyway, so we could just emit a bitcast to change the pointee type. That shouldn't lose any AA precision and avoids the implications of the GEPi.

We already do this! It's needed because there are more things with offset 0 than struct-likes:
https://github.com/rust-lang/rust/blob/258e3b3a75a0da006cd492307fc46ef605e774ad/src/librustc_codegen_ssa/mir/place.rs#L111-L114

(ugh, why does that kind of linking not work cross-crate, @github?!)

            // Unions and newtypes only use an offset of 0.
            let llval = if offset.bytes() == 0 {
                self.llval
            } else /*...*/;
            PlaceRef {
                // HACK(eddyb) have to bitcast pointers until LLVM removes pointee types.
                llval: bx.pointercast(llval, bx.cx().type_ptr_to(bx.cx().backend_type(field))),

@RalfJung
Member Author

We can't always do this for all 0-offsets though, like for &v[i] where i == 0.

@eddyb
Member

eddyb commented Apr 20, 2019

@RalfJung Right, unless i doesn't matter (size_of::<typeof v[i]>() == 0).

@RalfJung
Member Author

Sure. However, (&*NonNull::<[i32; 0]>::dangling().as_ptr() as &[i32])[i] will do a zero-GEPi on a dangling pointer in a way that is not statically detectable.

@HeroicKatora

HeroicKatora commented Jul 17, 2019

Right, unless i doesn't matter (size_of::<typeof v[i]>() == 0).

Would that mean it is safe to conjure a slice of a ZST with any length out of thin air? Would it make this code UB-free? It seems that this has at least some impact on privacy, as it creates references to T from no references to T and could bypass the checks a usual constructor would perform for a non-Copy ZST T.

fn arbitrary<'a, T>(length: usize) -> &'a [T] {
    assert!(core::mem::size_of::<T>() == 0);
    unsafe {
        core::slice::from_raw_parts(core::ptr::NonNull::dangling().as_ptr(), length)
    }
}

@hanna-kruppe

Just because something does not immediately cause UB doesn't mean that it is safe in the sense that it composes with other code. A lot of code (even purely safe code, depending on whether we e.g. consider &! to be uninhabited or possible to construct) requires additional invariants to hold for memory safety to be ensured. There are many other examples like this too.

@RalfJung
Member Author

RalfJung commented Jul 17, 2019

@HeroicKatora Beyond what @rkruppe said, see this blog post that explains the difference between the kind of "wrong code" that causes immediate UB, vs the kind of "wrong code" that violates a high-level semantic contract and is therefore guilty of making other code have UB. This even older post is also relevant.

@HeroicKatora

HeroicKatora commented Jul 17, 2019

So yes, it is memory safe in that it does not violate the validity invariant of a [ZST]. But what is slightly interesting is that a zero-sized type cannot expect its safety invariant to be upheld purely by privacy and abstraction, unlike non-zero-sized types, where this is not even memory safe because already creating guaranteed-valid instances involves privacy. That's slightly odd. Or am I mistaken, and there is a way to create valid (not safe) instances of arbitrary types in code that does not exhibit UB?

@RalfJung
Member Author

RalfJung commented Jul 17, 2019

! is a ZST. So if references require the pointee to be valid, using your function with ! (and length > 0) is insta-UB.
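For illustration, a minimal sketch of that misuse, using Infallible as a stable stand-in for ! (it is likewise an uninhabited ZST):

use core::convert::Infallible; // uninhabited zero-sized type, stand-in for `!`

// Passes the size assertion of `arbitrary` above, yet a non-empty slice of an
// uninhabited type claims that values of it exist; whether this is immediate
// UB hinges on whether references/slices must point to valid values.
fn non_empty_never() -> &'static [Infallible] {
    unsafe { core::slice::from_raw_parts(core::ptr::NonNull::dangling().as_ptr(), 1) }
}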

@HeroicKatora

HeroicKatora commented Jul 17, 2019

Ah, good point. I'm asking due to the str-concat crate which we're thinking about expanding to arbitrary slices. In that case, the input would already be a &[T]. The actual code would be slightly different:

fn arbitrary_blowup<'a, T>(slice: &'a [T], length: usize) -> &'a [T] {
    assert!(core::mem::size_of::<T>() == 0);
    assert!(!slice.is_empty());
    unsafe {
        core::slice::from_raw_parts(slice.as_ptr(), length)
    }
}

In the spirit of this code, but maybe part of another issue, is &[!] also uninhabited? The slice of length 0 would not contain any instance of ! and could be inhabited (your comment suggests this but is not explicit).

@RalfJung
Member Author

is &[!] also uninhabited

No. It is inhabited by &[] in safe code, as you indicated.

arbitrary_blowup will indeed never cause insta-UB. It exploits that there are only two possible validity invariants for a ZST, as there is no data the invariant could depend on: either the invariant is trivially satisfied (such as the one of ()) or it is trivially unsatisfied (such as the one of !).

However, as discussed above, arbitrary_blowup is not a safe function as it can be used to copy non-Copy ZST. (Note that ! is Copy!)

@HeroicKatora

HeroicKatora commented Jul 17, 2019

However, as discussed above, arbitrary_blowup is not a safe function as it can be used to copy non-Copy ZST. (Note that ! is Copy!)

Why would Copy be involved? arbitrary_blowup does not create any instances, and &T is Copy. It would also never lead to additional Drop::drop calls, for the same reason. (But for an &'a mut [T] variant I might see a point in it being much less safe, since the references involved are not Copy.)

@RalfJung
Member Author

Let me put it the other way: arbitrary_blowup is legal if you add a T: Copy bound.

There is no such thing as "creating an instance" of a ZST, but increasing the size of a ZST slice from 1 to 2 is effectively "duplicating" the first element -- aka doing a copy.

@HeroicKatora

HeroicKatora commented Jul 17, 2019

Now I'm confused. All references obtained from the slice are indistinguishable. Non-mutable references are Copy. I'm not duplicating elements but only references to elements, since it does not involve a value of the slice type but only a reference to such a slice. cc @joshlf since zerocopy also creates references to non-Copy types from thin air.

@oberien

oberien commented Jul 17, 2019

@HeroicKatora A ZST could have a Drop implementation. By creating new elements out of thin air, you add additional Drop implementations. For example a ZST could be used to safely cleanup a static on drop (for whatever reason). By creating a second instance, that code could become UB.

Edit: Adding a T: Copy bound also ensures that the type doesn't have a Drop impl.
Edit2: See the comment below, this is wrong.

@HeroicKatora

HeroicKatora commented Jul 17, 2019

It doesn't lead to any new Drop::drop calls; dropping the reference to the created slice is a pure no-op. No ownership of any Ts is involved, and drop requires the T to be owned (rather, you must be able to safely guarantee that you could semantically create a value of T, but for the sake of ptr::drop_in_place and owning DST types you need not actually do so). If you were to read one of the Ts from the slice, sure, you create a new instance, but the read itself requires Copy since you cannot move out of the reference.

Edit: sorry, this was a mess at first. I deleted the comment and added a new for anyone following by mail.

@Lokathor
Contributor

Lokathor commented Jul 17, 2019

I also agree that I don't understand the need for T: Copy.

If this were an array then yes you'd need to not make them up out of nowhere, but since this is a shared slice then adding more length to the slice doesn't do anything because when the slice drops it doesn't cause the elements to drop.

Edit: Clarification, I mean in the case of concat for shared slices. Obviously you can't go making up any length that you want for any type that you want, but if you're concatenating slices then the only way you'd get a non-zero-length input for one of the slices being concatenated is if someone gave you an inhabited type or already caused UB themselves.

@RalfJung
Member Author

RalfJung commented Jul 17, 2019

Hm I see, I think you are right. A shared slice is just many shared references and having one more makes no difference. I should try to prove that in lambda-rust one day.

For &mut this would not work, because of mem::swap.

But anyway this is all off-topic for a discussion that is solely about offsetting. I should probably tell GitHub to hide all the posts here...

@HeroicKatora next time you have a new question triggered by discussion inside a thread that is not directly related to the thread's topic, please open a new issue!

@gnzlbg
Contributor

gnzlbg commented Jul 17, 2019 via email

@Lokathor
Contributor

Forked to its own issue: #168

@chorman0773

This comment was marked as resolved.

@RalfJung

This comment was marked as resolved.

@chorman0773
Contributor

chorman0773 commented Mar 24, 2023

Ah yes, I saw the comment by email and replied by email, so I didn't see the edit.

In that case I do agree, but I think it doesn't help to resolve the question here. val.add(1) adds 0 if the pointee is zero-sized.

That's what the "This is what I would like" is there for, along with the 3rd property. I'd like val.byte_add(n) to allow me to infer null_or_dereferenceable(n), to permit the speculation I noted above.
With the third property, val.byte_add(0) would necessarily permit inferring null_or_dereferenceable(0).

@RalfJung
Member Author

If we allow *val for any pointer for ZST, then that means that any pointer is dereferenceable(0), so the implication trivially holds.

@chorman0773
Contributor

chorman0773 commented Mar 24, 2023

This seems like it might conflict with some of the more interesting inferences I'd like to make, though I'm trying to figure out what those inferences might be.

An obvious one would be pruning branches that lead to such an access to a deallocated pointer.

One thing I'm considering is replacing it with a fixed-size load in machine code emission when well-aligned, though I'm currently having a fun time justifying that on a ZST access. Along with actually using a pload instruction.

@chorman0773
Contributor

chorman0773 commented Mar 24, 2023

Hmm... actually, potential idea for why deallocated pointers especially shouldn't work. Sort of a contrived example, but:

use core::sync::atomic::{AtomicU32, Ordering};

pub fn foo(x: &AtomicU32, bytelen: usize) {
    assert!(bytelen <= 4);
    x.fetch_sub(1, Ordering::Acquire);
    let y = unsafe {
        core::slice::from_raw_parts(x as *const AtomicU32 as *const u32 as *const u8, bytelen)
    };
    for b in y { /* ... */ }
}

Being able to turn that into for b in &core::ptr::read(x as *const AtomicU32 as *const u32).to_ne_bytes()[..bytelen] {} would be a nice transformation.

For bytelen == 0, if zero-sized accesses to dangling pointers are valid, that transformation would be illegal, because something could concurrently deallocate the unprotected &AtomicU32.

@digama0

digama0 commented Mar 24, 2023

IIUC, in that example x is not dereferenceable(4); it is at best dereferenceable_on_entry(4), so you can't optimize the expression for y much, because we only know it to be dereferenceable for bytelen bytes, and that only because of the expression itself. So it doesn't seem that unreasonable to me to lose that optimization. I don't see how ZST accesses being UB would help in that scenario, though.

@CAD97

CAD97 commented Mar 24, 2023

That example certainly is extremely contrived, since you're doing mixed atomic and nonatomic reads.

I believe your point here is that if ZST accesses to deallocated memory are UB, then *(x as *const _ as *const ()) is sufficient to justify that the full backing object remains live. If ZST accesses to deallocated memory are defined, then it's not. Importantly, even in the latter case you're asserting that *(x as *const _ as *const u8) justifies the whole object as live, not just the first byte. (But given that Rust has no way to change the size of an allocation without invalidating old provenance, that seems a realistic assertion.)

(I should also note that if you're optimizing on an IR where reading deallocated memory is not immediate UB but rather returns poison or equivalent, that transformation is valid in that IR. However, I'm pretty sure such an IR doesn't model virtualized memory where such a read may cause a segfault, so that's not super relevant.)

Rewriting the example a little to avoid distractions:

pub unsafe fn foo(
    ptr: *const u32,
    len: usize,
) {
    assert!(len <= 4);

    // ptr is dereferenceable here for `u32`
    *ptr;

    // some other code runs
    extern_fn();

    // ptr is dereferenceable here for `()`
    *(ptr as *const ());

    // iterate over the first `len` bytes
    let slice = slice::from_raw_parts(ptr as *const u8, len);
    for &byte in slice { /* ... */ }
}

Can that be transformed to (the moral equivalent of)

pub unsafe fn foo(
    ptr: *const u32,
    len: usize,
) {
    assert!(len <= 4);

    // some other code runs
    extern_fn();

    // read as `u32` (4 bytes)
    let arr = (*ptr).to_ne_bytes();
    // then iterate over the first `len` bytes
    let slice = &arr[..len];
    for &byte in slice { /* ... */ }
}

which unconditionally does a u32 read after extern_fn() has run?

The *(ptr as *const ()) is here to be overly explicit about the properties we're taking as axioms from the initial form of the code. This should presumably not be necessary, as construction of the slice reference would also perform a Retag of the first len bytes of the object, asserting again that they are dereferenceable.

I can illustrate this transformation introducing new UB without relying on the deallocation of ptr:

unsafe fn extern_fn() {
    let ptr: *const u32 = get_from_collusion();
    // pass the address as an integer so the closure can be sent to another thread
    let baddr = (ptr as *const u8 as *mut u8).add(3) as usize;
    std::thread::spawn(move || unsafe { *(baddr as *mut u8) = 0 });
}

Now the unconditional u32 read will introduce a race with that write to the fourth byte of *ptr that didn't exist previously (for len < 4).

(The same thing about using an IR where a racing read results in poison rather than immediate UB applies, but is a bit more relevant in this case, since that might actually model real hardware. IIRC, racing reads do produce poison in LLVM IR, but across the entire scalar, not at a byte-level granularity like would be required here.)

Just in general, introducing any access through &UnsafeCell between synchronization points where there isn't one already is likely to be invalid even if the object is guaranteed to still be there, because another thread could be writing to it between those sync points, making your introduced access a data race.

@chorman0773
Contributor

chorman0773 commented Mar 24, 2023

IIUC, in that example x is not dereferenceable(4); it is at best dereferenceable_on_entry(4), so you can't optimize the expression for y much, because we only know it to be dereferenceable for bytelen bytes, and that only because of the expression itself. So it doesn't seem that unreasonable to me to lose that optimization. I don't see how ZST accesses being UB would help in that scenario, though.

Note that by dereferenceable(n), I'm referring to xlang's pointer attribute, not the LLVM equivalent. At a parameter level, you could call dereferenceable(n) equivalent to dereferenceable_at_entry(n) in LLVM. AFAIK, the current status quo is that we're clear that when a &UnsafeCell is created, including when passing it as a parameter, it must be dereferenceable at that instant.

The optimization in question relies on the potentially-ZST access extending the dereferenceable region, since AFAIK it's impossible to shrink reference provenance, only to invalidate it completely.

I'm actually thinking that my MemoryController struct from my clever-isa-emu might be able to benefit from this, using pointers (Box) rather than references to UnsafeCell.

racing reads do produce poison in LLVM IR, but across the entire scalar, not at a byte-level granularity like would be required here

Such an optimization might occur in MCE, rather than on the IR. xlang would also permit lifting to as_rvalue freeze for the read. Or, in both xlang and LLVM, as type [u8; 4].

@chorman0773
Contributor

(As a general note, I'm writing transformed code in Rust to avoid adding to confusion - assume that the transformations are taking place on an IR like xlang, which, among other things, lifts data race UB to uninit)

@RalfJung
Member Author

I don't think we need a data race; extern_fn could do some accesses with pointers further up the "pointer deriving" tree that invalidate ptr's access to the last byte. Then the first function (in this post) would still be fine, but for len < 4 the 2nd function would be wrong.

@chorman0773
Contributor

chorman0773 commented Mar 27, 2023

I'm fairly certain that shouldn't be a problem with xlang's model (again, assume I'm doing rewrites on xlang, not on Rust source or a MIR equivalent). An example of how it might introduce the Rust-level UB would be useful in assessing whether or not it definitively would. If such a function does exist, doing the transform in MCE is still valid.

@chorman0773
Contributor

(I almost want to say that reads/writes to reachable bytes that aren't reachable for that purpose produce uninit rather than UB, just to be sure.)

@chorman0773
Contributor

chorman0773 commented Mar 27, 2023

As a note, thinking about how this optimizes: On w65 this can nominally save 4 cycles per 2 bytes getting speculated (at the potential cost of 6 cycles for a zero-sized read). The zero-sized case actually saves 4 cycles in the len=1 case, by allowing the removal of a cmp + non-taken branch, which are 3 and 2 cycles apiece, while sacrificing 1 more cycle on the read when aligned to 2 bytes.

Edit: This came up on the Rust Community Server, so I'd like to clarify - I don't think that these cycle savings on w65 are the only performance benefit I can derive from this, but as you can imagine, it's a lot harder to give concrete numbers on a modern CPU even if you do have the optimization implemented, let alone one on the "To implement" list.

@T-Dark0

T-Dark0 commented Mar 27, 2023

Regarding the fact that this came up on the community server: I recently heard arguments that part of the issue is that it's really weird to allow dangling ZST accesses, but not deallocated ZST accesses. While that's true, I think the real issue is a matter of perspective.

The apparent inconsistency is that "it's ok to read ZSTs from dangling pointers, but not if deallocated". However, consider this alternative formulation:

  • for all types, it's not per-se-UB* to read them from a sufficiently large region of allocated memory
  • deallocating memory causes a region of allocated memory to stop existing
  • creating a pointer automatically performs a zero-sized allocation. This is true for all means of creating a pointer, including, but not limited to, as *mut casts, invalid, offset, and addr_of. (Edit: I'd be open to saying offset on a deallocated pointer does not create a new allocation. That would make sense to me, as in a sense you're not creating a pointer, merely changing an existing one.)

Under this model, the rules are simple: Nothing, not even a ZST, can be read from a dangling pointer. Just like nothing, not even a ZST, can be read from a deallocated pointer.
The only bit of "ZST magic" that we keep is that one can never actually read a ZST from a non-deallocated dangling pointer: the very existence of that pointer implies it points to a zero-byte allocation, so it isn't really dangling. I also personally find value in the fact that the magic is moved away from ZSTs and onto pointers instead. Pointers are going to be special no matter what, and this way all types work the same.
If anything, under this model, it would be inconsistent to allow ZSTs to be read from deallocated pointers: they're just types, why would they be special?

This interpretation also seems to be supported by the core::ptr docs: https://doc.rust-lang.org/std/ptr/index.html#safety

However, casting any non-zero integer literal to a pointer is valid for zero-sized accesses, even if some memory happens to exist at that address and gets deallocated. This corresponds to writing your own allocator: allocating zero-sized objects is not very hard.

Emphasis added. My reading of this is that the operation of casting a pointer from a nonzero integer performs a zero-sized allocation, and I think it would be fine to generalize this idea to all pointers, not just pointers from "magic integers".

To be entirely fair, this model would cause NonNull::dangling and ptr::invalid to be slightly unfortunate names. In an ideal world, they'd probably be called NonNull::alloc_zero_sized() and ptr::no_provenance(), I guess? (please no bikeshedding, I don't like these names either). However, even if @chorman0773's optimization does not end up being performed, I think it would be worthwhile to adopt a model where ZSTs are as non-special as possible, and this model avoids the inconsistency between dangling and deallocated.

* Performing such a read can still be UB for other reasons, such as violation of validity invariants. It's just not itself a cause of UB.
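A sketch of how that model would classify the two situations (this only illustrates the proposal made in this comment, not established semantics; model_examples is a made-up name):

fn model_examples() {
    // Conjured pointer: under the proposed model, creating it also creates a
    // zero-sized allocation, so a zero-sized read through it is allowed.
    let p = 0x4usize as *const ();
    let _ = unsafe { *p };

    // Deallocated pointer: its backing allocation is gone and no new one is
    // created, so under the proposed model even a zero-sized read would be UB.
    let b = Box::new(0u32);
    let q = &*b as *const u32 as *const ();
    drop(b);
    // let _ = unsafe { *q }; // rejected under the proposed model
    let _ = q;
}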

@digama0

digama0 commented Mar 28, 2023

creating a pointer automatically performs a zero-sized allocation. This is true for all means of creating a pointer, including, but not limited to, as *mut invalid offset, and addr_of. (Edit: I'd be open to saying offset on a deallocated pointer does not create a new allocation. That would make sense to me, as in a sense you're not creating a pointer, merely changing an existing one)

I don't think this is an option, because it means "creating a pointer" is not a pure operation. In particular, you put invalid on the list but this is currently supposed to be identical to just transmuting an integer to a pointer type, and transmutes structurally cannot have any side effects without breaking all sorts of things. You would need to guard against causing side effects by transmuting an integer when you are just reading a pointer from memory and there is no hint of a transmute happening in the function being compiled.

For the "provenance monotonicity" property to work, you need the possible provenances to have a lattice structure, meaning that there needs to be a bottom provenance which has no more permissions than a deallocated pointer, which means we need a second "none"-like provenance, say zapped, and we may or may not need to give transmuted integers zapped provenance as well. In the end it seems like many of the issues end up coming back anyway but with one additional epicycle in the model.

If anything, under this model, it would be inconsistent to allow ZSTs to be read from deallocated pointers: they're just types, why would they be special?

ZSTs are not special, but accesses are effectively a for loop over all the loaded memory locations - a permissions error on any of the bytes will cause UB - and so a zero-byte access should be an empty loop and hence a no-op. There are no borrow stacks to consult in the first place, as these permissions are stored per byte.
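That framing can be written out directly (a sketch; byte_is_accessible stands in for a per-byte permission check and is not a real API):

// An access of `size` bytes starting at `base` is UB iff some touched byte is
// not accessible; for `size == 0` the range is empty, so nothing is checked.
fn access_would_be_ub(base: usize, size: usize, byte_is_accessible: impl Fn(usize) -> bool) -> bool {
    (base..base + size).any(|addr| !byte_is_accessible(addr))
}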

@JakobDegen JakobDegen added the S-pending-design Status: Resolving this issue requires addressing some open design questions label Jul 25, 2023
@JakobDegen
Contributor

There was an agreement to allow this for offsets; the accesses question was not covered.

bors added a commit to rust-lang-ci/rust that referenced this issue Mar 6, 2024
Use GEP inbounds for ZST and DST field offsets

ZST field offsets have been non-`inbounds` since I made [this old layout change](https://github.com/rust-lang/rust/pull/73453/files#diff-160634de1c336f2cf325ff95b312777326f1ab29fec9b9b21d5ee9aae215ecf5). Before that, they would have been `inbounds` due to using `struct_gep`. Using `inbounds` for ZSTs likely doesn't matter for performance, but I'd like to remove the special case.

DST field offsets have been non-`inbounds` since the alignment-aware DST field offset computation was first [implemented](erikdesjardins@a2557d4#diff-04fd352da30ca186fe0bb71cc81a503d1eb8a02ca17a3769e1b95981cd20964aR1188) in 1.6 (back then `GEPi()` would be used for `inbounds`), but I don't think there was any reason for it.

Split out from rust-lang#121577 / rust-lang#121665.

r? `@oli-obk`

cc `@RalfJung` -- is there some weird situation where field offsets can't be `inbounds`?

Note that it's fine for `inbounds` offsets to be one-past-the-end, so it's okay even if there's a ZST as the last field in the layout:

> The base pointer has an in bounds address of an allocated object, which means that it points into an allocated object, or to its end. [(link)](https://llvm.org/docs/LangRef.html#getelementptr-instruction)

For rust-lang/unsafe-code-guidelines#93, zero-offset GEP is (now) always `inbounds`:

> Note that getelementptr with all-zero indices is always considered to be inbounds, even if the base pointer does not point to an allocated object. [(link)](https://llvm.org/docs/LangRef.html#getelementptr-instruction)
@celinval

celinval commented Nov 6, 2024

Can we close this issue based on #472 or am I missing something?

@RalfJung
Member Author

RalfJung commented Nov 6, 2024

Yes I think this is resolved. :)

@RalfJung RalfJung closed this as completed Nov 6, 2024