-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Can references to uninhabited types ever be valid? #413
Comments
Does this issue also cover |
Yes. While you can have partially initialized locals of type |
Is there a reason why they should be invalid unless |
One motivation is being able to replace the entire body of an |
This has been mentioned elsewhere and I think you agreed to it, but that is not a very compelling motivation since you can just put |
I don't, at least not without writing an unsafe block that would be UB under the OP question (more or less by definition). A variation that may or may not count is My main motivation for considering that case specially is that the borrow checker does allow for types like |
Wouldn't it be almost always trivial to prove that such a function is unreachable anyway? Assuming reachability checks occur post-monomorphisation. |
I feel like we could do an experiment here to figure out if this is actually worth anything? |
If it's unreachable, then you wouldn't even need an |
I'm trying to remember the old discussion. The concern might have been that these are trait impls generated automatically / via proc macros for tons of types. So we can't just add custom code into the body. I think it was @cramertj who raised this wish back then, not sure if they still remember any details though.
If it's a public function in a lib crate, how would that be trivial? |
The perf run for this optimization implemented as a MIR pass indicates no benefit in terms of code size: https://perf.rust-lang.org/compare.html?start=afab3662eb066f05fcdb43c421b72dd19472e752&end=0f4641ea16e650c77a03aa649330702722a5f66f&stat=size%3Alinked_artifact |
I guess the benchmarks don't contain auto-derived code for uninhabited types. Also as a MIR opt this is less effective than it could be, if generic types and their impls are monomorphized for an uninhabited type. |
I have never seen this happen, so this is news to me. What is an example of that? |
I take it back, the borrow checker will reason about partially initialized #[derive(Default)]
struct Foo {
a: String,
...
z: String,
}
let mut x = Foo::default();
drop(x.z);
|| println!("{}, ..., {}", x.a, ..., x.y); Currently, the returned closure captures |
Ah, yes that makes sense --
With per-field closure capture, don't we actually do capture just a single reference and use it for all fields? That was considered as an implementation strategy at some point. But anyway if we do that, we just have to pick a suitable type for the closure capture environment, one that is not uninhabited. Arguably actually using |
I'd expect that a lot of those trait impls will ultimately contain use serde::Serialize;
#[derive(Serialize)]
pub enum Void{} expands to pub enum Void {}
#[doc(hidden)]
#[allow(non_upper_case_globals, unused_attributes, unused_qualifications)]
const _: () =
{
#[allow(unused_extern_crates, clippy :: useless_attribute)]
extern crate serde as _serde;
#[allow(unused_macros)]
macro_rules! try {
($__expr : expr) =>
{
match $__expr
{
_serde :: __private :: Ok(__val) => __val, _serde ::
__private :: Err(__err) =>
{ return _serde :: __private :: Err(__err) ; }
}
}
}
#[automatically_derived]
impl _serde::Serialize for Void {
fn serialize<__S>(&self, __serializer: __S)
-> _serde::__private::Result<__S::Ok, __S::Error> where
__S: _serde::Serializer {
match *self {}
}
}
}; (I tried with builtin derives, and it just generates |
Now consider a struct with many fields, one if which is Void.
|
No, they do.
I moved the transform to codegen where we have monomoprhized MIR. It still has no effect, and in a stage2 build of rustc it only fires one additional time. Many of the times this optimization fires, it transforms this:
into
Hardly anything to celebrate. What we don't have is a highly artificial crate that just derives a bunch of traits on a bunch of types which are all uninhabited. Maybe then this would show up. But that form of argument can justify any optimization so I don't think it is worth considering. I do not think an argument about the value of this optimization holds up. But the try toolchain(s) are up there, so if someone has a codebase where they can demonstrate it helping, please do so. |
Given a large tuple type with an uninhabited field, we currently do generate a whole bunch of MIR / LLVM IR that is all completely pointless, right? I was told that in the Fuchsia codebase, being able to optimize this is relevant. That's all I know, and it seems plausible. You are saying in our benchmarks we have no such types. Fair, but that doesn't really say much about the code in the wild. I'm not trusting our ability to come up with real-world benchmarks and predict future coding patterns enough to conclusively say "this optimization is useless and will also never be useful in the future". (I will also note that this kind of argument hasn't been used in other UB discussions before, despite other forms of UB being a lot more subtle, having non-local effects, and having no real-world evidence of being helpful either... are we going to demand real-world optimization evidence for every single bit of UB, and make everything else defined? I'm fine with "here is an optimization that demonstrates conclusively that his UB is useful" as an argument, but I find the negative version of this a lot less convincing. It's hard to come up with good evidence for the absence of a usecase.) It seems to me to be cheap and easy to do this (have this UB), since there really isn't any reason to allow such references to exist and the extra complexity incurred on the spec is minimal. It's also easy to take back the UB and make it a mere safety invariant, if this ever causes a problem somewhere. It's impossible to go in the other direction, if we discover in the future that we really want this optimization. |
I don't mean to argue that "this optimization is useless and will also never be useful in the future". But I do think that if we are going to claim that some optimization is valuable currently, we should have data to back up that claim. I thought that claim was being made up-thread (by you someone at an all-hands you are echoing). I'll see if I can take a look at Fuchsia later, if I can figure out how to interact with it. |
Sorry, seems like I was misinterpreting your intent.
It's also possible I am misrepresenting what they said, it's been a few years. :/
|
One other reason it might make sense to consider Arguably, fixing this by making more code UB is a bad fix, but the alternative of rejecting the empty match does not seem appealing either. |
Just as a single data point, this behavior doesn't seem that weird to me. I am used to pattern matching that goes looking for exhaustive matches, and using |
… r=compiler-errors Add the `min_exhaustive_patterns` feature gate ## Motivation Pattern-matching on empty types is tricky around unsafe code. For that reason, current stable rust conservatively requires arms for empty types in all but the simplest case. It has long been the intention to allow omitting empty arms when it's safe to do so. The [`exhaustive_patterns`](rust-lang#51085) feature allows the omission of all empty arms, but hasn't been stabilized because that was deemed dangerous around unsafe code. ## Proposal This feature aims to stabilize an uncontroversial subset of exhaustive_patterns. Namely: when `min_exhaustive_patterns` is enabled and the data we're matching on is guaranteed to be valid by rust's operational semantics, then we allow empty arms to be omitted. E.g.: ```rust let x: Result<T, !> = foo(); match x { // ok Ok(y) => ..., } let Ok(y) = x; // ok ``` If the place is not guaranteed to hold valid data (namely ptr dereferences, ref dereferences (conservatively) and union field accesses), then we keep stable behavior i.e. we (usually) require arms for the empty cases. ```rust unsafe { let ptr: *const Result<u32, !> = ...; match *ptr { Ok(x) => { ... } Err(_) => { ... } // still required } } let foo: Result<u32, &!> = ...; match foo { Ok(x) => { ... } Err(&_) => { ... } // still required because of the dereference } unsafe { let ptr: *const ! = ...; match *ptr {} // already allowed on stable } ``` Note that we conservatively consider that a valid reference can point to invalid data, hence we don't allow arms of type `&!` and similar cases to be omitted. This could eventually change depending on [opsem decisions](rust-lang/unsafe-code-guidelines#413). Whenever opsem is undecided on a case, we conservatively keep today's stable behavior. I proposed this behavior in the [`never_patterns`](rust-lang#118155) feature gate but it makes sense on its own and could be stabilized more quickly. The two proposals nicely complement each other. ## Unresolved Questions Part of the question is whether this requires an RFC. I'd argue this doesn't need one since there is no design question beyond the intent to omit unreachable patterns, but I'm aware the problem can be framed in ways that require design (I'm thinking of the [original never patterns proposal](https://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/), which would frame this behavior as "auto-nevering" happening). EDIT: I initially proposed a future-compatibility lint as part of this feature, I don't anymore.
… r=compiler-errors Add the `min_exhaustive_patterns` feature gate ## Motivation Pattern-matching on empty types is tricky around unsafe code. For that reason, current stable rust conservatively requires arms for empty types in all but the simplest case. It has long been the intention to allow omitting empty arms when it's safe to do so. The [`exhaustive_patterns`](rust-lang#51085) feature allows the omission of all empty arms, but hasn't been stabilized because that was deemed dangerous around unsafe code. ## Proposal This feature aims to stabilize an uncontroversial subset of exhaustive_patterns. Namely: when `min_exhaustive_patterns` is enabled and the data we're matching on is guaranteed to be valid by rust's operational semantics, then we allow empty arms to be omitted. E.g.: ```rust let x: Result<T, !> = foo(); match x { // ok Ok(y) => ..., } let Ok(y) = x; // ok ``` If the place is not guaranteed to hold valid data (namely ptr dereferences, ref dereferences (conservatively) and union field accesses), then we keep stable behavior i.e. we (usually) require arms for the empty cases. ```rust unsafe { let ptr: *const Result<u32, !> = ...; match *ptr { Ok(x) => { ... } Err(_) => { ... } // still required } } let foo: Result<u32, &!> = ...; match foo { Ok(x) => { ... } Err(&_) => { ... } // still required because of the dereference } unsafe { let ptr: *const ! = ...; match *ptr {} // already allowed on stable } ``` Note that we conservatively consider that a valid reference can point to invalid data, hence we don't allow arms of type `&!` and similar cases to be omitted. This could eventually change depending on [opsem decisions](rust-lang/unsafe-code-guidelines#413). Whenever opsem is undecided on a case, we conservatively keep today's stable behavior. I proposed this behavior in the [`never_patterns`](rust-lang#118155) feature gate but it makes sense on its own and could be stabilized more quickly. The two proposals nicely complement each other. ## Unresolved Questions Part of the question is whether this requires an RFC. I'd argue this doesn't need one since there is no design question beyond the intent to omit unreachable patterns, but I'm aware the problem can be framed in ways that require design (I'm thinking of the [original never patterns proposal](https://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/), which would frame this behavior as "auto-nevering" happening). EDIT: I initially proposed a future-compatibility lint as part of this feature, I don't anymore.
Rollup merge of rust-lang#118803 - Nadrieril:min-exhaustive-patterns, r=compiler-errors Add the `min_exhaustive_patterns` feature gate ## Motivation Pattern-matching on empty types is tricky around unsafe code. For that reason, current stable rust conservatively requires arms for empty types in all but the simplest case. It has long been the intention to allow omitting empty arms when it's safe to do so. The [`exhaustive_patterns`](rust-lang#51085) feature allows the omission of all empty arms, but hasn't been stabilized because that was deemed dangerous around unsafe code. ## Proposal This feature aims to stabilize an uncontroversial subset of exhaustive_patterns. Namely: when `min_exhaustive_patterns` is enabled and the data we're matching on is guaranteed to be valid by rust's operational semantics, then we allow empty arms to be omitted. E.g.: ```rust let x: Result<T, !> = foo(); match x { // ok Ok(y) => ..., } let Ok(y) = x; // ok ``` If the place is not guaranteed to hold valid data (namely ptr dereferences, ref dereferences (conservatively) and union field accesses), then we keep stable behavior i.e. we (usually) require arms for the empty cases. ```rust unsafe { let ptr: *const Result<u32, !> = ...; match *ptr { Ok(x) => { ... } Err(_) => { ... } // still required } } let foo: Result<u32, &!> = ...; match foo { Ok(x) => { ... } Err(&_) => { ... } // still required because of the dereference } unsafe { let ptr: *const ! = ...; match *ptr {} // already allowed on stable } ``` Note that we conservatively consider that a valid reference can point to invalid data, hence we don't allow arms of type `&!` and similar cases to be omitted. This could eventually change depending on [opsem decisions](rust-lang/unsafe-code-guidelines#413). Whenever opsem is undecided on a case, we conservatively keep today's stable behavior. I proposed this behavior in the [`never_patterns`](rust-lang#118155) feature gate but it makes sense on its own and could be stabilized more quickly. The two proposals nicely complement each other. ## Unresolved Questions Part of the question is whether this requires an RFC. I'd argue this doesn't need one since there is no design question beyond the intent to omit unreachable patterns, but I'm aware the problem can be framed in ways that require design (I'm thinking of the [original never patterns proposal](https://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/), which would frame this behavior as "auto-nevering" happening). EDIT: I initially proposed a future-compatibility lint as part of this feature, I don't anymore.
Currently, with nightly, this program:
runs, but complains about an unreachable match arm. If you remove the supposedy unreachable arm,
it dies with I think this is rather strange, unless both programs have UB. |
As a possible argument in favour of In derive-adhoc I have a large enum whose variants contain "markers" that are sometimes units and sometimes uninhabited: See https://gitlab.torproject.org/Diziet/rust-derive-adhoc/-/blob/main/macros/framework.rs?ref_type=heads#L115 and https://gitlab.torproject.org/Diziet/rust-derive-adhoc/-/blob/main/macros/syntax.rs?ref_type=heads#L67 This allows the reuse of the same enum (source code) for similar value sets that occur in different contexts. Call sites that receive one of these values can call trait methods like this: https://gitlab.torproject.org/Diziet/rust-derive-adhoc/-/blob/main/macros/paste.rs?ref_type=heads#L816 This allows me as the author to statically demonstrate that the situation is impossible, and therefore I don't have to write code to handle it. It would be nice if the compiler could remove as much of this as possible. Ideally, the uninhabited enum variants, and all the code that handles uninhabited values, would completely vanish as soon as possible after monomorphisation. |
Your example is an instance of rust-lang/rust#117288, which the lang team has agreed was something they wanted to fix (the arm is deemed unreachable because the |
Here's another example where disallowing
We hope |
(Maybe bare |
Yeah that should be fixed... I thought @Nadrieril fixed this recently but I guess that was another match lint issue that got fixed?
I wouldn't want to make function calls special this way. I'd prefer making |
Yeah what I fixed is the exhaustiveness code; that example triggers a warning because of divergence tracking instead (i.e. it triggers the unreachable_code lint, and I only fixed the unreachable_pattern lint) |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Do you want full recursive inhabitedness? E.g. should |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
No worries, and thanks for opening that issue with a very clear motivating use-case. :) |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
I realized I missed this in the sequence of off-topic posts about Yes, this is recursive. If we say that references to uninhabited types are invalid, that means None of this requires recursing over a value of |
This discussion presupposes that "Do we have full recursive validity for references" (#412) is answered "no". This issue is only about references to uninhabited types; references to any other type are not relevant here.
So, in general we allow references to point to invalid data. Is the following UB?
Since there is no recursive validity for references, the answer might be "no". However, the reason we could consider making this UB is that all of the arguments in #412 do not apply: to determine whether an
&!
is valid, we do not have to check memory, do any potentially racy reads, or anything like that. We can just answer the question with "no". Recursive validity is tricky in part because references can "bounce" between being valid and invalid as the contents of memory change; but for references to uninhabited types, this concern does not apply.Basically, we can think of
!
having an impossible-to-satisfy alignment requirement, and therefore&!
is invalid since it is never properly aligned. (I don't really want to fold this into the alignment check, but the point is that we already have the validity of&T
depend onT
, namely via alignment -- I see no fundamental reason why we couldn't do the same with inhabitedness.)The text was updated successfully, but these errors were encountered: