Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand is_uninhabited #36449

Merged
merged 9 commits into from
Nov 23, 2016
Merged

Expand is_uninhabited #36449

merged 9 commits into from
Nov 23, 2016

Conversation

canndrew
Copy link
Contributor

@canndrew canndrew commented Sep 13, 2016

This allows code such as this to compile:

let x: ! = ...;
match x {};

let y: (u32, !) = ...;
match y {};

@eddyb You were worried about making this change. Do you have any idea about what could break? Are there any special tests that need to be written for it?

@rust-highfive
Copy link
Collaborator

r? @arielb1

(rust_highfive has picked a reviewer for you, use r? to override)

@eddyb
Copy link
Member

eddyb commented Sep 13, 2016

I was worried because this is a feature that would be insta-stable. cc @rust-lang/lang

@canndrew canndrew changed the title Expand is_uninhabited for ! and tuples Expand is_uninhabited Sep 14, 2016
@canndrew canndrew changed the title Expand is_uninhabited [WIP] Expand is_uninhabited Sep 14, 2016
@canndrew
Copy link
Contributor Author

I decided to expand this PR to cover (almost) all uninhabited types. There'll be at least one more commit coming.

@canndrew
Copy link
Contributor Author

Okay, so this PR now does three more things:

  • ADTs are now recursed into and deeply checked to see if they're uninhabited. This means types like Result<!, !> will be uninhabited. In order to avoid going into an infinite loop on recursive types, it builds up a set of (DefId, Substs) for all the ADTs it's already encountered so it can return early if it goes to check an ADT that it's already in the process of checking. This requires allocating a HashSet every time is_uninhabited is called on an ADT though - is there a more efficient way to implement this? Like, a quick way to check whether a type may be recursive?
  • References to uninhabited types are uninhabited. This might break unsafe code in the wild just like it did in the compiler. Some code in core::fmt was using &Void for untyped pointers (ie. as void *). This is just wrong though. A &T is defined to always point to a T. That's why you can pattern match on it (match &t { &T::Variant0 => {}, &T::Variant1 => {} }) and why you can dereference it without an unsafe block. The offending code in libcore has been changed to use *const Void instead.
  • Fixed-length arrays of uninhabited types are uninhabited if they have non-zero length.

@canndrew canndrew changed the title [WIP] Expand is_uninhabited Expand is_uninhabited Sep 15, 2016
@bors
Copy link
Contributor

bors commented Sep 22, 2016

☔ The latest upstream changes (presumably #36551) made this pull request unmergeable. Please resolve the merge conflicts.

@canndrew
Copy link
Contributor Author

One thing I just noticed about this: if a type has a private field which is uninhabited then the entire type will be uninhabited. This means something like this will work when it probably shouldn't:

struct SecretlyEmpty(!);

let res: Result<T, SecretlyEmpty> = ...;
match res {
    Ok(t) => t,
}

So, this implementation isn't unsafe but it leaks non-public information about types. A fix would be to change is_uninhabited to is_publicly_uninhabited which would still be a strict extension of the original behaviour. Thoughts?

@aturon
Copy link
Member

aturon commented Sep 27, 2016

Nominating for discussion in this week's lang meeting.

@aturon aturon added I-nominated A-allocators Area: Custom and system allocators labels Sep 27, 2016
@nrc
Copy link
Member

nrc commented Sep 27, 2016

References to uninhabited types are uninhabited.

This at the very least is a breaking change and so it would be nice to have a warning cycle. I'm not sure how that would work though.

Although the initial description of this feature seems reasonable (let x: ! = ...; match x {} seems fine), the extension to recursive checking including fields and arrays, plus the insta-stable nature of the change makes me very nervous. To what extent was this discussed on the RFC thread? Should we open another RFC for this? (I'd rather not, but I'd also like this to get more attention than this issue has).

@canndrew
Copy link
Contributor Author

This at the very least is a breaking change

Is it outside of unsafe code? There was some code in libstd that this broke because it was letting a &Void escape from an unsafe block which - even if it's never dereferenced - is still not a valid type to have in live code. What guarantees do we want to give about (mis)use of unsafe code?

@@ -180,8 +180,9 @@ enum Void {}
issue = "0")]
#[doc(hidden)]
pub struct ArgumentV1<'a> {
value: &'a Void,
formatter: fn(&Void, &mut Formatter) -> Result,
_ph: PhantomData<&'a ()>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like a better fix would be to use &'a (). And perhaps Void ought to be defined not as an empty enum, but rather as a zero-sized struct that exposes no constructor? This is sort of an interesting question. I'm actually not sure what's the best way to model the equivalent of void*. Seems like something we can discuss when it comes to the unsafe code guidelines. =)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What code broke with &'a Void? I guess there is a match somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember what broke exactly but I just tested it without those changes to core::fmt and this time it worked :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there was a match anywhere. Just something more to do with having an uninhabited type in live code causing segfaults or something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might have had something with old trans' _match codegen using is_uninhabited, directly or indirectly.
TBH, any old call to is_uninhabited is suspicious. You also may have had more local changes at the time.

If we change any behavior here, we should warn about non-generic types that can't have values, IMO.
Otherwise we risk a lot of subtle breakage across the ecosystem (but this is more relevant to the Error PR).

@nikomatsakis
Copy link
Contributor

@canndrew in general, we try not to break unsafe code, though I agree that it has fewer guarantees that safe code (in particular as the unsafe code guidelines work is ongoing). But I feel like &Void is probably used in the wild quite a bit, and I feel like it's a reasonable thing to want to do, so I feel fairly uncomfortable declaring it as "wrong". But it's not like we have to necessarily keep the definition ofVoid precisely as it is.

@canndrew
Copy link
Contributor Author

I feel &Void really should be uninhabited for the sake of logical consistency. Firstly, if we assume that a &T always points to a T (which we do), then it is uninhabited, QED. If we want pointers that don't necessarily point to valid data that's what *const T and *mut T are for. Secondly, not making it uninhabited will cause silly inconsistencies for things like pattern matching. Some examples:

let res: Result<u32, Void> = ...;
// valid
let Ok(x) = res;
// ERROR!
let Ok(x) = res.as_ref();

enum Three { A, B, C }
enum Two { A, B }
enum One { A }
enum Zero { }

match &some_three {
    &A => ...
    &B => ...
    &C => ...
}

match &some_two {
    &A => ...
    &B => ...
}

match &some_one {
    &A => ...
}

// ERROR!
match &some_zero {
}

A lot of people have had trouble adjusting to the concept of uninhabited types. It's a new concept to most programmers. Hopefully though this will change, particularly with the addition of ! to the language (and the manual). In particular though a lot of people have been mistaking Void for a unit-like type which is why we see things like &Void being used as a replacement for void * and why this change will break some code. But anyone whose code breaks was doing it wrong and was using unsafe code to it. Should the language really pay the price forever for their mistake?

@canndrew
Copy link
Contributor Author

canndrew commented Oct 6, 2016

Was this discussed in the lang meeting? What was the verdict?

@aturon
Copy link
Member

aturon commented Oct 11, 2016

@canndrew Gah, we missed it due to lack of T-lang label!

@aturon aturon added T-lang Relevant to the language team, which will review and decide on the PR/issue. and removed A-allocators Area: Custom and system allocators labels Oct 11, 2016
@aturon
Copy link
Member

aturon commented Oct 20, 2016

The main concern on the lang team side is that this could break existing matches against types like &Void, in the sense that the match arms that you can write today would no longer be allowed under this PR. We're going to run crater to see how common this situation is; it's possible that we can/should consider this as a bug fix for the compiler.

We'll revisit with crater data.

@nikomatsakis
Copy link
Contributor

Discussed in the @rust-lang/lang meeting briefly. I started a crater run, which seems like some basic data. This can certainly break the static semantics of existing code, right? I think we can legitimately consider this a bug fix, but we'll have to do warnings and so forth.

Crater run is comparing: 0a0215d 57ceb4b3c6e6a711124c1a87254ff5cf3de1e1a3

@eddyb
Copy link
Member

eddyb commented Oct 20, 2016

@aturon I'm confused by that worry, AFAIK the existing code only uses is_uninhabited to not error in the case of non-exhaustive matches, that is, this can only remove errors, not produce new ones.

@eddyb
Copy link
Member

eddyb commented Oct 20, 2016

I'd r+ this with the libcore changes removed, TBH.

@nikomatsakis
Copy link
Contributor

@brson hmm it appears that crater came back with null results for me

@brson
Copy link
Contributor

brson commented Nov 11, 2016

I've started a crater run.

@brson
Copy link
Contributor

brson commented Nov 11, 2016

Still working on it.

@brson
Copy link
Contributor

brson commented Nov 12, 2016

Report.

No regressions. Some false positives.

@canndrew
Copy link
Contributor Author

So is this good to go then?

@bors
Copy link
Contributor

bors commented Nov 21, 2016

☔ The latest upstream changes (presumably #37824) made this pull request unmergeable. Please resolve the merge conflicts.

@aturon
Copy link
Member

aturon commented Nov 22, 2016

@canndrew Apologies for the delay. Yes, I think this is ready to land. cc @nikomatsakis @eddyb

enum Void {}
struct Void {
_private: (),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not have any change in libcore if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the problem with this change? I think this is what @nikomatsakis meant by redefining Void. We need to make some change here or else that module breaks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it break practically, or conceptually?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First conceptually, then practically when tests fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What tests? I don't want to land this if it changes semantics of code that already compiles (as opposed to just letting more code compiles).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IOW, this is a good/necessary change long-term, but it shouldn't be needed here - if it is, something is wrong with the assumption we're making which is the basis for allowing this PR through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... I just removed that change and now all tests pass. I definitely added it out of necessity though, something must have changed in the intervening two months.

I'll do another commit to remove it.

impl<'a, 'gcx, 'tcx> AdtDefData<'tcx, 'static> {
#[inline]
pub fn is_uninhabited_recurse(&'tcx self,
visited: &mut HashMap<(DefId, &'tcx Substs<'tcx>), ()>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a set, not a map and you should use FxHashSet.

substs: &'tcx Substs<'tcx>,
is_union: bool) -> bool {
if is_union {
self.fields.iter().all(|f| f.is_uninhabited_recurse(visited, block, cx, substs))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes an union is always one of its members. Did we ever settle this? cc @petrochenkov

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eddyb
This is certainly a desirable property, but it can be violated right now, see example in #32836 (comment). So it's not settled yet.
However, the logic "all fields are uninhabitable" -> "union is uninhabitable" seems to be correct - a union needs to be initialized before use (move checker ensures this) and it can't be initialized if it only has uninhabited fields.

@eddyb
Copy link
Member

eddyb commented Nov 23, 2016

@bors r+

@bors
Copy link
Contributor

bors commented Nov 23, 2016

📌 Commit 2121118 has been approved by eddyb

bors added a commit that referenced this pull request Nov 23, 2016
Expand is_uninhabited

This allows code such as this to compile:

``` rust
let x: ! = ...;
match x {};

let y: (u32, !) = ...;
match y {};
```

@eddyb You were worried about making this change. Do you have any idea about what could break? Are there any special tests that need to be written for it?
@bors
Copy link
Contributor

bors commented Nov 23, 2016

⌛ Testing commit 2121118 with merge d515586...

@bors bors merged commit 2121118 into rust-lang:master Nov 23, 2016
bors added a commit that referenced this pull request Nov 30, 2016
Make core::fmt::Void a non-empty type.

Adding back this change that was removed from PR #36449 because it's a fix and because I immediately hit a problem with it again when I started implementing my fix for #12609.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants