`TypeId` exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness. #97156

lcnr · 2022-05-18T22:18:24Z

type One = for<'a> fn(&'a (), &'a ());
type Two = for<'a, 'b> fn(&'a (), &'b ());

mod my_api {
    use std::any::Any;
    use std::marker::PhantomData;

    pub struct Foo<T: 'static> {
        a: &'static dyn Any,
        _p: PhantomData<*mut T>, // invariant, the type of the `dyn Any`
    }
    
    impl<T: 'static> Foo<T> {
        pub fn deref(&self) -> &'static T {
            match self.a.downcast_ref::<T>() {
                None => unsafe { std::hint::unreachable_unchecked() },
                Some(a) => a,
            }
        }
        
        pub fn new(a: T) -> Foo<T> {
           Foo::<T> {
                a: Box::leak(Box::new(a)),
                _p: PhantomData,
            } 
        }
    }
}

use my_api::*;

fn main() {
    let foo = Foo::<One>::new((|_, _| ()) as One);
    foo.deref();
    let foo: Foo<Two> = foo;
    foo.deref();
}

has UB from hitting the unreachable_unchecked because TypeId::of::<One>() is not the same as TypeId::of::<Two>() despite them being considered the same types by the type checker. Originally this was thought to be a nightly-only issue with feature(generic_const_exprs) but actually the weird behaviour of TypeId can be seen on stable and result in crashes or UB in unsafe code.

original description follows below:

#![feature(const_type_id, generic_const_exprs)]

use std::any::TypeId;
// `One` and `Two` are currently considered equal types, as both
// `One <: Two` and `One :> Two` holds.
type One = for<'a> fn(&'a (), &'a ());
type Two = for<'a, 'b> fn(&'a (), &'b ());
trait AssocCt {
    const ASSOC: usize;
}
const fn to_usize<T: 'static>() -> usize {
    const WHAT_A_TYPE: TypeId = TypeId::of::<One>();
    match TypeId::of::<T>() {
        WHAT_A_TYPE => 0,
        _ => 1000,
    } 
}
impl<T: 'static> AssocCt for T {
    const ASSOC: usize = to_usize::<T>();
}

trait WithAssoc<U> {
    type Assoc;
}
impl<T: 'static> WithAssoc<()> for T where [(); <T as AssocCt>::ASSOC]: {
    type Assoc = [u8; <T as AssocCt>::ASSOC];
}

fn generic<T: 'static, U>(x: <T as WithAssoc<U>>::Assoc) -> <T as WithAssoc<U>>::Assoc
where
    [(); <T as AssocCt>::ASSOC]:,
    T: WithAssoc<U>,
{
    x
}


fn unsound<T>(x: <One as WithAssoc<T>>::Assoc) -> <Two as WithAssoc<T>>::Assoc
where
    One: WithAssoc<T>,
{
    let x: <Two as WithAssoc<T>>::Assoc = generic::<One, T>(x);
    x
}

fn main() {
    println!("{:?}", unsound::<()>([]));
}

TypeId being different for types which are considered equal types allows us to take change the value of a projection by switching between the equal types in its substs and observing that change by looking at their TypeId. This is possible as switching between equal types is allowed even in invariant positions.

This means that stabilizing const TypeId::of and allowing constants to flow into the type system, e.g. some minimal version of feature(generic_const_exprs), will be currently unsound.

I have no idea on how to fix this. I don't expect that we're able to convert higher ranked types to some canonical representation. Ah well, cc @rust-lang/project-const-generics @nikomatsakis

The text was updated successfully, but these errors were encountered:

lcnr · 2022-05-18T23:01:53Z

I don't expect that we're able to convert higher ranked types to some canonical representation

Ok, I have an idea on how to convert higher ranked types to some canonical representation. I just don't think that idea is very good. I also don't know how I would extend this to implication types in the future

CAD97 · 2022-05-19T01:14:17Z

To clarify, is

One and Two are equal types with different type ids

the issue here, and the rest is just exploiting this to demonstrate the unsoundness? (At least this seems to break the guarantees I understand TypeId to be providing... it just isn't exploitable without a way to substitute the "equal" types.)

If this holds, then the resolution is to ether give One and Two the same type id or to make One and Two nonequal types.

Giving One and Two the same type id would be disastrous, as it would allow e.g. dynamically casting for<'a> fn(&'a T) -> &'a T to for<'a> fn(&'a T) -> &'static T.

cc also @eddyb #95845, as using the v0 mangling for TypeId would erase the lifetimes from higher ranked types AIUI. The current impl of #95845 still uses the current hash, so wouldn't be problematic, but this is something to keep in mind w.r.t. the information contained in TypeId vs the v0 type mangling.

(The issue OP ccs itself; this is likely a typo?)

eddyb · 2022-05-19T01:38:28Z

There is a lot of confusion here wrt all the pieces involved.

Both legacy and v0 mangling, and both the TypeId hash and a v0-mangling-based implementation (like #95845), use the same encoding for higher-ranked lifetimes (it's why the v0 mangling mentions lifetimes at all!), though legacy symbols/type hashes only as the input to the hash.

You couldn't not mangle them because otherwise TypeId::of::<One> and TypeId::of::<Two> would clash in their symbol names (and that's actually how I found out that v0 mangling needed to reason about for<'a>, when implementing it, heh).

That all uses "binder anonymization" so the fully normalized types are:

One: for<'0> fn(&'0 (), &'0 ())
Two: for<'0, '1> fn(&'0 (), &'1 ())

The issue here is this broken fragment of type unification code:

rust/compiler/rustc_infer/src/infer/equate.rs

Lines 154 to 159 in cd282d7

    
           if a.skip_binder().has_escaping_bound_vars() || b.skip_binder().has_escaping_bound_vars() { 
        
               self.fields.higher_ranked_sub(a, b, self.a_is_expected)?; 
        
               self.fields.higher_ranked_sub(b, a, self.a_is_expected) 
        
           } else { 
        
               // Fast path for the common case. 
        
               self.relate(a.skip_binder(), b.skip_binder())?;

This treats (T <: U) + (U <: T) as meaning T = U. But that's not sound at all!

Too much of Rust's typesystem depends on T = U matching "identity" (~~judgmental~~ syntactic equality if you want) with "enough normalization", so a definition equivalent to "subtyping both ways" only works if it's compatible, and in this case it's not.

I don't think there's an "obvious choice" for which of One or Two should be "the normal form" for the other (Two seems more likely but it still feels dubious), so the correct thing to do (at least in the interim) is to stop using subtyping entirely when dealing with strong equality, and only support anonymizing the binders (which is the only thing done for the maximal normalization used by symbol names and TypeId, tho sadly I don't think we always do it, yet).

Even if we normalized one of One or Two to the other, equality should still not use subtyping, and instead normalize to a common type first, to avoid the "unproven assertion" that "both-ways subtyping" implies "equality" (as opposed to the trivial truth in the other direction, which is given by subtyping reflexivity AFAIK).

eddyb · 2022-05-19T02:43:37Z

Retitled the issue to reflect the situation as I understand it (TypeId only exposes this the same way e.g. symbol names or any other type-equality-sensitive part of rustc might).

RalfJung · 2022-05-19T06:27:46Z

One and Two are equal types with different type ids.

In which sense are these types "equal"? Two allows references with different lifetimes, One does not, so those look like different types to me.

Going off the issue title, are you saying they each are a subtype of the other? That is different from saying they are "equal" though. Might be good to update the issue description because this confused the heck out of me until I read #97156 (comment).

Cc #97156

You are Cc'ing an issue in itself? probably a typo, not sure which issue you meant though.

oli-obk · 2022-05-19T09:45:03Z

Two allows references with different lifetimes, One does not, so those look like different types to me.

Well... from a user perspective any function that will be allowed as a function pointer of type One will also be allowed as a function pointer of type Two. Also any caller to One can just also call Two without any visible behaviour (if the function pointers actually point to the same function).

So they are structurally different, but behaviour wise they are the same. If we had sufficiently smart normalization we could normalize One to Two, just gets hard to do that normalization properly in the presence of more complicated variance (mutable references and interior mutability)

eddyb · 2022-05-26T00:45:24Z

@RalfJung FWIW I changed the issue title, it was something else before.

In which sense are these types "equal"?

Specifically unification in the Equate mode - we seem to have special-cased "each of the two types is a subtype of the other" to imply "equality" without justifying/enforcing it in a way the rest of the compiler - which mostly relies on Ty<'tcx>: Eq (~~and I assume that sort of "syntactic match" is equivalent to judgmental equality~~), especially after post-monomorphization normalization.

@oli-obk The normalization wouldn't need to be perfectly clever. You would just perform it before checking for equality (without involving subtyping), just like we do with anonymizing the lifetimes bound by the for<...> to avoid their order (or DefIds) affecting any semantics.

That is, we have to remove the unsound equality-via-subtyping logic and try to cover any breakage with a subset of the possible higher-ranked normalization (e.g. if a lifetime only appears in the outermost fn pointer type's input types, and its variance is the default for that position, so not invariant or otherwise flipped, then it can be split into one lifetime per position) - we could maybe even make this be part of anonymize_late_bound_regions, since that has to replace every one of those lifetime positions anyway.

EDIT: I regret using the word "judgmental" (or its misspelling, apparently), because I just remembered that type equality judgments can definitely involve subtyping, that distinction is more about "being built-in" vs reified proofs of type equality (or "paths" in HoTT).
("syntactic" should be correct though, I'll switch to that)

nikomatsakis · 2022-05-27T10:03:49Z

A few thoughts:

How I understand the bug

Some parts of rustc do indeed assume that T1 <: T2 and T2 <: T1 imply T1 == T2 (I think of this as "semantic equality", in this sense, though I realize it's actually a distinct concept and I should probably find another term). However, for many Rust types, there is no canonical type T3 that both T1 and T2 can be converted to, or at least, we don't know of such a type (especially given the directions I would like to go with our types, I do not think that canonical types will exist in general, see below).

For the compiler itself, lacking a canonical type T3 is not a big issue, although it does make things inconvenient, since checking semantic equality is more complex. (Associated types throw a similar wrench into these things too, hence the desire for lazy normalization.)

The typeid function however "works best" if canonical types exist, though I would say that for soundness they are not strictly required:

If you have canonical types, then you have the property that type_id::<T1>() == type_id::<T2>() iff T1 == T2
However, since we lack them, we have only one direction: type_id::<T1>() == type_id::<T2>() implies T1 == T2, but not vice versa. In other words, you kind of have to assume that, for any given type, there is some set of valid return values for type_id and that type_id chooses non-deterministically from that set (but each equality class of types has a distinct set).

Whether or not the "nondeterministic type-id" is a problem depends on how you choose to draw the lines:

We can't allow type_id in const evaluation, because it is not deterministic (or, if we were to do so, we'd have to account for its non-deterministic return value, at least). This is the unsoundness described in the OP.
At runtime, the non-deterministic behavior is surprising, and hence people may write unsafe code that would prove to be unsound. It also works against useful structures like hashmaps indexed by TypeId.

Ways to fix the bug

Based on the above analysis, it seems like we can ...

Distinguish syntactic equality (presumably alpha equivalence) from semantic equality. This is the most straight-forward solution and probably a safe (i.e., reversible) intermediate step, though I was curious to poke a few people (including you, @RalfJung) to get a sense for what surprises may lie in wait from doing so. @lcnr is exploring this in [CRATER] separate type equality from mutual subtyping #97427.
Try to establish canonical forms -- this may be possible for higher-ranked types when accounting for variance, though I feel like I convinced myself that this was not so some time ago, but I'm not sure. That said, I somehow doubt it will be possible when you bring the more complex types that a-mir-formality is aiming to include into the picture, and I want those types for various Rust features. Still, perhaps worth exploring.
Accept that TypeId is non-deterministic. We could perhaps deprecate it and introduce a version that returns Option<TypeId>, where Some is only returned for types that have a canonical form. Alternatively, this new version could include a marker trait for "types with canonical forms" (i.e., with type-ids) -- in a way, the 'static bound was meant to serve as that role, but as this bug shows, that's not sufficient.

I'm not really aware of other solutions, I'm curious if folks have any ideas there.

RalfJung · 2022-05-28T14:16:31Z

Having two notions of equality (one more fine-grained than the other) is certainly possible, but needs some care to avoid accidentally using the wrong one in the wrong spot. I am not enough of a subtyping expert to know if there are any common pitfalls with this approach. It would of course be nicer, conceptually, to make subtyping antisymmetric, i.e., "a <= b /\ b <= a -> a == b".

I'm not really aware of other solutions, I'm curious if folks have any ideas there.

Some algebraic thoughts: whatever equality on types ends up being, the property we need is that all types in an equivalence class have the same type ID. The easiest way to achieve this is to pick a unique representative for each equivalence class, have a normalization function that returns said representative, and use its type ID. However, in principle it is conceivable that one could generate a suitable type ID directly, without having such a normalization function. I don't know if that is actually any easier in this case, but it might be worth thinking about.

Jules-Bertholet · 2022-11-09T19:23:37Z

Random thought:

For the specific case of for<'a> fn(&'a (), &'a ()) and for<'a, 'b> fn(&'a (), &'b ()), could both of these be normalized to fn(&'empty (), &'empty ()) as described here?

lcnr · 2022-11-09T20:13:57Z

Afaik we're currently moving towards removing ReEmpty so that won't work 😆

also, for<'a> fn(&'a ()) needs to have a different TypeId as fn(&'static ()) (as these are different types which are both 'static) and as we erase regions during codegen, changing for<'a> fn(&'a ()) to fn(&'erased ()) would make it indistinguishable from fn(&'static ()).

Jules-Bertholet · 2022-11-09T21:42:12Z

Afaik we're currently moving towards removing ReEmpty

What's the reasoning behind this change? Any discussion I can read up on?

For context, from what I understand, 'empty is needed for an impl <T> Trait for fn(T) to include impl <T> Trait for fn(&T).
And that is needed to allow making coherence_leak_check a hard error without breaking people's use cases.
And making coherence_leak_check a hard error might be needed to make pattern types play well with type inference.

lcnr · 2022-11-10T11:58:49Z

For context, from what I understand, 'empty is needed for an impl <T> Trait for fn(T) to include impl <T> Trait for fn(&T).

Well 'empty is the reason why we even lint coherence_leak_check right now afaik 😅

I don't your idea proposed in the linked internals thread is possible:

T implements Trait. In this case, if any of T's supertypes or subtypes also implement Trait, those impls must come from the same impl block as the impl of Trait for T. Additionally, if U is a supertype of T, and U does not implement Trait, then none of U's supertypes may implement Trait. If any of these rules is violated, it's a coherence error.

I don't think this works, because not only does the same impl has to apply for coherence, but this must also recursively hold for all associated types. If we again look at for<'a> fn(&'a ()) and fn(&'static ()): These have to be different types with different TypeIds. But we currently have for<'a> fn(&'a ()) <: fn(&'static ()).

lcnr · 2022-11-10T12:00:18Z

What's the reasoning behind this change? Any discussion I can read up on?

Not sure, you can try asking about it in the #t-types stream on zulip 😅 i think that's mostly been mentioned as a sidenote during other discussions.

Jules-Bertholet · 2022-11-10T15:24:47Z

I don't think this works, because not only does the same impl has to apply for coherence, but this must also recursively hold for all associated types.

I'm not sure I understand what you are saying here?

lcnr · 2022-11-11T13:44:00Z

afaict your idea is that if a <: b holds than if a: Trait and b: Trait holds then a and b must share the Trait impl. The goal probably being that selection for a and b always has the same result.
if that impl has an associated type, then if <a as Trait>::Assoc and <b as Trait>::Assoc both implement OtherTrait, then these two must also share the same impl of OtherTrait. Otherwise selection could differentiate between a and b again.

I don't think this restriction can be enforced with the best example being const fn TypeId::of (which we should stabilize at some point) for for<'a> fn(&'a ()) and fn(&'static ()).

Does that make sense to you? I don't have a clear image of why you need that invariant, so I might not make much sense here

Jules-Bertholet · 2022-11-11T18:17:59Z

The "one impl block" rule isn't necessary for soundness. It's meant to make some footguns harder to shoot, not to uphold some absolutely necessary invariants. To elaborate:

Let's say you have a trait Trait, and a type T, such that T: Trait does not hold, but there exists a type U such that T <: U and U: Trait. Then, the following code works just fine:

fn takes_a_u(_: U) {}

fn takes_a_t(t: T) {
    // `t: T` coerced to supertype `U`
    takes_a_u(t);
}

However, the following doesn't work today:

fn takes_impl_trait(_: impl Trait) {}

fn takes_a_t(t: T) {
    // Compile error!
    // `t: T` not coerced to supertype `U`
    takes_impl_trait(t);
}

To make pattern types ergonomic, the second example most likely needs to compile. The issue is that there might be more than one possible U that satisfies the trait bounds, so there needs to be a way to pick a "best" one among them. The proposed coherence rules are meant to ensure that:

This "best" supertype can be chosen deterministically.
It should be difficult to construct a situation where adding a new trait impl to a type breaks code. Like how implementing IntoIterator for arrays needed a per-edition hack because of the interaction with auto-ref; I want to avoid that kind of thing.

The "one impl block" rule is meant to ensure #2. But "difficult" doesn't need to mean "impossible", as type inference breakage is technically "minor breakage." It's OK in terms of soundness for trait impls to have different behavior for subtypes with different TypeId's, but it should be hard to do accidentally.

Does that make sense?

…egions-on-iat, r=<try> Do not erase late bound regions when selecting inherent associated types In the fix for rust-lang#97156 we would want the following code: ```rust #![feature(inherent_associated_types)] #![allow(incomplete_features)] struct Foo<T>(T); impl Foo<fn(&'static ())> { type Assoc = u32; } trait Other {} impl Other for u32 {} // FIXME(inherent_associated_types): Avoid emitting two diagnostics (they only differ in span). // FIXME(inherent_associated_types): Enhancement: Spruce up the diagnostic by saying something like // "implementation is not general enough" as is done for traits via // `try_report_trait_placeholder_mismatch`. fn bar(_: Foo<for<'a> fn(&'a ())>::Assoc) {} //~^ ERROR mismatched types //~| ERROR mismatched types fn main() {} ``` to fail with ... ``` error[E0220]: associated type `Assoc` not found for `Foo<for<'a> fn(&'a ())>` in the current scope --> tests/ui/associated-inherent-types/issue-109789.rs:18:36 | 4 | struct Foo<T>(T); | ------------- associated item `Assoc` not found for this struct ... 18 | fn bar(_: Foo<for<'a> fn(&'a ())>::Assoc) {} | ^^^^^ associated item not found in `Foo<for<'a> fn(&'a ())>` | = note: the associated type was found for - `Foo<fn(&'static ())>` error: aborting due to previous error For more information about this error, try `rustc --explain E0220`. ``` This PR fixes the ICE we are currently getting "was a subtype of Foo<Binder(fn(&ReStatic ()), [])> during selection but now it is not" Also fixes rust-lang#112631 r? `@lcnr`

…r=<try> Fix for TypeId exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness Fixes rust-lang#97156 This PR revives rust-lang#97427 idea, it sits on top of rust-lang#118118 because the idea uncovered some problems with IATs. r? `@lcnr` This is ICEing yet for `tests/ui/traits/new-solver/escaping-bound-vars-in-writeback-normalization.rs` using the new trait solver. After rust-lang#118118 and this ICE is fixed, we would need a rebase and a crater run. Opening as a WIP for now.

…egions-on-iat, r=compiler-errors Do not erase late bound regions when selecting inherent associated types In the fix for rust-lang#97156 we would want the following code: ```rust #![feature(inherent_associated_types)] #![allow(incomplete_features)] struct Foo<T>(T); impl Foo<fn(&'static ())> { type Assoc = u32; } trait Other {} impl Other for u32 {} // FIXME(inherent_associated_types): Avoid emitting two diagnostics (they only differ in span). // FIXME(inherent_associated_types): Enhancement: Spruce up the diagnostic by saying something like // "implementation is not general enough" as is done for traits via // `try_report_trait_placeholder_mismatch`. fn bar(_: Foo<for<'a> fn(&'a ())>::Assoc) {} //~^ ERROR mismatched types //~| ERROR mismatched types fn main() {} ``` to fail with ... ``` error[E0220]: associated type `Assoc` not found for `Foo<for<'a> fn(&'a ())>` in the current scope --> tests/ui/associated-inherent-types/issue-109789.rs:18:36 | 4 | struct Foo<T>(T); | ------------- associated item `Assoc` not found for this struct ... 18 | fn bar(_: Foo<for<'a> fn(&'a ())>::Assoc) {} | ^^^^^ associated item not found in `Foo<for<'a> fn(&'a ())>` | = note: the associated type was found for - `Foo<fn(&'static ())>` error: aborting due to previous error For more information about this error, try `rustc --explain E0220`. ``` This PR fixes the ICE we are currently getting "was a subtype of Foo<Binder(fn(&ReStatic ()), [])> during selection but now it is not" Also fixes rust-lang#112631 r? `@lcnr`

…r=<try> Fix for TypeId exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness Fixes rust-lang#97156 This PR revives rust-lang#97427 idea, it sits on top of rust-lang#118118 because the idea uncovered some problems with IATs. r? `@lcnr` This is ICEing yet for `tests/ui/traits/new-solver/escaping-bound-vars-in-writeback-normalization.rs` using the new trait solver. After rust-lang#118118 and this ICE is fixed, we would need a rebase and a crater run. Opening as a WIP for now.

…r=<try> Fix for TypeId exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness Fixes rust-lang#97156 This PR revives rust-lang#97427 idea. r? `@lcnr`

…lnay Stabilize `type_name_of_val` Make the following API stable: ```rust // in core::any pub fn type_name_of_val<T: ?Sized>(_val: &T) -> &'static str ``` This is a convenience method to get the type name of a value, as opposed to `type_name` that takes a type as a generic. Const stability is not added because this relies on `type_name` which is also not const. That has a blocking issue rust-lang#97156. Wording was also changed to direct most of the details to `type_name` so we don't have as much duplicated documentation. Fixes tracking issue rust-lang#66359. There were two main concerns in the tracking issue: 1. Naming: `type_name_of` and `type_name_of_val` seem like the only mentioned options. Differences in opinion here come from `std::mem::{size_of, align_of, size_of_val, align_of_val}`. This PR leaves the name as `type_name_of_val`, but I can change if desired since it is pretty verbose. 2. What this displays for `&dyn`: I don't think that having `type_name_of_val` function resolve those is worth the headache it would be, see rust-lang#66359 (comment) for some workarounds. I also amended the docs wording to leave it open-ended, in case we have means to change that behavior in the future. `@rustbot` label -T-libs +T-libs-api +needs-fcp r? libs-api

…lnay Stabilize `type_name_of_val` Make the following API stable: ```rust // in core::any pub fn type_name_of_val<T: ?Sized>(_val: &T) -> &'static str ``` This is a convenience method to get the type name of a value, as opposed to `type_name` that takes a type as a generic. Const stability is not added because this relies on `type_name` which is also not const. That has a blocking issue rust-lang#97156. Wording was also changed to direct most of the details to `type_name` so we don't have as much duplicated documentation. Fixes tracking issue rust-lang#66359. There were two main concerns in the tracking issue: 1. Naming: `type_name_of` and `type_name_of_val` seem like the only mentioned options. Differences in opinion here come from `std::mem::{size_of, align_of, size_of_val, align_of_val}`. This PR leaves the name as `type_name_of_val`, but I can change if desired since it is pretty verbose. 2. What this displays for `&dyn`: I don't think that having `type_name_of_val` function resolve those is worth the headache it would be, see rust-lang#66359 (comment) for some workarounds. I also amended the docs wording to leave it open-ended, in case we have means to change that behavior in the future. ``@rustbot`` label -T-libs +T-libs-api +needs-fcp r? libs-api

Rollup merge of rust-lang#118234 - tgross35:type_name_of_value, r=dtolnay Stabilize `type_name_of_val` Make the following API stable: ```rust // in core::any pub fn type_name_of_val<T: ?Sized>(_val: &T) -> &'static str ``` This is a convenience method to get the type name of a value, as opposed to `type_name` that takes a type as a generic. Const stability is not added because this relies on `type_name` which is also not const. That has a blocking issue rust-lang#97156. Wording was also changed to direct most of the details to `type_name` so we don't have as much duplicated documentation. Fixes tracking issue rust-lang#66359. There were two main concerns in the tracking issue: 1. Naming: `type_name_of` and `type_name_of_val` seem like the only mentioned options. Differences in opinion here come from `std::mem::{size_of, align_of, size_of_val, align_of_val}`. This PR leaves the name as `type_name_of_val`, but I can change if desired since it is pretty verbose. 2. What this displays for `&dyn`: I don't think that having `type_name_of_val` function resolve those is worth the headache it would be, see rust-lang#66359 (comment) for some workarounds. I also amended the docs wording to leave it open-ended, in case we have means to change that behavior in the future. ``@rustbot`` label -T-libs +T-libs-api +needs-fcp r? libs-api

danielhenrymantilla · 2024-01-05T17:34:23Z

By the way this makes ::qcell's TCell{,Owner} API unsound, as pointed out by @kvverti over Discord.

use ::qcell::{TCell, TCellOwner};

type Q1 = for<'any> fn(fn(&'any ()));
type Q2 = fn(fn(&'static ()));
let a = TCellOwner::<Q1>::new();
let b = TCellOwner::<Q2>::new(); // does not panic since `TypeId::of::<Q2>()` is distinct.
let x = TCell::new(42);
::core::mem::swap(
    x.rw(&mut { a }),
    x.rw(&mut { b }), // Q2 -> Q1 even though `TCellOwner<_>` is invariant
);

Demo

Adding to what @BoxyUwU mentioned, it is interesting to observe that we currently have TypeId : Eq rather than TypeId : PartialEq, which makes @nikomatsakis'

we have only one direction: type_id::<T1>() == type_id::<T2>() implies T1 == T2, but not vice versa

even more of a footgun for the unaware.

Deprecated because of fallout from: rust-lang/rust#97156 `TypeId::of::<L>() != TypeId::of::<R>()` does not imply `L != R` Removed all uses of `with_any` constructor outside of docs/tests specific to it.

change equate for binders to not rely on subtyping *summary by `@spastorino` and `@lcnr*` ### Context The following code: ```rust type One = for<'a> fn(&'a (), &'a ()); type Two = for<'a, 'b> fn(&'a (), &'b ()); mod my_api { use std::any::Any; use std::marker::PhantomData; pub struct Foo<T: 'static> { a: &'static dyn Any, _p: PhantomData<*mut T>, // invariant, the type of the `dyn Any` } impl<T: 'static> Foo<T> { pub fn deref(&self) -> &'static T { match self.a.downcast_ref::<T>() { None => unsafe { std::hint::unreachable_unchecked() }, Some(a) => a, } } pub fn new(a: T) -> Foo<T> { Foo::<T> { a: Box::leak(Box::new(a)), _p: PhantomData, } } } } use my_api::*; fn main() { let foo = Foo::<One>::new((|_, _| ()) as One); foo.deref(); let foo: Foo<Two> = foo; foo.deref(); } ``` has UB from hitting the `unreachable_unchecked`. This happens because `TypeId::of::<One>()` is not the same as `TypeId::of::<Two>()` despite them being considered the same types by the type checker. Currently the type checker considers binders to be equal if subtyping succeeds in both directions: `for<'a> T<'a> eq for<'b> U<'b>` holds if `for<'a> exists<'b> T<'b> <: T'<a> AND for<'b> exists<'a> T<'a> <: T<'b>` holds. This results in `for<'a> fn(&'a (), &'a ())` and `for<'a, 'b> fn(&'a (), &'b ())` being equal in the type system. `TypeId` is computed by looking at the *structure* of a type. Even though these types are semantically equal, they have a different *structure* resulting in them having different `TypeId`. This can break invariants of unsafe code at runtime and is unsound when happening at compile time, e.g. when using const generics. So as seen in `main`, we can assign a value of type `Foo::<One>` to a binding of type `Foo<Two>` given those are considered the same type but then when we call `deref`, it calls `downcast_ref` that relies on `TypeId` and we would hit the `None` arm as these have different `TypeId`s. As stated in rust-lang/rust#97156 (comment), this causes the API of existing crates to be unsound. ## What should we do about this The same type resulting in different `TypeId`s is a significant footgun, breaking a very reasonable assumptions by authors of unsafe code. It will also be unsound by itself once they are usable in generic contexts with const generics. There are two options going forward here: - change how the *structure* of a type is computed before relying on it. i.e. continue considering `for<'a> fn(&'a (), &'a ())` and `for<'a, 'b> fn(&'a (), &'b ())` to be equal, but normalize them to a common representation so that their `TypeId` are also the same. - change how the semantic equality of binders to match the way we compute the structure of types. i.e. `for<'a> fn(&'a (), &'a ())` and `for<'a, 'b> fn(&'a (), &'b ())` still have different `TypeId`s but are now also considered to not be semantically equal. --- Advantages of the first approach: - with the second approach some higher ranked types stop being equal, even though they are subtypes of each other General thoughts: - changing the approach in the future will be breaking - going from first to second may break ordinary type checking, as types which were previously equal are now distinct - going from second to first may break coherence, because previously disjoint impls overlap as the used types are now equal - both of these are quite unlikely. This PR did not result in any crater failures, so this should not matter too much Advantages of the second approach: - the soundness of the first approach requires more non-local reasoning. We have to make sure that changes to subtyping do not cause the representative computation to diverge from semantic equality - e.g. we intend to consider higher ranked implied bounds when subtyping to [fix] rust-lang/rust#25860, I don't know how this will interact and don't feel confident making any prediction here. - computing a representative type is non-trivial and soundness critical, therefore adding complexity to the "core type system" --- This PR goes with the second approach. A crater run did not result in any regressions. I am personally very hesitant about trying the first approach due to the above reasons. It feels like there are more unknowns when going that route. ### Changing the way we equate binders Relating bound variables from different depths already results in a universe error in equate. We therefore only need to make sure that there is 1-to-1 correspondence between bound variables when relating binders. This results in concrete types being structurally equal after anonymizing their bound variables. We implement this by instantiating one of the binder with placeholders and the other with inference variables and then equating the instantiated types. We do so in both directions. More formally, we change the typing rules as follows: ``` for<'r0, .., 'rn> exists<'l0, .., 'ln> LHS<'l0, .., 'ln> <: RHS<'r0, .., 'rn> for<'l0, .., 'ln> exists<'r0, .., 'rn> RHS<'r0, .., 'rn> <: LHS<'l0, .., 'ln> -------------------------------------------------------------------------- for<'l0, .., 'ln> LHS<'l0, .., 'ln> eq for<'r0, .., 'rn> RHS<'r0, .., 'rn> ``` to ``` for<'r0, .., 'rn> exists<'l0, .., 'ln> LHS<'l0, .., 'ln> eq RHS<'r0, .., 'rn> for<'l0, .., 'ln> exists<'r0, .., 'rn> RHS<'r0, .., 'rn> eq LHS<'l0, .., 'ln> -------------------------------------------------------------------------- for<'l0, .., 'ln> LHS<'l0, .., 'ln> eq for<'r0, .., 'rn> RHS<'r0, .., 'rn> ``` --- Fixes #97156 r? `@lcnr`

lcnr added I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness C-bug Category: This is a bug. requires-nightly This issue requires a nightly compiler in some way. F-generic_const_exprs `#![feature(generic_const_exprs)]` labels May 18, 2022

This was referenced May 18, 2022

Tracking Issue for const fn type_id #77125

Open

TypeId yikes rust-lang/project-const-generics#46

Closed

eddyb changed the title ~~using values of TypeId in the type system is unsound~~ TypeId exposes equality-by-subtyping vs normal-form-judgemental-equality unsoundness. May 19, 2022

lcnr mentioned this issue May 26, 2022

[CRATER] separate type equality from mutual subtyping #97427

Closed

eddyb changed the title ~~TypeId exposes equality-by-subtyping vs normal-form-judgemental-equality unsoundness.~~ TypeId exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness. May 27, 2022

oli-obk mentioned this issue Jul 15, 2022

Tracking issue for const fn type_name #63084

Open

4 tasks

lcnr mentioned this issue Jul 7, 2023

name mangling and TypeId should not pass for aliases and generic params #113443

Open

compiler-errors mentioned this issue Jul 11, 2023

Deep dive for leak check/higher ranked fn ptr subtyping rust-lang/types-team#96

Open

QuineDot mentioned this issue Sep 26, 2023

Consider re-tuning the lifetime elision rules for trait objects #91302

Open

jackh726 added the S-bug-has-test Status: This bug is tracked inside the repo by a `known-bug` test. label Oct 11, 2023

lcnr assigned spastorino Nov 15, 2023

lcnr added this to T-types unsound issues Nov 15, 2023

lcnr moved this to unblocked in T-types unsound issues Nov 15, 2023

spastorino mentioned this issue Nov 21, 2023

Do not erase late bound regions when selecting inherent associated types #118118

Merged

tgross35 mentioned this issue Nov 24, 2023

Stabilize type_name_of_val #118234

Merged

spastorino mentioned this issue Nov 24, 2023

change equate for binders to not rely on subtyping #118247

Merged

dtolnay mentioned this issue Nov 25, 2023

Tracking issue for any::type_name_of_val #66359

Closed

1 task

bors closed this as completed in 878c8a2 Feb 29, 2024

github-project-automation bot moved this from unblocked to new solver everywhere in T-types unsound issues Feb 29, 2024

lcnr removed this from T-types unsound issues Apr 5, 2024

kvverti mentioned this issue May 14, 2024

Scoped impl Trait for Type rust-lang/rfcs#3634

Closed

jackh726 mentioned this issue Sep 14, 2024

Remove structural match from TypeId #103291

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`TypeId` exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness. #97156

`TypeId` exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness. #97156

lcnr commented May 18, 2022 •

edited by BoxyUwU

Loading

lcnr commented May 18, 2022

CAD97 commented May 19, 2022 •

edited

Loading

eddyb commented May 19, 2022 •

edited

Loading

eddyb commented May 19, 2022

RalfJung commented May 19, 2022 •

edited

Loading

oli-obk commented May 19, 2022

eddyb commented May 26, 2022 •

edited

Loading

nikomatsakis commented May 27, 2022 •

edited

Loading

RalfJung commented May 28, 2022

Jules-Bertholet commented Nov 9, 2022

lcnr commented Nov 9, 2022

Jules-Bertholet commented Nov 9, 2022

lcnr commented Nov 10, 2022

lcnr commented Nov 10, 2022

Jules-Bertholet commented Nov 10, 2022

lcnr commented Nov 11, 2022 •

edited

Loading

Jules-Bertholet commented Nov 11, 2022

danielhenrymantilla commented Jan 5, 2024

TypeId exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness. #97156

TypeId exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness. #97156

Comments

lcnr commented May 18, 2022 • edited by BoxyUwU Loading

lcnr commented May 18, 2022

CAD97 commented May 19, 2022 • edited Loading

eddyb commented May 19, 2022 • edited Loading

eddyb commented May 19, 2022

RalfJung commented May 19, 2022 • edited Loading

oli-obk commented May 19, 2022

eddyb commented May 26, 2022 • edited Loading

nikomatsakis commented May 27, 2022 • edited Loading

How I understand the bug

Ways to fix the bug

RalfJung commented May 28, 2022

Jules-Bertholet commented Nov 9, 2022

lcnr commented Nov 9, 2022

Jules-Bertholet commented Nov 9, 2022

lcnr commented Nov 10, 2022

lcnr commented Nov 10, 2022

Jules-Bertholet commented Nov 10, 2022

lcnr commented Nov 11, 2022 • edited Loading

Jules-Bertholet commented Nov 11, 2022

danielhenrymantilla commented Jan 5, 2024

`TypeId` exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness. #97156

`TypeId` exposes equality-by-subtyping vs normal-form-syntactic-equality unsoundness. #97156

lcnr commented May 18, 2022 •

edited by BoxyUwU

Loading

CAD97 commented May 19, 2022 •

edited

Loading

eddyb commented May 19, 2022 •

edited

Loading

RalfJung commented May 19, 2022 •

edited

Loading

eddyb commented May 26, 2022 •

edited

Loading

nikomatsakis commented May 27, 2022 •

edited

Loading

lcnr commented Nov 11, 2022 •

edited

Loading