Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement pointee metadata unsizing via a TypedMetadata<T> container #97052

Closed
wants to merge 1 commit into from

Conversation

CAD97
Copy link
Contributor

@CAD97 CAD97 commented May 15, 2022

TL;DR:

// std::ptr
struct TypedMetadata<T: ?Sized>(pub <T as Pointee>::Metadata);
impl<T: ?Sized + Unsize<U>, U: ?Sized> CoerceUnsized<TypedMetadata<U>> for TypedMetadata<T> {}

See src/test/ui/coercion/coerce-just-metadata.rs for an example of this working.

The motivating example is being able to spell a type like

struct SillyBox<T: ?Sized> {
    data: ptr::NonNull<()>,
    meta: TypedMetadata<T>,
}

and implement CoerceUnsized for it. This is shown to work in src/test/ui/coercion/silly-box.rs.

It may be desirable to support coercing DynMetadata<T> in the future (e.g. DynMetadata<dyn Trait + Sync> -> DynMetadata<dyn Trait>). This PR does not add this, but it could be added fairly easily by adding a CoerceUnsized impl for DynMetadata and allowing DynMetadata where TypedMetadata is checked for here.


FAQ

Isn't TypedMetadata<T> just a type alias for <T as Pointee>::Metadata?

Yes and no. The main problem at hand is that <T as Pointee>::Metadata no longer remembers what T is once the projection has been resolved. TypedMetadata<T> then is a "strongly typed alias" which reintroduces T and provides the ability to implement CoerceUnsized, which cannot be directly implemented on a resolved <T as Pointee>::Metadata.

There is a potential alternative -- instead of getting impl CoerceUnsized to work for TypedMetadata<T> as a builtin and then using the existing struct behavior for impl CoerceUnsized, we could teach coercion (and impl CoerceUnsized) to recognize the <T as Pointee>::Metadata projection and directly coerce that (as is currently done in TypedMetadata<T>) rather than delegating to the field's CoerceUnsized implementation. (However, this may be impractical, as type projections tend to get normalized away.)

This would make TypedMetadata a pure library type, but it would still be necessary if wanting to coerce/convert <T as Pointee>::Metadata in a function body or for some concrete T, rather than as a generic struct field.

Why a coercion instead of just a conversion?

Such that SillyBox<T> can be coerced. (See also the next question.)

A conversion is fairly simple to write: just ptr::from_raw_parts a null data pointer and the metadata, then coerce that.

Why not use *mut T instead of (*mut (), <T as Pointee>::Metadata)?

The short version: because you might not always have a pointer.

There are a few reasons you might want to elide storing the actual pointer-to-T, or store something other than a pointer alongside the pointee metadata.

One such case is if you have some allocation at a statically known location; in this case it could make sense to have a StaticRef which just stores the pointee metadata and implements Deref by constructing a normal reference from the raw parts of the static location and the stored metadata. Normally such usage would necessarily also know the full concrete type and thus not need to use non-() metadata, but situations such as using linker functionality to place a slice at a (linker-known) static address could want to have StaticRef<[T]> to avoid re-deriving the linker-provided length multiple times.

Another such case is the use of something like an arena allocator, or a compacting allocator, or any storage allocation which doesn't hand out raw pointers and pin the allocated memory. Here, your ArenaBox<T>/ArenaRef<T> would store the index into the arena (and perhaps additional metadata like a generation) and the pointee metadata, and then you ask the arena to temporarily resolve the ArenaRef<T> into a normal &T.

Yet another is StackBox<T>, which stores a potentially unsized pointee in its own inline memory, allowing for the use of unsized types (such as dyn Trait) without indirection by always reserving a fixed max size/align for them on the stack.

All of these could be serviced by storing the metadata on a dangling pointer; however, this extra stored pointer exists solely to appease the compiler, and not for any actual benefit to the program.

@rustbot rustbot added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 15, 2022
@rust-highfive
Copy link
Collaborator

Some changes occured to the CTFE / Miri engine

cc @rust-lang/miri

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with r? rust-lang/libs-api @rustbot label +T-libs-api -T-libs to request review from a libs-api team reviewer. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive
Copy link
Collaborator

r? @cjgillot

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 15, 2022
@CAD97
Copy link
Contributor Author

CAD97 commented May 15, 2022

r? rust-lang/compiler @rustbot label +T-libs-api -T-libs +T-compiler

I think that's the correct tags/team

@rustbot rustbot added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 15, 2022
Copy link
Member

@compiler-errors compiler-errors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me from the perspective of typeck and middle, i cannot vouch for the const-eval and codegen parts but they don't look unreasonable either.

compiler/rustc_typeck/src/coherence/builtin.rs Outdated Show resolved Hide resolved
compiler/rustc_typeck/src/coherence/builtin.rs Outdated Show resolved Hide resolved
src/test/ui/coercion/coerce-just-metadata.rs Outdated Show resolved Hide resolved
/// assert_eq!(unsized_metadata.0, 5);
/// ```
#[cfg_attr(not(bootstrap), lang = "just_metadata")]
pub struct JustMetadata<T: ?Sized>(pub <T as Pointee>::Metadata);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not T-libs, but perhaps this could use some helper methods like from_raw_pointer(self, *const ()) -> *const T which does ptr::from_raw_parts. I don't know though.

Also, should this derive Copy and other common traits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deriving would use the wrong bounds (on T instead of <T as Pointee>::Metadata), but yes, this should probably get the other common traits. Mostly I just wanted to open the PR as soon as I saw it working.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no worries, it doesn't need to be done in this PR.

@compiler-errors compiler-errors added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label May 15, 2022
@rust-log-analyzer

This comment has been minimized.

@@ -194,21 +195,21 @@ pub fn unsize_ptr<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>>(
src_ty: Ty<'tcx>,
dst_ty: Ty<'tcx>,
old_info: Option<Bx::Value>,
) -> (Bx::Value, Bx::Value) {
) -> OperandValue<Bx::Value> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to be implemented in cg_clif too. I can do this myself once this PR is merged.

@@ -342,7 +344,7 @@ pub fn coerce_unsized_info<'tcx>(tcx: TyCtxt<'tcx>, impl_did: DefId) -> CoerceUn
let param_env = tcx.param_env(impl_did);
assert!(!source.has_escaping_bound_vars());

let err_info = CoerceUnsizedInfo { custom_kind: None };
let err_info = CoerceUnsizedKind::Ptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to return something like None (i.e. make this function return an Option or Result? I think CoerceUnsizedInfo { custom_kind: None } was used here as both like CoerceUnsizedKind::Ptr and the "uhh i don't know" coercion kind.

Either that, or we should make a CoerceUnsizedKind::Error, though I'm not a fan of that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like it seems the code below is returning err_info on the error path (e.g. when we try to coerce two ADTs with different def_ids).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, this PR definitely doesn't make the situation worse, since the same amount of information is produced. It seems at least somewhat consistent to "treat this as if a builtin pointer cast" (for the purpose of continuing reporting errors) if the coercion is bad (since we won't get to codegen anyway).

It's also worth noting where this is actually used, which seems to always be for side effects or go through custom_coerce_unsize_info in practice, and CoerceUnsizedKind::Ptr is only used as an error in find_vtable_types_for_unsizing anyway

@rust-log-analyzer

This comment has been minimized.

@CAD97
Copy link
Contributor Author

CAD97 commented May 17, 2022

This error doesn't occur on x86_64 windows so I don't know how to go about debugging it. It was caused by 751e5e960f1388963e9e080fd12b6c014daa5ef1 deduplicating the unsize logic between operating on pointer ScalarPair and JustMetadata Scalar.

@RalfJung
Copy link
Member

Can you run a 32bit Windows target on a 64bit Windows host? That is probably enough to trigger the problem, if it is the usual pointer size mismatch kind of issue.

@michaelwoerister
Copy link
Member

Thanks for the PR, @CAD97. Is there some prior discussion about this somewhere? Is this something the lang-team needs to sign off on?

@CAD97
Copy link
Contributor Author

CAD97 commented May 17, 2022

T-lang should probably sign off on this direction being the chosen solution (rather than e.g. making <T as Pointee>::Metadata coercible directly), yeah.

Some recent discussion with @compiler-errors here (https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/unsizing.20pointee.20metadata); I know this has been generally desired (and @compiler-errors had an experiment at the same) but don't have any references at the moment, unfortunately.

I should probably write down the issues I had trying to implement CoerceUnsized<<U as Pointee>::Metadata> for ThinMetadata<T>... my partial broken tree for that approach is CAD97@882a5d2 and includes incoherent impl crimes because U isn't covered in that impl.

@rust-log-analyzer

This comment has been minimized.

@CAD97
Copy link
Contributor Author

CAD97 commented Jul 2, 2022

Rebased, squashed1, and renamed JustMetadata to TypedMetadata (agree, it's a better name).

Footnotes

  1. only having the single diff to resolve conflicts on is significantly easier, since the history previously involved some change-and-change-back.

@CAD97
Copy link
Contributor Author

CAD97 commented Jul 2, 2022

if it's possible to apply an unsizing coercion to a pointer-to-custom-pointee-kind, it would have to work on just the metadata. I can't think of any but the absolutely most contrived example where this could be an issue

Realistic example:

Consider a type Indyn<T> which is (T::Metadata, T), and &Indyn<T> is always thin. Essentially, the heap representation of ThinBox<T>. The above restriction prevents a coercion from &Indyn<dyn Trait> to &dyn Trait via CoerceUnsized; a conversion via auto[de]ref &*(_: &Indyn<dyn Trait>) is fine.

@bjorn3
Copy link
Member

bjorn3 commented Jul 3, 2022

The above restriction prevents a coercion from &Indyn to &dyn Trait via CoerceUnsized

That won't work anyway due to T::Metadata being at the start, right? &Indyn<dyn Trait> coercion to &dyn Trait would take the same pointer that points to (T::Metadata, T) and use it as pointer part of &dyn Trait, while it would need to be offset by size_of::<T::Metadata>.

@CAD97
Copy link
Contributor Author

CAD97 commented Jul 3, 2022

That coercion could work if Indyn<dyn Trait>: Trait. It still runs into the other problems with pointee-sized DSTs rather than meta-sized DSTs (see rust-lang/lang-team#166), but coercing &Indyn<dyn Trait> -> &dyn Trait could work by using a vtable which is implemented as roughly method(ptr, args...) = (*(ptr as *const trait_vtable_t)).method(ptr.offset_bytes(sizeof::<trait_vtable_t>()), args...)`.

Writing that out: this coercion itself actually does only need the (Indyn) type and its pointee metadata (()), and is not verboten by TypedMetadata coercions. I return to the position that any coercions that are address/pointee-sensitive are custom coercions to custom DSTs (such as avoiding the recursive vtable lookup by dynamically creating a vtable).

@joshtriplett
Copy link
Member

I'm still not clear on why we want this to be a coercion. Is there a use case or snippet of code that shows why that would be needed?

@CAD97 CAD97 changed the title Implement pointee metadata unsizing via a JustMetadata<T> container Implement pointee metadata unsizing via a TypedMetadata<T> container Jul 5, 2022
@CAD97
Copy link
Contributor Author

CAD97 commented Jul 5, 2022

Here is an actual pseudo-worked example of a polymorphically typed generational arena. Details have been elided to better focus on the point of TypedMetadata.

type Memory = [MaybeUninit<u8>];
struct Arena {
    memory: Vec<(Generation, Vec<u8>)>,
    /* elided: synchronization, ref-arena matching */
}

#[derive(Copy, Clone)]
type Generation = usize;

#[derive(Copy, Clone)]
struct GenerationalIndex {
    index: usize,
    generation: Generation,
}

#[derive(Copy, Clone)]
struct ArenaRef<T: ?Sized> {
    index: GenerationalIndex,
    meta: TypedMetadata<T>,
    /* elided: ref-arena matching */
}

impl Arena {
    fn index(&self, ix: GenerationalIndex) -> &Memory { /* elided */ }
    fn resolve<T>(&self, ix: ArenaRef<T>) -> &T {
        assert!(self.contains(ix));
        let memory = self.index(ix.index);
        unsafe { &*ptr::from_raw_parts(memory.as_ptr().cast(), ix.meta) }
    }
    
    /* elided: insertion, mutable access, other utilities */
}

trait Entity {
    /* elided: generic behavior on any entity */
}

impl Entity for Player {}
struct Player {
    /* elided */
}

impl Player {
    fn attack(&self, ctx: &Arena, other: ArenaRef<dyn Entity>) { /* elided */ }
}

fn pvp(ctx: &Arena, player_1: ArenaRef<Player>, player_2: ArenaRef<Player>) {
    ctx.resolve(player_1).attack(ctx, player_2);
    ctx.resolve(player_2).attack(ctx, player_1);
}

If you would like a more fully worked example, I can make a running example using explicit unsizing instead.

@bors
Copy link
Contributor

bors commented Jul 6, 2022

☔ The latest upstream changes (presumably #98206) made this pull request unmergeable. Please resolve the merge conflicts.

@joshtriplett
Copy link
Member

@CAD97 I'm not trying to be obtuse, but I think I'm still failing to follow how this relates to why we need to require coercion (as opposed to, for instance, an invoked operation).

@CAD97
Copy link
Contributor Author

CAD97 commented Jul 12, 2022

If we're okay with ArenaRef<impl Entity> not being coercible to ArenaRef<dyn Entity> and requiring an explicit conversion, then not having the coercion is fine.

But unsizing for custom user types is desirable enough to have a conversion polyfill and CoerceUnsized and Unsize have a tracking issue rather than be rustc-internal perma-unstable issue = "none". It would be very unfortunate for ArenaRef having to choose between storing *mut T (and getting unsizing coercion for free) or being more space efficient, storing <T as Pointee>::Metadata, and losing the unsizing coercion support.

The ultimate thing that makes this an actual requirement though is that I want Box to (be able to) be struct <T: ?Sized, S: Storage> { handle: S::Handle, meta: TypedMetadata<T>, storage: S }, and something has to have an unsizing coercion to avoid removing Box's. Perhaps that could be on some struct RawBox <T: ?Sized, S: Storage> (S::Handle, <T as Pointee>::Metadata) instead1, but then the impl will be the exact same on RawBox as currently on TypedMetadata, but more annoying because of the extra data/parameter.

So yes, the reasoning is the kind of circular "we need to be able to coerce metadata not on a pointer because (I believe) we want to be able to coerce metadata not on a pointer". This is essentially saying it's possible to implement *mut T as (*mut (), TypedMetadata<T>).2

But it's worth making explicit: while this PR is motivated by my Storage work, it does not exist solely for the Storage API; it exists to support code doing the things Storage seeks to generalize over (e.g. ArenaRef above). These custom pointer-like types (e.g. StackBox) using alternatives to traditional pointers imho deserve to be as convenient to use as true pointer types.

Footnotes

  1. And this type may be required to exist to be the MIR primitive instead of Box... but that still might be too generic; a primitive containing user defined data is generally problematic. Which, honestly... might also be a real problem to solve for custom pointee metadata since that can be bigger than usize and make *mut T not fit in a ScalarPair 😓

  2. Far-far-future we may even do this to provide a guaranteed memory layout for fat pointers.

@RalfJung
Copy link
Member

RalfJung commented Jul 12, 2022

Rebased, squashed, and renamed JustMetadata to TypedMetadata (agree, it's a better name).

FWIW I kind of liked JustMetadata better.^^ TypedMetadata is odd, everything is typed in Rust.
What is entirely missing from the name is the fact that this is the metadata of a pointer of pointee type T. We have a ptr::metadata function; a ptr::Metadata type might also make sense, doesn't it?

The funny part is that it's basically just an alias for Pointee<T>::Metadata, but newtyped. So PointeeMetadata might be a better option?

@CAD97
Copy link
Contributor Author

CAD97 commented Jul 12, 2022

We have a ptr::metadata function; a ptr::Metadata type might also make sense, doesn't it?

The one thing I worry about is that ptr::Metadata<T> or PointeeMetadata<T> potentially make it even less clear what the difference to <T as Pointee>::Metadata is, and might suggest it's just a transparent type alias. The point of the TypedMetadata<T> is that it attaches the T type back to the metadata, which could be the metadata of any number of pointee types.

Perhaps if ptr::metadata returned ptr::Metadata instead of <T as Pointee>::Metadata? Perhaps that would be a desirable change anyway once the retyping wrapper is available; the wrapped metadata type is still publicly available and we could even provide Deref delegation.

I opened an IRLO thread for 🚲🏚️.

@apiraino
Copy link
Contributor

apiraino commented Aug 4, 2022

Based on I-lang-nomination see this comment, I think this is waiting on some discussion to happen so I'll tentatively add S-blocked to signal that. Please remove the label when the reviewing process can move forward, thanks!

@rustbot label S-blocked

@rustbot rustbot added the S-blocked Status: Blocked on something else such as an RFC or other implementation work. label Aug 4, 2022
@pnkfelix
Copy link
Member

(This is specifically blocked on me reading over the proposal and the comment thread to better understand the problem being solved and the merits of the proposed solution.)

@CAD97
Copy link
Contributor Author

CAD97 commented Sep 19, 2022

I submitted a libs-api ACP about the ptr::Metadata<T> (previously in this thread called TypedMetadata<T>, JustMetadata<T>):

The intent there is to cover the libs portion of what embracing this type would look like. Using this type as the type returned by ptr::metadata(pointer) rather than the projected associated type imho improves the API in addition to unlocking the ability to provide coercions on it.

@pnkfelix pnkfelix removed the I-lang-nominated Nominated for discussion during a lang team meeting. label Sep 20, 2022
@jackh726 jackh726 removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 10, 2023
@Dylan-DPC Dylan-DPC added the S-waiting-on-ACP Status: PR has an ACP and is waiting for the ACP to complete. label May 13, 2023
@CAD97
Copy link
Contributor Author

CAD97 commented Nov 6, 2023

I'm closing this for now as stale. I still hold that a ptr::Metadata and ptr::MetadataKind split is a better API design than the current, and that ptr::Metadata<T> should get the same coercions that *const T gets, but it's not happening via this PR.

@CAD97 CAD97 closed this Nov 6, 2023
@apiraino apiraino removed S-blocked Status: Blocked on something else such as an RFC or other implementation work. S-waiting-on-ACP Status: PR has an ACP and is waiting for the ACP to complete. labels Nov 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.