Extern types v2 #3396
extern {
    type Foo;
}
Is the extern type Foo
shorthand still allowed under this proposal or do all extern types have to be inside an extern block?
Similarly, how do extern blocks interact with ABI tags? For example, if I have an extern "C"
block, is a type declaration inside it invalid, or just a warning? I would say that adding an ABI to these should be a deny-by-default lint that states that the tag is unused, with the potential for them having some meaning in the future.
Note that extern { type Blah; }
will currently be reformatted by rustfmt to extern "C" { type Blah; }
. So this question should probably have t-style weigh in.
I believe we've guaranteed the default is "C"
anyway, so it's not clear to me that this makes sense.
This question gets at one of the niggles I have with this proposal. The ABI doesn't really mean anything as we're using extern to mean something slightly different to what it normally means. I think this is a point in favour of repr(unsized)
but extern makes so much sense in that it heavily suggests what this feature is useful for.
When it comes to bikeshedding, what about something like the following?
#[repr(opaque)]
struct Foo;
I personally don't like the use of type
and I think opaque
captures the meaning better than either extern
or unsized
. I also think it just makes more sense as a repr
, and it would go nicely with specifying alignment (eg. repr(opaque, align(16))
) if we allow that, which I think we should (though that could be added later if necessary).
What's your reasoning for preferring opaque
over unsized
? I'm thinking of the common header use case where you may want:
#[repr(C, unsized)]
struct Header {
field1: u8,
field2: usize,
}
These are described as opaque types throughout the RFC, which I think describes them well. The thing that makes them unique isn't that you don't know their size, but that you don't know (nor care about) their composition. This is in contrast to types like [T]
, str
, and dyn Trait
, which are unsized, but not opaque.
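To make the distinction concrete, here is a minimal sketch in stable Rust. The helper names `unsized_sizes` and `wide_pointer_sizes` are illustrative only and do not come from the RFC. For each of these unsized types the pointer metadata is enough to recover a size at run time, which is what an opaque type would not offer.

```rust
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

/// Run-time sizes of a slice, a `str`, and a trait object.
/// Hypothetical helper used only for illustration.
fn unsized_sizes() -> (usize, usize, usize) {
    let slice: &[u16] = &[1, 2, 3]; // metadata: element count
    let text: &str = "héllo"; // metadata: byte length (é takes two bytes)
    let object: &dyn Debug = &7u64; // metadata: vtable pointer
    (size_of_val(slice), size_of_val(text), size_of_val(object))
}

/// References to these types are "wide": a data pointer plus metadata.
fn wide_pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&[u16]>(),
        size_of::<&str>(),
        size_of::<&dyn Debug>(),
    )
}
```

An opaque type in the sense of the RFC would have a thin pointer and no way to answer the `size_of_val` question at all.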
Regarding the example you gave, if I didn't already know what it meant, I would find it very confusing. I wouldn't understand what unsized
meant here, especially considering that the struct
would appear to be composed entirely of sized types. It would be a lot less confusing, though, if the struct
had no fields.
To create a type with a non-opaque header followed by opaque data, I would imagine creating two separate types: one for the opaque data itself, and a separate type for the opaque data with a header (though that would seem to require repr(align)
on the opaque type). Using the word opaque
in this context (where the opaque type has no fields) would clearly indicate not only that we don't know the size, but that we don't know (nor care about) the composition of the type. Here's how I would imagine doing that:
#[repr(opaque, align(4))]
struct Opaque;
#[repr(C)]
struct OpaqueWithHeader {
field1: u8,
field2: usize,
opaque: Opaque,
}
That said, I'm not opposed to either extern type T
or the repr(unsized)
syntax. I just wanted to express my view that repr(opaque)
might be a better fit. However, regardless of the syntax used, I would discourage the idea of allowing fields within the opaque type itself, as that would seem quite confusing to me.
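Neither `repr(opaque)` nor `extern type` exists on stable Rust, so the closest thing that compiles today is the zero-sized marker pattern commonly used for FFI handles. The sketch below reuses the names `Opaque` and `OpaqueWithHeader` from the comment above purely for illustration. It is an approximation, not the proposed feature: here `size_of::<Opaque>()` reports 0 instead of being rejected, which is one of the gaps the RFC aims to close.

```rust
use std::marker::{PhantomData, PhantomPinned};
use std::mem::{align_of, size_of};

/// Stand-in for an opaque type with a declared alignment of 4.
/// Assumption: a zero-sized, non-constructible-by-users marker approximates
/// the opaque data; the real contents live on the other side of the FFI.
#[repr(C, align(4))]
pub struct Opaque {
    _data: [u8; 0],
    // Opts out of Send, Sync and Unpin, as is usual for foreign handles.
    _marker: PhantomData<(*mut u8, PhantomPinned)>,
}

/// Header with known fields followed by the opaque tail.
#[repr(C)]
pub struct OpaqueWithHeader {
    pub field1: u8,
    pub field2: usize,
    pub opaque: Opaque,
}

/// Reports (size of the marker, alignment of the marker, size of the header type).
pub fn layout_summary() -> (usize, usize, usize) {
    (
        size_of::<Opaque>(),
        align_of::<Opaque>(),
        size_of::<OpaqueWithHeader>(),
    )
}
```

Because the tail is zero-sized, the size of `OpaqueWithHeader` only covers the header, so code must never rely on it to allocate or copy a full value.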
One issue I see with the repr
attribute is that it looks too much like a unit struct declaration. If we're using a struct declaration, I wonder if we could have native syntax for saying there are unknown fields (maybe with the repr
too)
#[repr(opaque)]
struct Opaque {
..
}
Though, this has been proposed as syntax for #[non_exhaustive]
previously. I think it fits opaqueness better since the fields are unknown even locally, whereas #[non_exhaustive]
only affects metadata and the visible struct declaration is what it is known to be locally.
There's one place where extern "C" { type X; }
and extern "Rust" { type X; }
could differ: the former obviously should be FFI-safe, but the latter doesn't necessarily have to be. If and only if it's not, then it's presumably semver-compatible to replace it with a standard struct
in the future.
When I see "opaque (type)", I think of RPIT (return-position impl trait) types, and soon TAIT, RPITIT, etc.
This would provide a mechanism for allowing users to implement types with entirely custom metadata and therefore custom implementations of `size_of_val` and `align_of_val`.
This would likely mean that users could implement types that acted like extern types without using the `extern type` syntax.
This should not be an issue as `extern type` communicates the intent of these types well, and guarantees FFI-safety.
One other valuable extension would be to make MaybeUninit
accept ?Sized + MetaSized
types.
MaybeUninit currently takes a sized T, why would this enable relaxing that bound? T: ?Sized
currently means the same as I'm proposing T: ?Sized + MetaSized
will mean, so I don't see how this helps.
Oh, I misread the proposal and assumed that ?Sized
would opt out of MetaSized
too.
Without a notion of MetaSized
, MaybeUninit
has to be sized, since there's no good way to determine the layout of an unsized value. Adding MetaSized
means we can now do this with just metadata, making it valid again.
Sorry, to be clear, the proposal is that ?Sized
will opt out of MetaSized
after the edition change. So far I've assumed that, because all types in current Rust are MetaSized
, that introducing this won't help with issues like this. It's possible that's not true and this is an exception.
How would you create a MaybeUninit<T: ?Sized + MetaSized>
, where would you get the metadata from? Can it be uninit? Aren't those the same issues you'd face if you opened an RFC to make MaybeUninit
take ?Sized
today?
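For comparison, this is how uninitialised unsized storage is usually expressed on stable Rust today: the wrapper goes inside the slice, so the metadata (the length) is fully initialised and only the elements are not. This is a sketch of the existing idiom, not of anything the RFC proposes, and the function names are illustrative.

```rust
use std::mem::{size_of_val, MaybeUninit};

/// Allocates a boxed slice of `len` uninitialised `u32` slots.
/// The length is known up front; only the element contents are uninit.
fn uninit_slice(len: usize) -> Box<[MaybeUninit<u32>]> {
    (0..len).map(|_| MaybeUninit::uninit()).collect()
}

/// The size is computable from the metadata even before initialisation.
fn uninit_bytes(len: usize) -> usize {
    size_of_val(&*uninit_slice(len))
}

/// Fills every slot and then reads the values back out.
fn fill(len: usize) -> Vec<u32> {
    let mut buf = uninit_slice(len);
    for (i, slot) in buf.iter_mut().enumerate() {
        slot.write(i as u32);
    }
    // SAFETY: every element was written in the loop above.
    buf.iter().map(|slot| unsafe { (*slot).assume_init() }).collect()
}
```

A hypothetical `MaybeUninit<[u32]>` would have to answer the same question the comment raises: the length has to come from somewhere before the storage can be laid out.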
* statically - the size/alignment is known by the Rust compiler at compile time. This is the current `Sized` trait.
  Most types in Rust are statically sized and aligned, like `u32`, `String`, `&[u8]`.
In a way, this is the same as knowing from metadata -- the metadata is just ()
.
(I think that this way of wording is still clearer than omitting it, but figured it's worth pointing out somehow.)
Yes, I possibly should clarify that all statically sized things are metadata sized and all metadata sized things are dynamically sized (and similarly for alignment). Note there is still a meaningful distinction between these things: you can put as many statically sized and aligned things in a struct as you like, but only one metadata-aligned thing.
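The "only one" restriction is already visible in stable Rust, where a struct may have any number of sized fields but at most one unsized field, and it must come last. A minimal sketch, with a hypothetical `Packet` type that is not part of the RFC:

```rust
use std::mem::{size_of, size_of_val};

/// Any number of sized fields, followed by at most one unsized tail.
#[repr(C)]
struct Packet<T: ?Sized> {
    id: u32,
    flags: u8,
    payload: T, // only the final field may be unsized
}

/// Returns (size via metadata, size known statically) for the same value.
fn packet_sizes() -> (usize, usize) {
    let sized = Packet { id: 1, flags: 0, payload: [0u8; 5] };
    // Unsizing coercion: the length 5 moves into the pointer metadata.
    let unsized_view: &Packet<[u8]> = &sized;
    (size_of_val(unsized_view), size_of::<Packet<[u8; 5]>>())
}
```

With `repr(C)` the fields sit at offsets 0, 4 and 5, giving 10 bytes that are rounded up to the struct alignment of 4, so both numbers agree.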
Yes, I think so, unless there are plans to add types that are Personally, I think it would be better to leave the meaning of fn foo<T>();
fn foo<T: Sized + MetaSized>(); as are those: fn foo<T: ?Sized>();
fn foo<T: ?Sized + MetaSized>(); In other words, Note that this is not a breaking change and requires no edition migration. |
FWIW, lang has historically said "no more So I like the edition switch approach here: the conceptual meaning of |
In the 2024 edition and later, `T: ?Sized` no longer implies any knowledge of the size and alignment so opaque types can be used in generic contexts.
If you require your generic type to have a computable size and alignment you can use the bound `T: ?Sized + MetaSized`, which will enable you to store the type in a struct.

The automated tooling for migrating from the 2021 edition to the 2024 edition will replace `?Sized` bounds with `?Sized + MetaSized` bounds.
The tooling should only do the replacement if MetaSized
is required, or a lot of declarations will get a lot noisier.
I feel it would be useful to have an idea of how much of an impact this will have on the ecosystem -- how common will it be to need a MetaSized
bound.
Doing the review of the standard library mentioned later would be a start.
> The tooling should only do the replacement if MetaSized is required, or a lot of declarations will get a lot noisier.
How do you propose deciding which case is which? Note that (usually) relaxing the bound is non-breaking but adding it in later is breaking so a human decision is often required for what promises the API wants to make.
This requires `MetaSized` because the size must be known at run time.

```rust
pub struct Box<T: ?Sized + MetaSized, A: Allocator = Global>(Unique<T>, A);
```
Without implied bounds, this will require the bound be mentioned everywhere Box
(or Arc
etc) are mentioned and can take a non-Sized
parameter, including in traits.
pub trait Trait {
fn foo(self: Box<Self>) where Self: MetaSized;
fn bar(self: Arc<Self>) where Self: MetaSized;
}
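For reference, this is what such receivers look like on stable Rust today, where no extra bound is needed because every type has a size computable from metadata. The `Job` trait and `Hello` type are hypothetical names for this sketch; under the RFC as written, each method would presumably need the `where Self: MetaSized` clause shown above.

```rust
use std::sync::Arc;

/// A trait with smart-pointer receivers, usable through `dyn Job`.
trait Job {
    fn run(self: Box<Self>) -> String;
    fn name_len(self: Arc<Self>) -> usize;
}

struct Hello(String);

impl Job for Hello {
    fn run(self: Box<Self>) -> String {
        // Move the value out of the box, then take its only field.
        let Hello(text) = *self;
        text
    }

    fn name_len(self: Arc<Self>) -> usize {
        self.0.len()
    }
}

/// Dropping the box needs the size and alignment of the erased type,
/// which is recovered from the vtable at run time.
fn run_dyn(job: Box<dyn Job>) -> String {
    job.run()
}

fn len_dyn(job: Arc<dyn Job>) -> usize {
    job.name_len()
}
```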
Interesting but probably bad idea: MetaSized
as a default bound for all traits
, unless they opt out.
It's worth noting that since receivers are special, we could add special rules to make a Box<Self>
receiver imply Self: MetaSized
in traits. On the other hand, we'd probably prefer to make receivers less special.
Implied bounds could solve this if they are ever implemented…
This is really sad. It feels like a blocker that can't be easily skipped. How reasonable would it be to make receivers special until implied bounds are implemented? Or is the only option here to go and implement implied bounds?
I just had a thought: don't implied bounds for self types arguably already exist, given that you can write this:
trait Trait {
fn foo(self);
}
You don't have to specify where Self: Sized
on that. Granted, you can't implement it for unsized types but that feels like the correct behaviour.
well, technically:
trait Trait {
fn foo(self);
}
doesn't actually imply where Self: Sized
, just that you need unstable features to implement it without where Self: Sized
. for a good example, see FnOnce
which can be used as Box<dyn FnOnce(...)>
to call fn call_once(self, ...)
with Self = dyn FnOnce(...)
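The `FnOnce` case can be tried on stable Rust. In this sketch (function names are illustrative) the closure is consumed by value even though its concrete type has been erased behind `dyn FnOnce`:

```rust
/// Calls a boxed, type-erased closure exactly once.
/// The call consumes the closure, i.e. `call_once(self, ...)` runs with
/// `Self = dyn FnOnce(u32) -> String`.
fn call_boxed(f: Box<dyn FnOnce(u32) -> String>) -> String {
    f(7)
}

/// Builds a closure that owns a `String` and hands it to `call_boxed`.
fn demo_owned() -> String {
    let owned = String::from("value: ");
    call_boxed(Box::new(move |n| format!("{}{}", owned, n)))
}
```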
Since the general consensus on the syntax is... very mixed, it might be worth focusing on Having a stable I just don't want to see more progress on refining |
@Aloso I've added a note to the alternatives section about
That's an option, although I'm tempted to leave everything as is for now and keep that option in my back pocket for if the extern types syntax is the only thing blocking this. Adding MetaSized without any types that can be |
text/3396-extern-types-v2.md
Outdated
- Should users be able to slap a `#[repr(align(n))]` attribute onto opaque types to give them an alignment?
  This would allow us to represent `CStr` properly but would necessitate splitting `MetaSized` and `MetaAligned` as they would not be "metadata sized" in general.
  (We may be able to get away with the [Aligned trait](https://github.com/rust-lang/rfcs/pull/3319))
I've long been told that "extern type is needed to fix CStr and CString", so if this whole thing happens but doesn't definitely fix those types as part of it, then it would be extremely disappointing.
Skip round-trip tests for structs with FAMs

For example:

```
struct inotify_event {
    int wd;          /* Watch descriptor */
    uint32_t mask;   /* Mask describing event */
    uint32_t cookie; /* Unique cookie associating related events (for rename(2)) */
    uint32_t len;    /* Size of name field */
    char name[];     /* Optional null-terminated name */
};
```

The `name` field at the end of this struct is a Flexible Array Member (FAM - https://en.wikipedia.org/wiki/Flexible_array_member) and represents an inline array of _some_ length (in this case the `len` field gives the length, but FAMs in general are unsized).

These forms cause problems in two ways:

1. There's no Rust equivalent for FAM-containing structs (see rust-lang/rfcs#3396 for details) - the current approach is just to omit the `name` field in the Rust representation.
2. These structs cannot be passed by value (basically the compiler cannot know how many elements of the `name` field should be copied) and different compilers do different things when asked to do so.

This PR focusses on the second of these problems since it was causing test failures on my machine.

If you run the libc-test suite with GCC as your C compiler you get a compiler note saying the following:

```
/out/main.c: In function ‘__test_roundtrip_inotify_event’:
/out/main.c:21011:13: note: the ABI of passing struct with a flexible array member has changed in GCC 4.4
```

and the test suite passes.

OTOH if you build with clang as your compiler you get no compile time warnings/errors but the test suite fails:

```
size of struct inotify_event is 16 in C and 185207048 in Rust
error: test failed, to rerun pass `--test main`

Caused by:
  process didn't exit successfully: `/deps/main-e32ea4d2acb868af` (signal: 11, SIGSEGV: invalid memory reference)
```

(Note that 185207048 is `0x0B0A0908`, i.e. the bytes `08 09 0A 0B` read in little-endian order, which is what the round-tripped value is initialized with by the test suite, so clearly the Rust and C code are disagreeing about the calling convention/register layout when passing such a struct.)

Given that there doesn't seem to be a specification for passing these objects by value, this PR simply omits them from the roundtrip tests and the test suite now passes on GCC _and_ clang without warnings.
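As an illustration of the "omit the `name` field" approach described in this commit message, here is a minimal sketch of a header-only Rust mirror of `inotify_event`, plus a safe helper that splits a byte buffer into the header and the trailing name bytes. The names `InotifyEventHeader` and `split_event` are hypothetical and are not the definitions used by the libc crate.

```rust
use std::mem::size_of;

/// Fixed-size prefix of C's `struct inotify_event`; the flexible array
/// member `name` is deliberately left out.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct InotifyEventHeader {
    pub wd: i32,
    pub mask: u32,
    pub cookie: u32,
    pub len: u32,
}

/// Splits `buf` into the header and the `len` name bytes that follow it.
/// Returns `None` if the buffer is too short for either part.
pub fn split_event(buf: &[u8]) -> Option<(InotifyEventHeader, &[u8])> {
    let header_len = size_of::<InotifyEventHeader>();
    if buf.len() < header_len {
        return None;
    }
    // Reads the i-th native-endian 32-bit word of the header.
    let word = |i: usize| [buf[4 * i], buf[4 * i + 1], buf[4 * i + 2], buf[4 * i + 3]];
    let header = InotifyEventHeader {
        wd: i32::from_ne_bytes(word(0)),
        mask: u32::from_ne_bytes(word(1)),
        cookie: u32::from_ne_bytes(word(2)),
        len: u32::from_ne_bytes(word(3)),
    };
    let end = header_len.checked_add(header.len as usize)?;
    let name = buf.get(header_len..end)?;
    Some((header, name))
}
```

The header is 16 bytes, matching the size the C side reports in the failing test output; the variable-length tail is only ever reached through a borrowed slice, never passed by value.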
## Extern types
[extern-types]: #extern-types

Extern types are defined as above, they are thin DSTs, that is their metadata is `()`. They cannot ever exist except behind a pointer, and so attempts to dereference them fail at compile time similar to trait objects.
I assume &*x
is allowed with such types. This means that dereferencing (using the *
operator) is actually fine, it's only the place-to-value coercion which is forbidden. That matches trait objects (and slices).
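The same distinction can be observed today with slices and trait objects. A small sketch (function names are illustrative): applying `*` and immediately re-borrowing is accepted, while copying the place out into a local is rejected.

```rust
use std::fmt::Debug;

/// `*x` is only a place expression here, so no size is needed.
fn reborrow_slice(x: &[u8]) -> &[u8] {
    &*x
}

/// The same holds for trait objects.
fn reborrow_dyn(x: &dyn Debug) -> &dyn Debug {
    &*x
}

// By contrast, this does not compile, because moving the value out of the
// place requires a statically known size:
//
// fn copy_out(x: &[u8]) {
//     let v = *x; // error[E0277]: the size for values of type `[u8]` cannot be known
// }
```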
Yes, exactly. I hadn't considered that distinction, I'll try to clarify that later.
  This would allow us to represent `CStr` properly but would necessitate splitting `MetaSized` and `MetaAligned` as it is only "dynamically sized" but "statically aligned".
  (We may be able to get away with the [Aligned trait](https://github.com/rust-lang/rfcs/pull/3319))
- Should the `extern type` syntax exist, or should there just be a `repr(unsized)`?
  This would allow headers with opaque tails (which are very common in C code) but is a more significant departure from the original RFC, and looks more like custom DSTs.
Why do you need repr(unsized)
for this? Couldn't you just allow the last field of a type to be ?MetaSized
? (Perhaps requiring the struct to be repr(C)
?)
Imagine we have a setup like the following:
#[repr(C)]
struct Foo {
a: u32,
b: u8,
foo: String,
}
#[repr(C)]
struct Bar {
a: u32,
b: u8,
bar: u8,
}
extern type OpaqueTail;
#[repr(C)]
struct HeaderWithTail {
a: u32,
b: u8,
tail: OpaqueTail,
}
#[repr(C, unsized)]
struct HeaderUnsized {
a: u32,
b: u8,
}
What alignment does OpaqueTail have? &header.tail doesn't really have a well-defined meaning, as it's either pointing at bar directly (assuming an alignment of one) or at some padding bytes before foo starts. However, you can freely convert between a &Foo (or &Bar) and a &HeaderUnsized without worrying about that.
This alignment issue is why this RFC doesn't propose allowing "dynamically aligned" or "unknown aligned" (?MetaSized
) types as fields of structs (except as the only non-ZST field of a #[repr(transparent)]
struct).
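The offset ambiguity can be measured directly with the two sized structs from the example. The sketch below copies `Foo` and `Bar` from the comment and adds a hypothetical helper `tail_offsets`; the concrete numbers assume a target where `String` has an alignment of 4 or 8 bytes, which covers common 32-bit and 64-bit platforms.

```rust
#[repr(C)]
struct Foo {
    a: u32,
    b: u8,
    foo: String,
}

#[repr(C)]
struct Bar {
    a: u32,
    b: u8,
    bar: u8,
}

/// Returns the byte offsets of the third field in `Foo` and in `Bar`.
fn tail_offsets() -> (usize, usize) {
    let foo = Foo { a: 0, b: 0, foo: String::new() };
    let bar = Bar { a: 0, b: 0, bar: 0 };
    let foo_offset = (&foo.foo as *const String as usize) - (&foo as *const Foo as usize);
    let bar_offset = (&bar.bar as *const u8 as usize) - (&bar as *const Bar as usize);
    (foo_offset, bar_offset)
}
```

Both structs share the same two leading fields, yet the tail starts at a different offset in each, so a tail of unknown alignment has no single address that would be right for both.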
Hmm, I suppose you could forbid referencing the field entirely…
One potential middle ground would be to add a stable I don't know that I'd call that a 90% solution, but it's maybe a 60% solution. (I'd rather not punt syntax out if possible, but if it's a matter of landing an edition change in time, I'd rather the bounds change without the syntax than nothing.) |
What is the reasoning that optional traits aren't allowed in supertraits? I feel like a builtin Also, is this understanding in this table correct? It might be good to include some kind of table or diagram in the RFC, I've found it kind of hard to wrap my head around just the text.
An owning pointer use will generally want Interestingly, with implied bounds,
Because Traits don't have this default implied bound of I didn't actually review the RFC for this, but if it doesn't yet, it should say that traits have a default
@tgross35 I agree with what CAD97's said, I'm confused by your table and don't understand what you're trying to convey.
I'm beginning to worry that this RFC will require implied bounds to be acceptable, can you link to the known pitfalls with implied bounds?
I hadn't considered this actually, I'll update the RFC soon. |
For anyone following along there was a lang team design meeting about this yesterday, the minutes are available here: Extern types V2 - HackMD. There's also some Zulip discussion here: #t-lang > Design meeting 2023-08-09 - rust-lang - Zulip That meeting came out with some suggestions for investigations that would help move this RFC along, the current set of things I think need doing are:
Unless I'm missing something that could entirely prevent this kind of thing from working? fn size_of_val<T: ?Sized + MetaSized>(val: &T) -> usize { ... }
fn foo<T>(val: &T) -> usize {
size_of_val(val)
} |
I didn't put it together very well, but what I was going for was:
Based on your comment it sounds like item 1 is implied (by
Thanks for the explanation. I know the discussion of trait aliases came up before, it just seems unfortunate that two traits
(This is mostly a brain dump so I don't forget it) boats' post Changing the rules of Rust has made me realise that my plan of relaxing the meaning of Worse however is traits that contain functions that are generic over their argument because implementations of those traits could rely on
run over the standard library rustdoc json output, I believe the only stdlib trait this affects is |
# Drawbacks
[drawbacks]: #drawbacks
Another drawback is that this RFC rejects a type that is used widely inside rustc:
extern "C" {
type OpaqueListContents;
}
pub struct List<T> {
len: usize,
data: [T; 0],
opaque: OpaqueListContents,
}
This must definitely be mentioned in the RFC. Ideally there is some kind of proposal for what to do with this type...
A simple suggestion: allow structs with extern tails, but don't allow computing the offset of (or taking a reference of) the extern tail. This is sufficient for the rustc impl, which uses the data
field rather than the opaque
field.
A less simple suggestion is a way to specify a known alignment to use for an extern type; this could either be via repr(align = N)
or unsafe impl marker::Aligned { const ALIGN: usize = N; }
.
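To show why the `data` field alone is enough to locate the elements, here is a minimal sketch in stable Rust. It is not rustc's actual `List` definition: the name `ListHeader` and the helper `data_offset` are hypothetical, and the opaque tail is simply omitted.

```rust
use std::mem::size_of;

/// Header of a length-prefixed list; elements would follow in memory.
/// The zero-length array contributes `T`'s alignment but no size.
#[repr(C)]
pub struct ListHeader<T> {
    len: usize,
    data: [T; 0],
}

/// Byte offset from the start of the header to where the elements begin.
pub fn data_offset<T>() -> usize {
    let header = ListHeader::<T> { len: 0, data: [] };
    let base = &header as *const ListHeader<T> as usize;
    let data = header.data.as_ptr() as usize;
    data - base
}

/// Size of the header alone, without any trailing elements.
pub fn header_size<T>() -> usize {
    size_of::<ListHeader<T>>()
}
```

The offset of `data` is fully determined by `usize` and `T`, so nothing ever needs to ask for the offset or alignment of an opaque field placed after it.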
The lack of the `Sized` and `MetaSized` traits on these structs prevent you from calling `ptr::read`, `mem::size_of_val`, etc, which are not meaningful for opaque types.

In the 2021 edition and earlier, these types cannot be used in generic contexts as `T: Sized` and `T: ?Sized` both imply that `T` has a computable size and alignment.
What if a new-edition trait has an associated type with ?Sized
bound? If now old-edition code is generic over that trait, Trait::T
would be a not-meta-sized type! IOW, old-edition code might actually have not-meta-sized types in a generic context.
Similarly, what if a new-edition trait has a generic function with T: ?Sized
bound. If now old-edition code implements this with a T: ?Sized
bound, this should raise an error since in the old edition, T: ?Sized
actually means T: ?Sized+MetaSized
and thus the implementation is less generic than the trait requires. This means having generic functions with ?Sized
bounds in a new-edition trait makes the trait impossible to implement in old editions.
(Both of these examples came out of boats's recent exploration into adding a Leak
trait.)
I got curious and ended up writing a partial implementation of
I cannot parse your syntax here and have no idea what you are talking about (not even parts of it), could you explain in more detail? |
It's easiest to explain by example: // Assuming some trait associated type, e.g.
trait Deref {
type Target: ?MetaSized;
// ...
}
// when declaring the function
fn f<P: Deref>()
// this needs to be treated as having
where
<P as Deref>::Target: MetaSized,
{
// because the code might have previously relied on that bound, e.g.
_ = size_of_val::<P::Target>;
}
// This application of implicit bound needs to be applied recursively;
// consider e.g.
trait Strategy {
type Pointer: Deref
where
<Self::Pointer as Deref>::Target: ?MetaSized,
// NB: bound would be implied here without unbound
;
}
// Then with a function defined as
fn g<S: Strategy>() {
// we need both S::Pointer: MetaSized
_ = size_of_val::<S::Pointer>;
// and S::Pointer::Target: MetaSized
_ = size_of_val::<<S::Pointer as Deref>::Target>;
} It is technically an option to say that relaxing an associated type to unbound This actually makes me realize one more edge case to consider: When to not apply the implicit bound is a bit more involved. If a generic binder context (params and where clause) introduces a trait obligation The extent of needing to apply Pessimistic takeaway: extern type support in the current edition is going to feel awkward no matter what we do, in order to forbid Making Footnotes
You could gate the default method bodies behind the implied bound without gating the trait itself, as |
There could be use in defining extern types with known alignment, like so: extern {
#[repr(align = 8)]
type Foo;
} This type can be the last field of a struct, but it’s not |
@GoldsteinE unless I misunderstand what Would also be nice to be able to specify "the same alignment as |
@WaffleLapkin Sure, it’s also
I think that with |
@nikomatsakis has written https://smallcultfollowing.com/babysteps/blog/2024/04/23/dynsized-unsized/ which suggests an alternative to relaxing This suggestion should definitely go into the alternatives section at the very least. The way I see it, the pros are:
The cons are:
Hmm, I overlooked that, I think we would want to use an edition for this, or possibly we could just get away with changing the default to |
Triaging: labeling as waiting on author due to
But of course there is also the other open RFC that has a strong overlap with this (and was opened after it): |
Define types external to Rust and introduce syntax to declare them.
This additionally introduces the MetaSized trait to allow these types to be interacted with in a generic context.
This supersedes #1861, and is related to the open RFCs #2984 and #3319.
Rendered