Tracking Issue for the experimental crabi ABI #111423

Open
4 tasks
joshtriplett opened this issue May 10, 2023 · 14 comments
Labels
A-ABI Area: Concerning the application binary interface (ABI) C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@joshtriplett
Member

This is a tracking issue for the experimental crabi ABI; see #105586 and rust-lang/compiler-team#631.
The feature gate for the issue is #![feature(crabi)].

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Unresolved Questions

  • Niches: should we support cases like Option<bool> without a separate
    discriminant, or should we (for simplicity) always pass a separate
    discriminant? Likely the latter. However, what about things like Option<&T>
    and Option<NonZeroU32>, for which Rust guarantees the representation of
    None? Those work with the C ABI, and they have to work with crABI, but can
    we make them work with crABI using the same encoding of None?
  • What subset of lifetimes can, and should, we support? We can't enforce them
    cross-language, but they may be useful as an advisory/documentation
    mechanism. Or we could leave them out entirely.
  • To what extent should crABI make any attempt to specify things that can't
    be enforced, rather than ignoring semantics entirely and only specifying
    how types get passed?
  • How can we make it easy to support data structures without having to do
    translation from repr(Rust) to repr(crabi) and have parallel structures?
    Can we make that less painful to express, and ideally mostly free at runtime?
    • Related: how can we handle tuples? Do we need a way to express
      repr(crabi) tuples? How can we do that conveniently?
  • Should we provide support for extensible enums, such that we don't assume the
    discriminant matches one of the known variants? Would doing so make using
    enums less ergonomic? Could we address that with language changes?
  • For handling objects, could we avoid having to pass in-memory function
    pointers via a vtable, and instead reference specific symbols? This wouldn't
    work for generics, though. Can we do any better than a vtable?
  • For ranges, should we provide a concrete range type or types, or should we
    defer that and handle ranges as opaque objects or traits?
  • Do we get any value out of supporting (), other than completeness? Passing
    () by value should just be ignored as if it weren't specified. Do we want
    people using pointers to (), and do those have any advantage over pointers
    to void?
  • Should we do anything special about i128 and u128, or should we just push
    for getting those supported correctly in extern "C"?
  • For generics, such as Option<u64> or Result<u32, ConcreteError> or
    [u8; 16], does the rule "all generic parameters must be bound to concrete
    types in the function signature" suffice, or do we need a more complex rule
    than that?
  • Unwinding: The default extern "crabi" should not support unwind, and most
    languages don't tend to have support for unwinding through C-ABI functions,
    but should we have a crabi-unwind variant? Would doing so provide value?
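
To make the niche question above concrete, here is a minimal sketch of the two encodings being weighed. `OptionU64` is a hypothetical name, and `repr(C)` stands in for `repr(crabi)`, which does not exist yet. The size checks cover only cases where Rust documents the representation of None.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical "separate discriminant" encoding of Option<u64>.
/// repr(C) is an assumption standing in for a future repr(crabi).
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct OptionU64 {
    pub is_some: bool,
    /// Only meaningful when `is_some` is true.
    pub value: u64,
}

impl From<Option<u64>> for OptionU64 {
    fn from(o: Option<u64>) -> Self {
        match o {
            Some(value) => OptionU64 { is_some: true, value },
            None => OptionU64 { is_some: false, value: 0 },
        }
    }
}

impl From<OptionU64> for Option<u64> {
    fn from(o: OptionU64) -> Self {
        if o.is_some { Some(o.value) } else { None }
    }
}

/// Sizes of two niche-optimized options whose None encoding Rust guarantees.
pub fn niche_sizes() -> (usize, usize) {
    (size_of::<Option<&u8>>(), size_of::<Option<NonZeroU32>>())
}
```

The explicit struct costs an extra field but needs no knowledge of niches on the foreign side; the niche forms stay pointer-sized and `u32`-sized respectively.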

Implementation history

@joshtriplett joshtriplett added T-lang Relevant to the language team, which will review and decide on the PR/issue. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels May 10, 2023
@workingjubilee
Member

During discussion, we arrived at the conclusion that the notion of a "C ABI" is somewhat of an existential question, because the C Standard does not define a C ABI, and currently our extern "C" does not even support all Standard C types, like long double and _Complex. And the C language can add new types, of course (in fact it did in C23: a new, non-optional type that all compilers will now be expected to support in order to claim they support C23). Thus saying "superset of a C ABI" is slightly dubious. It may prove beneficial for this experiment to nail down what we mean, anyway, when we say "C ABI", extern "C", and so on, because there is a definable meaning that we work with in practice.

Until then, it is another unanswered question to address.

@joshtriplett
Member Author

joshtriplett commented May 11, 2023

@workingjubilee I already modified the proposal to just say "The crABI support for Rust will be a strict superset of the C ABI support for Rust."; that avoids implying support for things like long double or u128 or _Complex that we don't currently support.

@workingjubilee workingjubilee added the A-ABI Area: Concerning the application binary interface (ABI) label May 15, 2023
@LilyIsTrans

I think at the very least Option<&T> should be able to represent None as a null pointer. If this ABI will support dynamic linking to something like libc for more modern languages (and if supporting that use case is not already a goal, I think it should be), the performance hit of the extra data for the discriminant in something as simple as a pointer is simply not acceptable, as it would, much like libc, be used by many programs on nearly every system.

@workingjubilee
Member

@LilyIsTrans That is already a guaranteed repr of Rust for sized types: https://doc.rust-lang.org/std/option/index.html#representation

@LilyIsTrans

> @LilyIsTrans That is already a guaranteed repr of Rust for sized types: https://doc.rust-lang.org/std/option/index.html#representation

Yes, but it is not clear to me that this implies it will necessarily be represented like that in crabi, which is an FFI and therefore presumably not necessarily subject to the normal rules of repr(Rust). My point is that if I call a dynamically linked extern "crabi" function, Option<&T> should still be guaranteed to have the same representation as &T would in that function's interface.
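
For reference, this is how the guarantee already looks with today's extern "C"; whether extern "crabi" keeps it is exactly the open question. A minimal sketch, with hypothetical function names:

```rust
/// An extern "C" function taking Option<&i32>.
/// For sized T, Rust documents that Option<&T> has the layout of a
/// nullable pointer, so a C caller could pass NULL to mean None.
pub extern "C" fn read_or_default(value: Option<&i32>, default: i32) -> i32 {
    match value {
        Some(v) => *v,
        None => default,
    }
}

/// Shows the raw pointer a foreign caller would see for a given option.
pub fn as_raw(value: Option<&i32>) -> *const i32 {
    // Sound under the documented Option representation: both types have
    // the same size, and None is the null pointer.
    unsafe { std::mem::transmute::<Option<&i32>, *const i32>(value) }
}
```

Here `as_raw(None)` is the null pointer and `as_raw(Some(&x))` is the address of `x`, with no separate discriminant anywhere.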

@SturdyFool10

As for lifetimes, I think we can encode the certainties into them: if a function drops a variable during execution, then it consumes that parameter, and that fact can be used to check the calling code semantically at compile time. If the behavior is unspecified, we could try to “mod” the existing code with our copy of the function and check whether it works during static analysis. By temporarily loading a “ground truth” version of the libraries for code-checking purposes, it can just be assumed that this is the copy of the library the program will always have around anyway.

@Eclipse32767

"Do we get any value out of supporting (), other than completeness? Passing
() by value should just be ignored as if it weren't specified. Do we want
people using pointers to (), and do those have any advantage over pointers
to void?"

Here's my take on this: () should be supported in the strict sense of a no-op, but should ABSOLUTELY NOT actually get carried over a language barrier. C++ and probably several other languages act with the assumption that the size of a type can never be zero, and so Unit would release absolute armageddon upon them.

Unit pointers are handy simply because C is a hellscape that requires pointers to "what is this again?". Unless the need for void pointers can be fixed, it would not be a good idea to eliminate unit pointers. Unit pointers should essentially be treated as a more rusty void pointer.
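
A minimal sketch of a unit pointer used as an opaque, void-pointer-like handle. The helper names are hypothetical; note that today's FFI code usually spells this `*const std::ffi::c_void` rather than `*const ()`.

```rust
use std::mem::size_of;

/// Erase the pointee type, the way C code passes a void pointer around.
pub fn erase(value: &i32) -> *const () {
    value as *const i32 as *const ()
}

/// Recover the value behind an erased pointer.
///
/// Safety: `ptr` must come from `erase` and the pointee must still be alive.
pub unsafe fn recover(ptr: *const ()) -> i32 {
    unsafe { *ptr.cast::<i32>() }
}

/// () itself occupies no storage, which is why passing it by value
/// across a language boundary carries no data.
pub fn unit_size() -> usize {
    size_of::<()>()
}
```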

@Scripter17
Contributor

> C++ and probably several other languages act with the assumption that the size of a type can never be zero, and so Unit would release absolute armageddon upon them.

You can just tell them to get good like they tell everyone who wants memory safety. "Skill issue", etc.

Regardless, there are more ZSTs than just unit; struct MyCustomError; seems like a pretty important thing to preserve. Any inter-language bans on ZSTs would be a pretty big issue. Even just unit would be an issue because, for reasons unknown, the url crate has several instances of pub fn ...(...) -> Result<(), ()>.

I know I'm not a relevant voice in this discussion but it does need to be mentioned that ZSTs aren't something you can ban without consequence like !.

Actually ! is just fundamentally conceptually incompatible with most languages so that probably does need an inter-language ban. (Maybe some kinda unsafe system for CrABI that marks ! as unsafe???)

Actually unsafe types would handle ZSTs pretty well. Is that on the table or am I just wildly off the mark? I hope that's on the table because if done well it's a perfect solution.

@Eclipse32767

Eclipse32767 commented May 13, 2024

> You can just tell them to get good like they tell everyone who wants memory safety. "Skill issue", etc.
>
> Regardless, there are more ZSTs than just unit; struct MyCustomError; seems like a pretty important thing to preserve. Any inter-language bans on ZSTs would be a pretty big issue. Even just unit would be an issue because, for reasons unknown, the url crate has several instances of pub fn ...(...) -> Result<(), ()>.
>
> I know I'm not a relevant voice in this discussion but it does need to be mentioned that ZSTs aren't something you can ban without consequence like !.
>
> Actually ! is just fundamentally conceptually incompatible with most languages so that probably does need an inter-language ban. (Maybe some kinda unsafe system for CrABI that marks ! as unsafe???)
>
> Actually unsafe types would handle ZSTs pretty well. Is that on the table or am I just wildly off the mark? I hope that's on the table because if done well it's a perfect solution.

Oh wow, I was unaware that the url crate uses Result<(), ()>. ! needs a ban, but ZSTs are ground where we must tread carefully.

Also, don't worry about not being a relevant voice; this is my first major contribution to talks on issues like this.

My suggestion was meant as a more extreme approach, to raise the issue that ZSTs will cause havoc if shipped over a language border.

Addendum: crabi could also do what C++ does and represent ZSTs as empty chars.

@adsouza

adsouza commented May 13, 2024

For those of us not in the know, would somebody mind expanding the abbreviation ZST?

@Alcaro
Contributor

Alcaro commented May 13, 2024

Zero-sized type. In Rust, most or all types with only one valid value have size zero, like an empty struct, empty tuple, or an enum with a single member and no data.
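
A small sketch of the three cases just listed. `OnlyOne` is a hypothetical name; `MyCustomError` is borrowed from the earlier comment. Only the size of `()` is something I am sure Rust documents as zero; treat the other two as what current rustc produces for the default representation rather than as a guarantee.

```rust
use std::mem::size_of;

/// Empty struct, as in the `struct MyCustomError;` example above.
pub struct MyCustomError;

/// Enum with a single member and no data.
pub enum OnlyOne {
    Single,
}

/// Sizes of an empty struct, the empty tuple, and a one-variant enum.
pub fn zst_sizes() -> [usize; 3] {
    [
        size_of::<MyCustomError>(),
        size_of::<()>(),
        size_of::<OnlyOne>(),
    ]
}
```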

@Eclipse32767

> Actually unsafe types would handle ZSTs pretty well. Is that on the table or am I just wildly off the mark? I hope that's on the table because if done well it's a perfect solution.

The more I think about it, the more sense it makes to have a way to assert that a symbol is only for Rust -> Rust usage, or rather an “if you’re using this symbol in other languages, you’re on your own” marker. This way we don’t have to throw out things like ZSTs and !; instead we move them into a group where using them outside Rust -> Rust is a bad idea, but not completely forbidden.

@Eclipse32767

> What subset of lifetimes can, and should, we support? We can't enforce them cross-language, but they may be useful as an advisory/documentation mechanism. Or we could leave them out entirely.

I think lifetimes as a documentation mechanism could be very useful, as they provide a reasonably ergonomic way of communicating pointer lifetime expectations and the relations between them.
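
As an illustration of lifetimes acting as documentation, here is a minimal sketch using today's extern "C". `Pair` and `larger` are hypothetical names. Rust callers get the borrow checked; a foreign caller would only see the relationship as a documented expectation.

```rust
/// A plain repr(C) struct shared across the boundary.
#[repr(C)]
pub struct Pair {
    pub first: i32,
    pub second: i32,
}

/// The lifetime 'a records that the returned pointer borrows from `pair`
/// and must not be used after `pair` is gone.
pub extern "C" fn larger<'a>(pair: &'a Pair) -> &'a i32 {
    if pair.first >= pair.second {
        &pair.first
    } else {
        &pair.second
    }
}
```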

> Niches: should we support cases like Option<bool> without a separate discriminant, or should we (for simplicity) always pass a separate discriminant? Likely the latter. However, what about things like Option<&T> and Option<NonZeroU32>, for which Rust guarantees the representation of None? Those work with the C ABI, and they have to work with crABI, but can we make them work with crABI using the same encoding of None?

I think keeping the NPO (null pointer optimization) is a victimless crime, as languages with nullable pointers (gross, I know) can then digest Option<&T> like a normal pointer.
As for cases like Option<bool>, I think a separate discriminant is ok.

> Related: how can we handle tuples? Do we need a way to express repr(crabi) tuples? How can we do that conveniently?

I think the cleanest way to lower tuples into an FFI-friendly form is to turn (i32, String) into struct MyGoofyTuple {0: i32, 1: String}
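
A minimal sketch of that lowering in valid Rust today. It is illustrative only: `TupleI32U64` and the `f0`/`f1` field names are hypothetical (numeric field names are not allowed in a braced struct definition), `repr(C)` stands in for `repr(crabi)`, and `u64` replaces `String` because `String` has no stable cross-language layout.

```rust
/// Hypothetical FFI-friendly lowering of the tuple (i32, u64).
/// repr(C) is an assumption standing in for a future repr(crabi).
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct TupleI32U64 {
    pub f0: i32,
    pub f1: u64,
}

impl From<(i32, u64)> for TupleI32U64 {
    fn from(t: (i32, u64)) -> Self {
        TupleI32U64 { f0: t.0, f1: t.1 }
    }
}

impl From<TupleI32U64> for (i32, u64) {
    fn from(s: TupleI32U64) -> Self {
        (s.f0, s.f1)
    }
}
```

The conversions are field-by-field moves, but they are exactly the parallel-structure boilerplate the unresolved question hopes to avoid.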

> Should we provide support for extensible enums, such that we don't assume the discriminant matches one of the known variants? Would doing so make using enums less ergonomic? Could we address that with language changes?

I think the best way to do this is for crabi to have some way to communicate an enum being #[non_exhaustive], but assume exhaustiveness if not specified.
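
One way this could look on the receiving side, as a sketch under assumptions: `Shape` and `decode_shape` are hypothetical, and the raw `u32` discriminant is my choice for illustration, not something crABI specifies. The point is that decoding never assumes the incoming value is a known variant.

```rust
/// Variants known to this side of the boundary. #[non_exhaustive] tells
/// other Rust crates that more variants may appear later.
#[non_exhaustive]
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Shape {
    Circle = 0,
    Square = 1,
}

/// Decode a raw discriminant received from the other side.
/// Unknown values are returned as Err instead of being treated as valid.
pub fn decode_shape(raw: u32) -> Result<Shape, u32> {
    match raw {
        0 => Ok(Shape::Circle),
        1 => Ok(Shape::Square),
        other => Err(other),
    }
}
```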

> For ranges, should we provide a concrete range type or types, or should we defer that and handle ranges as opaque objects or traits?

A concrete range type would be nice, but not at all necessary, as they can probably be opaque without too much pain.

> Should we do anything special about i128 and u128, or should we just push for getting those supported correctly in extern "C"?

I think the easiest way to handle i128 and u128 is to just get them working in extern "C". This way, they should work for everyone, even outside crabi.
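
For contrast, this is the kind of workaround that direct support would make unnecessary: passing a 128-bit value as two 64-bit halves. A minimal sketch; `U128Parts`, `split`, and `join` are hypothetical names, not part of any proposal in this thread.

```rust
/// A u128 carried across the boundary as two u64 halves.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct U128Parts {
    pub lo: u64,
    pub hi: u64,
}

/// Split a u128 into its low and high 64 bits.
pub fn split(v: u128) -> U128Parts {
    U128Parts {
        lo: v as u64, // truncating cast keeps the low 64 bits
        hi: (v >> 64) as u64,
    }
}

/// Reassemble the original value from its halves.
pub fn join(p: U128Parts) -> u128 {
    ((p.hi as u128) << 64) | (p.lo as u128)
}
```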

> For generics, such as Option<u64> or Result<u32, ConcreteError> or [u8; 16], does the rule "all generic parameters must be bound to concrete types in the function signature" suffice, or do we need a more complex rule than that?

I don't see a problem with all generic parameters needing to be bound to concrete types, but I might be missing some nuance here.
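
A sketch of what the rule means in practice, using today's extern "C" and hypothetical names: the generic helper stays on the Rust side, and only a wrapper with every type parameter fixed is exported.

```rust
/// Generic helper; it has no single ABI, so it cannot be exported as-is.
fn sum_all<T: Copy + std::iter::Sum<T>>(items: &[T]) -> T {
    items.iter().copied().sum()
}

/// Exported entry point with T bound to the concrete type u64.
///
/// Safety: if `len` is non-zero, `ptr` must point to `len` initialized,
/// readable u64 values.
pub unsafe extern "C" fn sum_u64(ptr: *const u64, len: usize) -> u64 {
    if ptr.is_null() || len == 0 {
        return 0;
    }
    let items = unsafe { std::slice::from_raw_parts(ptr, len) };
    sum_all(items)
}
```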

@phdye

phdye commented Jun 15, 2024

Worrying about how other languages will interoperate with crabi seems to significantly reduce the usefulness of crabi as a Rust-to-Rust ABI for compiled modules.

How about focusing on representing Rust elements as fully as feasible, with each release having a detailed interop feature set that an implementor can opt into piecemeal rather than having to support everything out of the gate? Obviously such support would need to be exposed programmatically.

This should permit early adopters to implement what they need leaving additional features to be implemented as needed down the road.


9 participants