-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PhantomData fields in repr(C) structs change ABI on aarch64 #56877
Comments
Here is the non-optimized llvm-ir:
For reference, the llvm-ir for the same source code, for x86_64:
The notable difference is i64 vs. double. Note that the above is the output of |
It's interesting to note that x86-64 takes the two floats as a single double, while aarch64 takes two floats, but the main problem here is that |
BTW, this also happens on aarch64-unknown-linux-gnu, and I've now confirmed that the IR before the first optimization pass is the same. |
This happens because homogeneous aggregates are treated differently in the PCS compared to inhomogeneous aggregates. It depends how zero sized types are defined to behave in |
Yes, or at least that's the assumption that all of bindgen / cbindgen / the improper_ctypes lint make. |
Nominating: this is likely to be a very high priority for FF over the next month or so, and if we can get a fix in now that will be a big win. |
There's nothing platform-dependent in that code, how come this only affects aarch64? Edit: Oh, right #56877 (comment) |
discussed at T-compiler meeting. P-high. assigning to self to make sure it doesn't get lost. (@nikomatsakis says "maybe we're close to a fix", which I assume is based on the comment thread here). |
FWIW, cbindgen explicitly ignores PhantomData. Other ZST are not ignored. However, that seems like a bug in cbindgen: mozilla/cbindgen#262 |
If the struct has actual padding, then it won't be a homogeneous aggregate. Therefore, the only risk here is for structs whose alignment is increased by a ZST, but only up to the size of the type itself. I don't think there's a hazard in regarding these as homogeneous aggregates. For example, #[repr(C)]
struct Foo {
a: f32,
b: f32,
c: [f64; 0]
}
No. |
Unfortunately I don't think that's the case. In C
is not a homogeneous aggregate. I think we have to have a special case just for |
Or some other kind of emptyness-tracking that treats |
Discussed at the T-compiler. It feels as if this needs a strategy of some sort before any implementation work can proceed. |
Looking over the homogeneous-aggregate code, it already checks the following conditions:
It seems like we could basically keep all of these conditions, but filter the field list to those with non-zero size, and everything would be fine. In particular, the concerns that @arielb1 raised here regarding alignment are being checked I believe. So roughly speaking the definition of a "homogeneous aggregate" would be something like:
If all conditions are met, the type is a "homogeneous aggregate". |
Does that sound about right? This seems like it wouldn't be too hard to implement. |
This does not handle the C case with an empty array struct X {
float a;
float b;
int x []; // or int x[0]
}; We need some way of defining empty arrays as "homogeneous aggregate breaking" while defining |
I see, I missed that subtlety. It comes down what the "filter" is that we apply to the types -- I specified "non-zero-size", but it could easily be "exclude empty structs". Or, perhaps, exclude anything that is zero-sized except for arrays whose length is non-zero (or whose element types are not zero-sized). It seems to come down to whether we want a "whitelist" or a "blacklist" here. It occurs to me that there is one other concern: It is possible to define zero-sized structs in C with suitable compiler extensions (see this section of the nascent unsafe code guidelines for examples). Can anyone validate whether a struct like |
It seems clear though that if you have a We have some freedom to do otherwise for structs and types that are not |
I used struct Foo {};
struct Baz {
float x;
float y;
struct Foo b;
}; |
Can someone who knows the code look at it? I think C compilers specifically have a problem with zero-length arrays and structs, but not with positive-length arrays or structs. More examples: // still an homogeneous aggregate, despite having a non-0-sized struct and float.
struct Foo2 { float t; };
struct Baz2 {
struct Foo2 x;
float y[1];
};
float sizeof_baz2(struct Baz2 b) {
return b.y[0];
} |
OK, so let me try to open up this branch as a PR and we'll take it from there. I have to figure out the testing, most of all. |
The filter at the high level is "all fields that are removed from the rust struct declaration when you convert it to a C struct declaration", e.g. The
To be clear, there are three potential problematic array fields in C to deal with
The latter I don't think can even be passed by value in rust currently, e.g.,
The first two look to be handled differently on x86 😟 https://godbolt.org/z/o0z52H (although my x86 assembly isn't very good so maybe someone else can verify). Something I found interesting in https://github.com/rust-rfcs/unsafe-code-guidelines/blob/9c9840297ca47d3085876cec7b59bb92d8554591/reference/src/layout/structs-and-tuples.md#function-call-abi-compatibility
This seems like the kind of filter we want here too, is there code for that somewhere we can reuse? Alternatively that got me thinking, can the use case of adding
do
then make |
Actually, it looks like ABI for flexible array members for x86 changed in GCC 4.4 but clang uses the old behavior, am I interpreting that right? Does rust claim that flexible array members are representable in rust as zero sized arrays, if so there might be an issue, if not it's a potential foot-gun if someone assumes so. |
By the way, I’ve found just today that struct banana {
int peach[0];
} and struct banana {
int peach[];
} are not equivalent. Namely the following code is UB with the first structure but fine with the second, suggesting that only the second structure is variable-length... int foo(struct banana x) {
return x.peach[0];
} EDIT: I’m surprised I’ve found it only today as codebase at my $dayjob is littered with the former variant for VL aggregates... |
This doesn't sound right. If a
I don't think we have another way to declare it, really, so presently the answer is yes. I presume here that "flexible array member" means |
@nagisa Edit: That is, if reading OOB of a zero or one sized array in tail position results in a miscompile, that's usually a compiler bug, even though it's technically UB. Your particular example is a bit odd because the struct is passed by value, and the usual guarantees for the struct hack patterns probably don't apply there. |
Some notes:
I am wondering if we can agree on a pragmatic compromise to get us going forward, and leave some amount of work for later. To some extent, I think my PR represents a decent shot at such a compromise, since it seems to match clang (maybe?) and older gcc, and it allows one to represent flexible arrays using zero-length arrays. It's not perfect if the C code uses zero-length arrays in the newer sense (not the older, flexible sense), though you can add a "dummy" member after the ZLA (e.g., A more conservative rule might be to just filter out zero-sized, repr(Rust) structs (which would certainly cover phantomdata). This would not match the C behavior for zero-sized C structs or arrays, but those are quite unusual. On the other hand, it feels kind of strictly worse than my current PR, because it is basically just wrong more often (put another way, my PR does match C behavior for zero-sized structs). At least, this is how I understand it now. |
(I should probably try to draw up some concrete examples to make my points here) |
Yes, but what I meant in the sentence following that we don't need to worry about it in the ABI computation because code like that would be ill formed regardless of the function call ABI, i.e., their memory layouts wouldn't be compatible. To be clear, what I gather from the disassembly here is
are treated the same on:
but differently on:
IMO rust should match the second one (
I think this is the way to go. As I understand it, it would match the C behaviour in those cases, no? |
If I understand you, you are arguing that we should not consider
However, that leaves me in a bit of a quandary. In particular, what set of ZST are we going to exclude when performing the aggregate test? At minimum, that set should include I'm not sure what
but it would not match:
Given that GCC changed behavior here, I think matching the new behavior makes sense. And obviously matching AArch64 is our real goal here (I'm not sure how important this test is for x86_64).
I'm not entirely sure what you meant by this. My "conservative" proposal is just to filter out repr-rust things, but in that case we would NOT match the C behavior, since a struct with a zero-sized repr-C thing would not be considered a homogeneous aggregate. This seems like an ok intermediate step but obviously not the final goal, which ought to be compatibility with C compilers. |
I'd like to land something ASAP. I'd be happy to modify my PR to one of two things:
Either fixes the immediate problem. The latter seems to me to be more compatible with C-Rust interop. The question boils down to what you consider the Rust type If we think about compatibility with newer GCC 4.4, then I think the two options work out like this:
The first line corresponds to this issue. However, as @parched points out, the final line may be a moot point, since such types ought not to be passed by value. (Note: I wonder if we want something like |
After some discussion on Zulip, I am currently leaning towards "just filter rust types" as a simple, "conservative" step. I think likely we will want to make further steps here eventually but this should solve the immediate problem and seems uncontroversial. |
I agree that filtering all ZSTs seems better. In particular, because:
But doing it first for rust zsts and then for all zsts sounds fine too. |
I've updated the PR to that approach, FYI. |
Discussed some with @eddyb on Discord (privmsg). They too feel that we should just screen out all ZSTs. They expressed a preference to modify the "homogeneous aggregate" test -- instead of "screening out fields", we would change the return value for zero-size types so that it conveys that there is no data there. Getting this return value would not invalidate homogeneity. I can try to mock that up. |
(my point was that the right intuition, IMO, is that "no leaf fields" should not be an error condition - it so happens that ZST and "no leaf fields" are equivalent, but I prefer modelling the latter in the return type, e.g. |
visiting for triage. Seems like things are progressing smoothly here. |
I'm not sure this correct, at least for AArch64 it's
Note I added an extra line to cover @emilio's use case of when before |
…ates, r=eddyb distinguish "no data" from "heterogeneous" in ABI Ignore zero-sized types when computing whether something is a homogeneous aggregate, except be careful of VLA. cc rust-lang#56877 r? @arielb1 cc @eddyb
@parched I see, I misinterpreted your slot regarding gcc somehow. You wrote that
It seem strange to me that this would vary per platform, but I guess I can imagine this being something that is part of a platform's ABI. That's certainly a pain. =) Right now, we at least treat the "homogeneous aggregate" test as being independent of the platform. Hmm. Well that's a wrench in the works. :) I still think we should land my PR, even though it's going to behave differently on AArch64 for such structs (because it seems strictly better than the status quo), but I guess we may want to do a follow-up. |
Take this testcase:
Compile with:
And check the output
test.s
file:One would expect
foo
andbar
having the same code, but that's not the case here.This is the root cause of https://bugzilla.mozilla.org/show_bug.cgi?id=1512519
The text was updated successfully, but these errors were encountered: