Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove @setAlignStack, replacing it with function incoming stack alignment attributes #21209

Closed
mlugg opened this issue Aug 26, 2024 · 12 comments
Labels
accepted This proposal is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@mlugg
Copy link
Member

mlugg commented Aug 26, 2024

Background

The @setAlignStack builtin is weird. It doesn't provide any meaningful guarantees: all it does is enforce a stack alignment in the function's prologue. There is no guarantee that any local variable is aligned to this boundary, nor that calls to other functions preserve the stack alignment.

It does have one theoretical use case, which is correcting the stack alignment when a function will be called with a lower stack alignment than it expects. For instance, the Linux kernel on x86_64 compiles with -mpreferred-stack-boundary=3, which overrides the default 16-byte stack alignment on calls mandated by the System V ABI to a lower 8-byte stack alignment. This means that when compiling Zig code to use in a Linux kernel module, the functions are called with a lower stack alignment than necessary. This isn't catastrophic, since the only x86_64 instructions which rely on alignment are for things like SSE which are disabled in kernel code, but LLVM could still theoretically use this incorrect assumption of stack alignment, triggering a miscompilation.

Today, @setAlignStack does work for this use case with the LLVM backend: however, this is kind of an implementation detail, and self-hosted backends don't do it.

We could redefine @setAlignStack to be used for this purpose. However, it's a bit weird for this to be a builtin at that point, and its purpose becomes unintuitive.

Proposal

Remove the @setAlignStack builtin. Add a new syntax to functions, in or after callconv, to override their incoming stack alignment, i.e. the alignment callers will enforce before a call instruction.

Since we're effectively overriding part of the calling convention here, I think it would make sense for this information to be embedded in the callconv(...) syntax. For instance, perhaps callconv(.c : .{ .incoming_stack_align = 8 }), where the init expression there is for a std.builtin.CallingConventionOverrides:

/// This data structure is used by the Zig language code generation and
/// therefore must be kept in sync with the compiler implementation.
pub const CallingConventionOverrides = struct {
    /// The alignment of the stack before this function is called.
    incoming_stack_align: usize,
};

This would allow Zig functions which will be called by e.g. the Linux kernel to specify their incoming stack alignment as 8 bytes. This would instruct code generation to re-align the stack as needed in the function prologue.

Using a struct here opens the door to adding more ways to customise your calling convention in the future. Taking this to an extreme, one could imagine a system where std.builtin.CallingConvention becomes a struct detailing every part of the calling convention, but that's probably not desirable due to the complexity of many callconvs.

If we want to avoid the callconv(foo : bar) syntax (which is arguably a little ugly), we could make CallingConvention a struct with a "base" callconv and override fields, and make use of #9938 to make today's uses work:

/// This data structure is used by the Zig language code generation and
/// therefore must be kept in sync with the compiler implementation.
pub const CallingConvention = struct {
    base: Base,
    /// Overrides the alignment of the stack before this function is called.
    incoming_stack_align: ?usize = null,

    pub const Base = enum(u8) {
        unspecified,
        c,
        // ...
    };

    pub const unspecified: CallingConvention = .{ .base = .unspecified };
    pub const c: CallingConvention = .{ .base = .c };
    // ...
};

I'm indifferent as to what we go with: I think callconv(.c : .{ .incoming_stack_align = 8 }) would be fine, but I can certainly see the appeal of unifying this into one struct.

@mlugg mlugg added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. use case Describes a real use case that is difficult or impossible, but does not propose a solution. labels Aug 26, 2024
@mlugg mlugg added this to the 0.15.0 milestone Aug 26, 2024
@ziglang ziglang deleted a comment Aug 26, 2024
@BratishkaErik
Copy link
Contributor

If I understood correctly this proposal also has a nice bonus of exposing this alignment information to reflection via @typeInfo right? Unlike @setAlignStack which is not extractable now.

@190n
Copy link
Contributor

190n commented Aug 26, 2024

I like this.

This would allow Zig functions which will be called by e.g. the Linux kernel to specify their incoming stack alignment as 8 bytes. This would instruct code generation to re-align the stack as needed in the function prologue.

Should functions in safe builds assert that the stack pointer is correctly aligned upon entry? This would help catch issues like the one you described calling Zig functions in the kernel, if the user forgot to specify that the stack pointer is expected to be under-aligned. Anecdotally, I had a similar issue where I didn't align the stack correctly while calling Zig from some inline assembly, and it was very hard to debug as the only segfault was deep inside some std.fmt code that was doing 16-byte-aligned loads or stores on the stack.

My hesitations with that safety check are:

  • Performance. But it could maybe be avoided in a lot of cases. Functions which are not exported or whose address is never taken probably can omit this check.
  • Describing the necessary alignment. From the caller's side -- i.e., what should the alignment be in the instant before executing a call -- this is easy. But from the callee's side it is complicated on x86_64, as after the caller performs a call with a 16-byte-aligned stack, the callee will start running with the stack offset by 8 bytes since the CPU stored the return address. So the callee's stack pointer must be 8 less than a multiple of 16. So the callee would need to know an alignment and offset in order to generate a proper safety check for what the stack pointer should be, not just an alignment.

@mlugg
Copy link
Member Author

mlugg commented Aug 26, 2024

A safety check would probably be a nice idea. As you say, it could be omitted on all internal functions. A simple rule is that we include the check if a function is export, or we take its address and it has a callconv other than Unspecified. That sounds pretty trivial performance-wise.

Regarding describing the necessary alignment, this should be fine -- this proposal requires Zig to be aware of the details of stack alignment anyway. Implementation-wise, this safety check would probably be a single safety_check_stack_align() AIR instruction, so semantic analysis doesn't need to be aware of the alignment requirement, only backends. It's still easy to check in the offset case -- for instance, just check that rsp & 0xF == 0x8 on x86_64.

@mlugg mlugg moved this to Blockers in Linux kernel module Aug 26, 2024
@andrewrk andrewrk removed the use case Describes a real use case that is difficult or impossible, but does not propose a solution. label Aug 27, 2024
@mlugg mlugg added the accepted This proposal is planned. label Aug 27, 2024
@mlugg
Copy link
Member Author

mlugg commented Aug 27, 2024

Accepted for now is the const CallingConvention = struct { ... } approach at the bottom of the proposal text, using decl literals. Decl literals hence become a blocker for this proposal. Also, c is not a tag in Base; instead, it's defined as the right "concrete" callconv using a switch on @import("builtin").target. This will probably require adding a few more target-specific callconvs.

@jacobly0
Copy link
Member

jacobly0 commented Aug 27, 2024

You don't need a safety check on things that would only ever fail due to a compiler bug. Once you safety check the incoming stack pointer, the alignment at all call sites is guaranteed by the backend and cannot be affected by user error in a way that would not be better served by just checking that certain things don't affect the stack pointer (inline asm, calls with certain calling conventions, etc.). Also an air instruction wouldn't really work because you need to safety check the alignment before the stack is realigned to where it needs to be for the current function which happens before the first air instruction.

@mlugg
Copy link
Member Author

mlugg commented Aug 27, 2024

You don't need a safety check on things that would only ever fail due to a compiler bug.

Yes, the safety check I am proposing is precisely what you seem to be talking about there -- a safety check for the incoming stack pointer when a function might be called into from external code. Hence my proposed rule which models the situations in which external code could make such a call:

it could be omitted on all internal functions. A simple rule is that we include the check if a function is export, or if we take its address and it has a callconv other than Unspecified.

@jacobly0
Copy link
Member

jacobly0 commented Aug 27, 2024

A simple rule is that we include the check if a function is export, or if we take its address and it has a callconv other than Unspecified.

I'm not sure where Unspecified comes in here, you still want to be able to detect this case:

fn foo() callconv(.{ .base = .Unspecified, .incoming_stack_align = 32 }) void {}
const bar: *const fn foo() callconv(.{ .base = .Unspecified, .incoming_stack_align = 1 }) = @ptrCast(&foo);
export fn entry() void {
    bar();
}

That said, I'm not sure how you know whether the address of a function is taken the first time you codegen it.

@mlugg
Copy link
Member Author

mlugg commented Aug 27, 2024

you still want to be able to detect this case:

Eh, I think it's fine for that to be unchecked. It's not like we can check for this more generally -- for instance, you could also change the actual base cc with a ptrcast, and there's no way to check for safety there. The safety check here is really about ABI boundaries, because they're the thing that's very easy to get wrong.

I'm not sure how you know whether the address of a function is taken the first time you codegen it.

Um, good point! Okay, simpler rule: we just include the check for any callconv other than .Unspecified. That callconv can only be used within the ZCU, so you can't trigger this UB without a very intentional pointer cast, so it's not a big risk; but we get the safety check at any possible ABI boundary. And functions with a callconv other than Unspecified are used pretty much exclusively at ABI boundaries, so this shouldn't have any noticeable performance impact either.

@alexrp
Copy link
Member

alexrp commented Sep 2, 2024

Another good argument for CallingConvention becoming more advanced than an enum is interrupt. This CC needs to be able to take an architecture-specific value that indicates what kind of interrupt handler is being defined. That kind of makes me think that Base should be a tagged union.

@alexrp
Copy link
Member

alexrp commented Sep 2, 2024

@mlugg here's the branch I was working on: https://github.com/alexrp/zig/commits/call-convs

I don't think I have more to do at this point; further interrupt support will require the ability to attach more info to the CC as mentioned. So if you like, maybe you can base your work off this branch and I can fill in all the interrupt nonsense when the fundamentals are in place.

(Can also help dig up whatever ABI details you might need to decide on the direction for this.)

@mlugg
Copy link
Member Author

mlugg commented Sep 2, 2024

@alexrp

That kind of makes me think that Base should be a tagged union.

When discussing this change with @andrewrk and @jacobly0, I floated the idea of CallingConvention itself being a tagged union, with e.g. the incoming_stack_align override existing as a field on the payload of CCs where that makes sense (admittedly, that's probably most of them in this particular case). Perhaps that idea would be worth revisiting?

@alexrp
Copy link
Member

alexrp commented Sep 2, 2024

I can imagine a world where CallingConvention itself being a tagged union gets out of hand if we ever get more (near-)universal CC properties like stack alignment... but I can't think of any off the top of my head. So I guess my vote would be to go with CallingConvention itself as a tagged union first, and change to CallingConvention as struct + Base as tagged union if it turns out that something other than stack alignment comes up. But I don't feel super strongly about it either way.

mlugg added a commit to mlugg/zig that referenced this issue Oct 14, 2024
This commit begins implementing accepted proposal ziglang#21209 by making
`std.builtin.CallingConvention` a tagged union.

The stage1 dance here is a little convoluted. This commit introduces the
new type as `NewCallingConvention`, keeping the old `CallingConvention`
around. The compiler uses `std.builtin.NewCallingConvention`
exclusively, but when fetching the type from `std` when running the
compiler (e.g. with `getBuiltinType`), the name `CallingConvention` is
used. This allows a prior build of Zig to be used to build this commit.
The next commit will update `zig1.wasm`, and then the compiler and
standard library can be updated to completely replace
`CallingConvention` with `NewCallingConvention`.

The second half of ziglang#21209 is to remove `@setAlignStack`, which will be
implemented in another commit after updating `zig1.wasm`.
mlugg added a commit to mlugg/zig that referenced this issue Oct 14, 2024
This commit finishes implementing ziglang#21209 by removing the
`@setAlignStack` builtin in favour of `CallingConvention` payloads. The
x86_64 backend is updated to use the stack alignment given in the
calling convention (the LLVM backend was already updated in a previous
commit).

Resolves: ziglang#21209
mlugg added a commit to mlugg/zig that referenced this issue Oct 17, 2024
This commit begins implementing accepted proposal ziglang#21209 by making
`std.builtin.CallingConvention` a tagged union.

The stage1 dance here is a little convoluted. This commit introduces the
new type as `NewCallingConvention`, keeping the old `CallingConvention`
around. The compiler uses `std.builtin.NewCallingConvention`
exclusively, but when fetching the type from `std` when running the
compiler (e.g. with `getBuiltinType`), the name `CallingConvention` is
used. This allows a prior build of Zig to be used to build this commit.
The next commit will update `zig1.wasm`, and then the compiler and
standard library can be updated to completely replace
`CallingConvention` with `NewCallingConvention`.

The second half of ziglang#21209 is to remove `@setAlignStack`, which will be
implemented in another commit after updating `zig1.wasm`.
mlugg added a commit to mlugg/zig that referenced this issue Oct 17, 2024
This commit finishes implementing ziglang#21209 by removing the
`@setAlignStack` builtin in favour of `CallingConvention` payloads. The
x86_64 backend is updated to use the stack alignment given in the
calling convention (the LLVM backend was already updated in a previous
commit).

Resolves: ziglang#21209
mlugg added a commit to mlugg/zig that referenced this issue Oct 19, 2024
This commit begins implementing accepted proposal ziglang#21209 by making
`std.builtin.CallingConvention` a tagged union.

The stage1 dance here is a little convoluted. This commit introduces the
new type as `NewCallingConvention`, keeping the old `CallingConvention`
around. The compiler uses `std.builtin.NewCallingConvention`
exclusively, but when fetching the type from `std` when running the
compiler (e.g. with `getBuiltinType`), the name `CallingConvention` is
used. This allows a prior build of Zig to be used to build this commit.
The next commit will update `zig1.wasm`, and then the compiler and
standard library can be updated to completely replace
`CallingConvention` with `NewCallingConvention`.

The second half of ziglang#21209 is to remove `@setAlignStack`, which will be
implemented in another commit after updating `zig1.wasm`.
@mlugg mlugg closed this as completed in ec19086 Oct 23, 2024
richerfu pushed a commit to richerfu/zig that referenced this issue Oct 28, 2024
This commit begins implementing accepted proposal ziglang#21209 by making
`std.builtin.CallingConvention` a tagged union.

The stage1 dance here is a little convoluted. This commit introduces the
new type as `NewCallingConvention`, keeping the old `CallingConvention`
around. The compiler uses `std.builtin.NewCallingConvention`
exclusively, but when fetching the type from `std` when running the
compiler (e.g. with `getBuiltinType`), the name `CallingConvention` is
used. This allows a prior build of Zig to be used to build this commit.
The next commit will update `zig1.wasm`, and then the compiler and
standard library can be updated to completely replace
`CallingConvention` with `NewCallingConvention`.

The second half of ziglang#21209 is to remove `@setAlignStack`, which will be
implemented in another commit after updating `zig1.wasm`.
richerfu pushed a commit to richerfu/zig that referenced this issue Oct 28, 2024
This commit finishes implementing ziglang#21209 by removing the
`@setAlignStack` builtin in favour of `CallingConvention` payloads. The
x86_64 backend is updated to use the stack alignment given in the
calling convention (the LLVM backend was already updated in a previous
commit).

Resolves: ziglang#21209
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted This proposal is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
Status: Blockers
Development

No branches or pull requests

6 participants