-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorrect handling of lateout pairs in inline asm #101346
Comments
Yeah, the 32-bit cpuid implementation at https://doc.rust-lang.org/src/core/up/up/stdarch/crates/core_arch/src/x86/cpuid.rs.html#62 is compiling to something wrong. This is compiling to 0x565ba631 <+193>: mov %ebx,%eax
0x565ba633 <+195>: cpuid
0x565ba635 <+197>: xchg %eax,%ebx on
|
Yeah, after your post in the RustCrypto issue I also tried it in godbolt and got the same weird assembly: https://rust.godbolt.org/z/4z9nrY1Eh Rust: use std::arch::x86::__cpuid;
pub unsafe fn foo() -> u32 {
// get manufacture ID
let res = __cpuid(0);
res.ebx
} Generated assembly which is obviously wrong: example::foo:
xor eax, eax ; set EAX to zero
xor ecx, ecx ; set ECX to zero
mov eax, ebx ; "cache" calee-saved EBX to EAX
cpuid ; WHOOPS not only we now use EAX which contains EBX's value,
; but also CPUID overwrites EAX, making the caching useless
xchg eax, ebx ; move EBX into EAX as return result
ret Bad news is that on x68-64 we get the same wrong assembly: example::foo:
xor eax, eax
xor ecx, ecx
mov rax, rbx
cpuid
xchg rax, rbx
ret I think correct codegen should cache EBX to stack. Honestly, I am amazed that this issue has not surfaced earlier. If example::foo:
push esi
xor eax, eax
xor ecx, ecx
mov esi, ebx
cpuid
xchg esi, ebx
pop esi
ret |
It's supposed to pick a different register and save it there, and that both should work (with proper clobbers) and should be better than pushing to the stack. Changing it to |
https://rust.godbolt.org/z/Gr1ve77a6 suggests that it might be that LLVM considers unused lateout arguments to be fair game to delete along with the clobber. |
I think the issue is with incorrect use of This code produces incorrect result: https://rust.godbolt.org/z/j9v6v69Pz But by changing |
Yeah, the thing is that that doesn't really make sense in general - |
|
That one is probably just |
You are right, A better example, which probably can be used as a minified demonstration of the issue: https://rust.godbolt.org/z/3xMenGjMe pub fn foo() -> u32 {
let t1: u32;
let t2: u32;
unsafe {
asm!(
"mov {0}, 1",
"mov eax, 42",
lateout(reg) t1,
lateout("eax") t2, // `t2` can be replaced by `_`
options(nostack),
);
}
t1
} example::foo:
mov rax, 1
mov eax, 42
ret |
Either way, the |
For what it's worth, the original But that's not what's happening here, so it does seem like LLVM is miscompiling the code. |
You can't specify |
That said, I'm not sure that that's true anymore - while I vaguely recall some sort of error along those lines, I can't trigger one from the PLT usage of Oh it's 64-bit that uses rbx and 32-bit uses esi, so the 32-bit cpuid could just do that. |
Trying to clobber
Honestly, it's quite annoying and I wish we did not have such exception. |
WG-prioritization assigning priority (Zulip discussion). @rustbot label -I-prioritize +P-critical T-compiler |
Does the change that landed in rust-lang/stdarch#1329 resolve this? (In other words, do we just need to pull in those changes in some manner to resolve this issue?) Or is there something else that will need to happen? |
also, given that this is a P-critical issue that affects 1.57 and later: Should we be talking about beta backports of the changes associated with fixing this? |
That change should resolve this, yes. |
The issue affects all uses of |
Update stdarch This pulls in the following changes: - [Use simd_bitmask intrinsic in a couple of places](rust-lang/stdarch@9f09287) - [Remove simd_shuffle<n> usage in favor of simd_shuffle](rust-lang/stdarch@3fd17e4) - [Remove late specifiers in __cpuid_count](rust-lang/stdarch@f1db941) - Helps with rust-lang#101346 - [Use mov and xchg instead of movl(q) and xchgl(q)](rust-lang/stdarch@3049a31) - [Bump cfg-if dependency to 1.0](rust-lang/stdarch@f305cc8) - [Fix documentation of __m256bh and __m512bh structs](rust-lang/stdarch@699c093) r? `@Amanieu`
Update stdarch This pulls in the following changes: - [Use simd_bitmask intrinsic in a couple of places](rust-lang/stdarch@9f09287) - [Remove simd_shuffle<n> usage in favor of simd_shuffle](rust-lang/stdarch@3fd17e4) - [Remove late specifiers in __cpuid_count](rust-lang/stdarch@f1db941) - Helps with rust-lang#101346 - [Use mov and xchg instead of movl(q) and xchgl(q)](rust-lang/stdarch@3049a31) - [Bump cfg-if dependency to 1.0](rust-lang/stdarch@f305cc8) - [Fix documentation of __m256bh and __m512bh structs](rust-lang/stdarch@699c093) r? ``@Amanieu``
deps: update cpufeatures, swap difference to dissimilar Updating cpufeatures v0.2.1 -> v0.2.5: https://github.com/RustCrypto/utils/blob/master/cpufeatures/CHANGELOG.md#025-2022-09-04, was yanked bc of miscompile (RustCrypto/utils#800, rust-lang#101346) Removing difference v2.0.0 Adding dissimilar v1.0.4 Updating expect-test v1.0.1 -> v1.4.0 difference unmaintened https://rustsec.org/advisories/RUSTSEC-2020-0095.html, so replaced with https://github.com/dtolnay/dissimilar (as dependency of `expect-test`)
deps: update cpufeatures, swap difference to dissimilar Updating cpufeatures v0.2.1 -> v0.2.5: https://github.com/RustCrypto/utils/blob/master/cpufeatures/CHANGELOG.md#025-2022-09-04, was yanked bc of miscompile (RustCrypto/utils#800, rust-lang/rust#101346) Removing difference v2.0.0 Adding dissimilar v1.0.4 Updating expect-test v1.0.1 -> v1.4.0 difference unmaintened https://rustsec.org/advisories/RUSTSEC-2020-0095.html, so replaced with https://github.com/dtolnay/dissimilar (as dependency of `expect-test`)
(issue is loosely owned, in terms of P-high tracking, by @wesleywiser and @pnkfelix keeping their eyes on llvm/llvm-project#57550)
Updated bug description
The following Rust function:
Gets compiled into this obviously incorrect assembly:
Godbolt link: https://rust.godbolt.org/z/Yb9v7WobM
LLVM incorrectly reuses register for a pair of
lateout
s if it can see that one of those does not get used later.Original description below
We get spurious segfaults in the
chacha20
crate when we run tests compiled for i686 target (i686-unknown-linux-gnu
to be exact), see RustCrypto/stream-ciphers#304 for more information. Interestingly enough, Rust 1.56 does not have this issue, only 1.57 and later. Changes in thecpufeatures
crate which revealed the issue look quite innocent. The issue also disappears ifzeroize
feature is disabled, which is quite tangential tocpufeatures
(it only addsDrop
impls). Even weirder, @aumetra's findings show that segfault happens at the following line:Granted, the
chacha20
crate contains a fair bit of unsafe code, as well as its dependencies, so the issue may be caused by unsoundness somewhere in our crates. But the circumstantial evidence makes a potential compiler bug quite probable.The text was updated successfully, but these errors were encountered: