Performance regression from Rust 1.37 to 1.38 when using unreachable_unchecked #74615
Comments
Assigning |
As far as I can tell, there's a regression in LLVM 9: https://rust.godbolt.org/z/b9frfW Check the assembly of these Rust (nightly) and C (clang trunk) examples, then switch Rust to 1.37 and clang to 8.0.0. Looks familiar?
Probably LLVM should drop an unreachable default case if that allows folding a large number of cases into the default case.
#74693 (which replaced |
Replacing |
I just checked the assembly produced on playground with nightly and it still generates redundant checks.
The three tests are now within an instruction of each other per https://godbolt.org/z/qccfs9cK5
Optimized assembly for 1.37:
complex_test:
cmp rdi, 8
je .LBB0_3
cmp rdi, 16
je .LBB0_3
xor eax, eax
ret
.LBB0_3:
mov al, 1
ret
simpler_test:
cmp rdi, 16
je .LBB1_3
cmp rdi, 8
jne .LBB1_2
.LBB1_3:
mov al, 1
ret
.LBB1_2:
xor eax, eax
ret
simplest_test:
test dil, 24
setne al
ret
Optimized assembly for 1.38:
complex_test:
xor eax, eax
cmp rdi, 127
jg .LBB0_5
add rdi, -1
cmp rdi, 63
ja .LBB0_8
movabs rcx, -9223372034707292149
bt rcx, rdi
jb .LBB0_7
mov eax, 32896
bt rax, rdi
jae .LBB0_8
mov al, 1
ret
.LBB0_5:
cmp rdi, 128
je .LBB0_7
cmp rdi, 256
.LBB0_7:
ret
.LBB0_8:
ud2
simpler_test:
cmp rdi, 16
je .LBB1_3
cmp rdi, 8
jne .LBB1_2
.LBB1_3:
mov al, 1
ret
.LBB1_2:
xor eax, eax
ret
simplest_test:
test dil, 24
setne al
ret
Optimized assembly for 1.81:
complex_test:
rep bsf rax, rdi
add rax, -3
cmp rax, 2
setb al
ret
simpler_test:
add rdi, -8
test rdi, -9
sete al
ret
simplest_test:
test dil, 24
setne al
ret
I think we should declare victory here with a regression test.
…r=Mark-Simulacrum
Add Four Codegen Tests
Closes rust-lang#74615
Closes rust-lang#123216
Closes rust-lang#49572
Closes rust-lang#93514
This PR adds four codegen tests. The FileCheck assertions were generated with the help of `update_test_checks.py` and `update_llc_test_checks.py` from the LLVM project.
I tried this code when looking at the issue #73015.
I expect all other match arms to be optimised out, so basically the code would be equivalent to
or maybe even
On Rust 1.37 it is quite optimised, but a huge chunk of code is generated for 1.38. Interestingly, if the unreachable_unchecked is replaced with a value such as FpCategory::Nan, test could still be optimised.
@rustbot modify labels: +I-slow +I-heavy +T-compiler +A-codegen