Failure to optimise `std::cmp::{min,max}` on bool #114653
Comments
Here is the LLVM IR this generates:
@example::min_and = unnamed_addr alias i1 (i1, i1), ptr @example::min_if
; example::min_std
define noundef zeroext i1 @example::min_std(i1 noundef zeroext %a, i1 noundef zeroext %b) unnamed_addr personality ptr @rust_eh_personality {
start:
%.neg.i = sext i1 %a to i8
%0 = zext i1 %b to i8
%1 = xor i8 %0, -1
%switch.i = icmp eq i8 %1, %.neg.i
%_0.0.i = select i1 %switch.i, i1 %b, i1 %a
ret i1 %_0.0.i
}
; example::max_std
define noundef zeroext i1 @example::max_std(i1 noundef zeroext %a, i1 noundef zeroext %b) unnamed_addr personality ptr @rust_eh_personality {
start:
%.neg.i = sext i1 %a to i8
%0 = zext i1 %b to i8
%1 = xor i8 %0, -1
%switch.i = icmp eq i8 %1, %.neg.i
%_0.0.i = select i1 %switch.i, i1 %a, i1 %b
ret i1 %_0.0.i
}
; example::min_if
define noundef zeroext i1 @example::min_if(i1 noundef zeroext %a, i1 noundef zeroext %b) unnamed_addr {
start:
%a.b = and i1 %a, %b
ret i1 %a.b
}
; example::max_if
define noundef zeroext i1 @example::max_if(i1 noundef zeroext %a, i1 noundef zeroext %b) unnamed_addr {
start:
%0 = xor i1 %b, true
%_3 = and i1 %0, %a
%a.b = select i1 %_3, i1 %a, i1 %b
ret i1 %a.b
}
; example::max_or
define noundef zeroext i1 @example::max_or(i1 noundef zeroext %a, i1 noundef zeroext %b) unnamed_addr {
start:
%_0 = or i1 %a, %b
ret i1 %_0
}
declare noundef i32 @rust_eh_personality(i32 noundef, i32 noundef, i64 noundef, ptr noundef, ptr noundef) unnamed_addr #1
It first extends the `i1`s to `i8`s (sign-extending `a`, and zero-extending then bitwise-inverting `b`), and then finally does an `icmp`, which seems very convoluted to me.
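To make the sequence above easier to follow, here is a minimal Rust sketch that mirrors the `min_std` and `max_std` IR one instruction at a time. The function names are hypothetical and exist only for illustration; the sketch assumes nothing beyond the IR shown in this comment.

```rust
/// Mirrors the IR of `example::min_std` step by step (illustrative name).
fn min_std_emulated(a: bool, b: bool) -> bool {
    let neg_a: i8 = -(a as i8);  // sext i1 %a to i8: false -> 0, true -> -1
    let zext_b: i8 = b as i8;    // zext i1 %b to i8: false -> 0, true -> 1
    let not_b: i8 = zext_b ^ -1; // xor i8 %0, -1:    false -> -1, true -> -2
    let a_gt_b = not_b == neg_a; // icmp eq: true only when a = true, b = false
    if a_gt_b { b } else { a }   // select i1 %switch.i, i1 %b, i1 %a
}

/// Same comparison as above; only the operands of the final select are swapped.
fn max_std_emulated(a: bool, b: bool) -> bool {
    let neg_a: i8 = -(a as i8);
    let not_b: i8 = (b as i8) ^ -1;
    let a_gt_b = not_b == neg_a;
    if a_gt_b { a } else { b }   // select i1 %switch.i, i1 %a, i1 %b
}

/// Exhaustively compares the emulated IR with the plain boolean operators.
fn check_emulation() {
    for a in [false, true] {
        for b in [false, true] {
            assert_eq!(min_std_emulated(a, b), a & b);
            assert_eq!(max_std_emulated(a, b), a | b);
        }
    }
}
```

Calling `check_emulation()` walks all four input combinations. The whole extend, invert and compare chain amounts to testing `a > b`, so the results agree with `a & b` and `a | b`.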
https://github.com/rust-lang/rust/tree/master/library/core/src#L1397 explains the zero and sign extensions.
@rustbot label +I-slow
Corresponding LLVM issue: llvm/llvm-project#64537
…r=cuviper

Optimizing the rest of bool's Ord implementation

After coming across issue rust-lang#66780, I realized that the other functions provided by Ord (`min`, `max`, and `clamp`) were similarly inefficient for bool. This change provides implementations for them in terms of boolean operators, resulting in much simpler assembly and faster code.

Fixes issue rust-lang#114653

[Comparison on Godbolt](https://rust.godbolt.org/z/5nb5P8e8j)

`max` assembly before:
```assembly
example::max:
        mov     eax, edi
        mov     ecx, eax
        neg     cl
        mov     edx, esi
        not     dl
        cmp     dl, cl
        cmove   eax, esi
        ret
```

`max` assembly after:
```assembly
example::max:
        mov     eax, edi
        or      eax, esi
        ret
```

`clamp` assembly before:
```assembly
example::clamp:
        mov     eax, esi
        sub     al, dl
        inc     al
        cmp     al, 2
        jae     .LBB1_1
        mov     eax, edi
        sub     al, sil
        movzx   ecx, dil
        sub     dil, dl
        cmp     dil, 1
        movzx   edx, dl
        cmovne  edx, ecx
        cmp     al, -1
        movzx   eax, sil
        cmovne  eax, edx
        ret
.LBB1_1:
        ; identical assert! code
```

`clamp` assembly after:
```assembly
example::clamp:
        test    edx, edx
        jne     .LBB1_2
        test    sil, sil
        jne     .LBB1_3
.LBB1_2:
        or      dil, sil
        and     dil, dl
        mov     eax, edi
        ret
.LBB1_3:
        ; identical assert! code
```
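As a rough illustration of what "implementations in terms of boolean operators" can look like, here is a sketch written as free functions. The names `bool_min`, `bool_max` and `bool_clamp` are hypothetical, and this is not the actual code in `core`; the `clamp` form is only inferred from the `or`/`and` pair in the "after" assembly.

```rust
/// Hypothetical stand-in for a specialised `min` on `bool`.
fn bool_min(a: bool, b: bool) -> bool {
    a & b // false < true, so the minimum is true only if both are true
}

/// Hypothetical stand-in for a specialised `max` on `bool`.
fn bool_max(a: bool, b: bool) -> bool {
    a | b // the maximum is true if either operand is true
}

/// Hypothetical stand-in for a specialised `clamp` on `bool`.
/// Keeps the same precondition as `Ord::clamp`: panics if `min > max`.
fn bool_clamp(x: bool, min: bool, max: bool) -> bool {
    assert!(min <= max);
    (x | min) & max // raise to the lower bound, then cap at the upper bound
}

/// Exhaustively compares the sketches with the `Ord` methods on `bool`.
fn check_bool_ops() {
    for a in [false, true] {
        for b in [false, true] {
            assert_eq!(bool_min(a, b), Ord::min(a, b));
            assert_eq!(bool_max(a, b), Ord::max(a, b));
        }
    }
    for x in [false, true] {
        // Only bound pairs with min <= max are valid inputs to clamp.
        for (lo, hi) in [(false, false), (false, true), (true, true)] {
            assert_eq!(bool_clamp(x, lo, hi), Ord::clamp(x, lo, hi));
        }
    }
}
```

Because `bool` has only two values, calling `check_bool_ops()` covers every valid input, which makes an exhaustive comparison against `Ord` cheap.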
I made a PR to llvm-project to optimize the
Actually this feature has already been implemented by 074f23e3e199. This issue can be closed.
I tried this code: (godbolt)
I expected to see this happen:
All implementations of `min` should optimise to `a & b`. All implementations of `max` should optimise to `a | b`.
Instead, this happened:
The generated assembly code is suboptimal for `min_std`, `max_std`, `max_ternary` and `max_if` (on AArch64).
`min_std` and `max_std` can be fixed by overriding the default implementation of `min` and `max` for `bool`:
But it might be worth investigating the LLVM IR that rustc generates in `min_if`/`max_if` too, to see why it isn't optimised to `llvm.umin()`/`llvm.umax()`.
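The original snippet is not preserved on this page, so the following is only a guessed reconstruction of the variants being compared. The function names are taken from the IR in the comments, and the bodies are assumptions chosen to be consistent with that IR. `max_ternary` is left out because its body cannot be inferred from anything shown here.

```rust
use std::cmp;

// Assumed bodies: only the function names come from the issue.
pub fn min_std(a: bool, b: bool) -> bool { cmp::min(a, b) }
pub fn max_std(a: bool, b: bool) -> bool { cmp::max(a, b) }

pub fn min_if(a: bool, b: bool) -> bool { if a < b { a } else { b } }
pub fn max_if(a: bool, b: bool) -> bool { if a > b { a } else { b } }

pub fn min_and(a: bool, b: bool) -> bool { a & b }
pub fn max_or(a: bool, b: bool) -> bool { a | b }

/// Checks that every variant computes the same function on all inputs,
/// so any difference in generated code is purely an optimisation matter.
pub fn all_variants_agree() -> bool {
    let mut ok = true;
    for a in [false, true] {
        for b in [false, true] {
            ok &= min_std(a, b) == min_and(a, b);
            ok &= min_if(a, b) == min_and(a, b);
            ok &= max_std(a, b) == max_or(a, b);
            ok &= max_if(a, b) == max_or(a, b);
        }
    }
    ok
}
```

Since all variants agree on the four possible inputs, an ideal optimiser could lower each `min_*` to a single `and` and each `max_*` to a single `or`.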
Meta
`rustc --version --verbose`:
Backtrace