split_at fails to optimize bounds check #74938

lcnr · 2020-07-30T09:21:57Z

https://rust.godbolt.org/z/E1PnPj

const N: usize = 3;
type T = u8;

pub fn split_mutiple(slice: &[T]) -> (&[T], &[T]) {
    let len = slice.len() / N;
    slice.split_at(len * N)
}

results in the following assembly

example::split_mutiple:
        push    rax
        mov     r8, rdx
        mov     rcx, rdi
        movabs  rdx, -6148914691236517205
        mov     rax, r8
        mul     rdx
        shr     rdx
        lea     rdx, [rdx + 2*rdx]
        mov     rax, r8
        sub     rax, rdx
        mov     rdi, r8
        sub     rdi, rax
        jb      .LBB0_1  ; Note that this check can never be actually hit
        mov     qword ptr [rcx], rsi
        add     rsi, rdi
        mov     qword ptr [rcx + 8], rdi
        mov     qword ptr [rcx + 16], rsi
        mov     qword ptr [rcx + 24], rax
        mov     rax, rcx
        pop     rcx
        ret
.LBB0_1:
        lea     rdx, [rip + .L__unnamed_1]
        mov     rsi, r8
        call    qword ptr [rip + core::slice::slice_index_len_fail@GOTPCREL]
        ud2

When using const N: usize = 4 the check is correctly optimized away:

example::split_mutiple:
        mov     rax, rdi
        mov     rcx, rdx
        and     rcx, -4
        and     edx, 3
        mov     qword ptr [rdi], rsi
        add     rsi, rcx
        mov     qword ptr [rdi + 8], rcx
        mov     qword ptr [rdi + 16], rsi
        mov     qword ptr [rdi + 24], rdx
        ret

This happens both on all stable versions I have tested and on the most recent nightly.
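For readers unfamiliar with the two listings: the N = 3 version divides by multiplying with a magic constant and keeping the upper half of the product, while the N = 4 version only needs a bit mask. The sketch below mirrors those two instruction sequences in Rust. The function names are made up for illustration and are not part of the original report.

```rust
/// The immediate from `movabs rdx, -6148914691236517205`, viewed as unsigned.
/// This is 0xAAAA_AAAA_AAAA_AAAB, i.e. (2^65 + 1) / 3.
pub const MAGIC: u64 = (-6148914691236517205_i64) as u64;

/// Hypothetical helper mirroring `mul rdx` followed by `shr rdx`.
pub fn div3_via_mul(len: u64) -> u64 {
    // `mul` leaves the high 64 bits of the 128-bit product in rdx.
    let high = ((len as u128 * MAGIC as u128) >> 64) as u64;
    // `shr rdx` shifts that high half right by one more bit.
    high >> 1
}

/// Hypothetical helper mirroring `and rcx, -4` from the N = 4 listing.
pub fn round_down_to_multiple_of_4(len: u64) -> u64 {
    len & !3
}
```

In the N = 3 listing the quotient is then multiplied back by 3 (`lea rdx, [rdx + 2*rdx]`) and subtracted from the length, which is where the remainder and the leftover `jb` come from.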

bugadani · 2020-07-31T08:12:22Z

In fact, this has nothing to do with slices: https://rust.godbolt.org/z/7oG6qE. Change N from 3 to 4 (or another power of 2) to see everything optimized away.

Edit: reduced it some more: https://rust.godbolt.org/z/bEoj8x
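The contents of the linked snippets are not reproduced in this thread. As a rough idea of what a slice-free reduction could look like, here is a minimal sketch. The function name `rounded_down_fits` is an assumption for illustration, not taken from the links.

```rust
const N: usize = 3;

/// Hypothetical reduction: the same comparison the bounds check performs,
/// without any slice involved.
pub fn rounded_down_fits(len: usize) -> bool {
    (len / N) * N <= len
}
```

Ideally this whole function would compile down to returning `true`.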

tesuji · 2020-07-31T08:31:20Z

There is no such optimization in GCC or Clang either: https://godbolt.org/z/M88ndP

#include <stddef.h>

const size_t N = 3;

int foo(size_t len) {
	size_t newlen = (len / N) * N;
	return newlen <= len;
}

Should foo be optimized to only return true?
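For unsigned operands the answer appears to be yes. A short derivation, writing q and r for the quotient and remainder of len divided by N (both symbols are introduced here for illustration):

```latex
\begin{align*}
\text{len} &= qN + r, \qquad q = \lfloor \text{len} / N \rfloor, \quad 0 \le r < N \\
\text{newlen} &= (\text{len} / N) \cdot N = qN = \text{len} - r \\
\text{newlen} &\le \text{len} \qquad \text{because } r \ge 0
\end{align*}
```

Since qN never exceeds len, the multiplication cannot wrap around in size_t either, so the comparison holds for every input.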

tesuji · 2020-07-31T08:41:54Z

So this is more of an LLVM issue: @rustbot modify labels: +A-LLVM +A-mir-opt
Of course this might be improved with a MIR optimization, but I don't know how hard that would be.

bugadani · 2020-07-31T08:45:10Z

> So this is more as LLVM part

What's interesting is that the power of 2 optimization is done by rustc somewhere - the same MIR is translated to radically different LLVM-IR when N is 2^n.

Trying to find where this happens without knowledge of the compiler is rather difficult, though :)

tesuji · 2020-07-31T08:47:03Z

Clang can do that: change N to 4 in my snippet above to see. GCC still doesn't, though.

nikic · 2020-07-31T09:00:52Z

https://github.com/llvm/llvm-project/blob/63d3aeb529a7b0fb95c2092ca38ad21c1f5cfd74/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp#L311 needs to set a nuw flag. Then it will probably fold.

bugadani · 2020-07-31T10:05:02Z

Cool. So if I understand correctly, the following is happening:

The bounds check boils down to basically (length / N) * N <= length.
This can be transformed to length - length % N <= length (fair enough).
This is further transformed to (as the LLVM IR shows) length % N <= length. (~~I don't see how this transform is equivalent, but OK~~)

And for some reason, LLVM can't seem to deduce that this comparison will always be true. Adding this as an assumption lets the check be optimized away:

https://rust.godbolt.org/z/vM1KPM
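The linked snippet is not reproduced here. One way to hand the compiler such an assumption on stable Rust is `std::hint::unreachable_unchecked`, sketched below. This is a swapped-in technique for illustration and may differ from what the link uses. The spelling `split_multiple` and the element type `u8` are also choices made for this sketch.

```rust
use std::hint::unreachable_unchecked;

const N: usize = 3;

/// Sketch: tell the optimizer that the split point can never exceed the length.
pub fn split_multiple(slice: &[u8]) -> (&[u8], &[u8]) {
    let mid = (slice.len() / N) * N;
    if mid > slice.len() {
        // SAFETY: for unsigned integers, (len / N) * N is len minus the
        // remainder of len divided by N, so it is never larger than len.
        unsafe { unreachable_unchecked() }
    }
    slice.split_at(mid)
}
```

Whether the bounds check inside `split_at` actually disappears depends on the compiler and LLVM version, so the generated assembly is worth inspecting before relying on it.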

xldenis · 2020-07-31T21:07:27Z

~~@bugadani, what's weird is that if you feed this IR straight into llc it does optimize it away: https://rust.godbolt.org/z/o6d6nM. Are we not calling a magic pass on the IR in rustc?~~

Ignore me, I missed that the assume caused the IR to have ret true

However, it does optimize if a udiv is inserted instead of a urem. InstructionSimplify has the following fold: x udiv y <=u x. I don't know why the same doesn't exist for urem; maybe no one considered it until now?

LLVM doesn't know about this optimization but it is apparently valid (thanks John Regehr) so I'll submit a patch to LLVM.
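As a quick sanity check of the two folds mentioned above, both properties can be tested exhaustively on a small unsigned type. This is a brute-force sketch over `u8`, not a proof for wider types, and the function names are made up for illustration.

```rust
/// Checks x % y <= x for every u8 pair with a nonzero divisor.
pub fn urem_fold_holds_for_u8() -> bool {
    (0..=u8::MAX).all(|x| (1..=u8::MAX).all(|y| x % y <= x))
}

/// Checks x / y <= x for every u8 pair with a nonzero divisor.
pub fn udiv_fold_holds_for_u8() -> bool {
    (0..=u8::MAX).all(|x| (1..=u8::MAX).all(|y| x / y <= x))
}
```

The divisor range starts at 1 because division and remainder by zero panic in Rust.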

xldenis · 2020-08-02T11:44:31Z

I've submitted a patch to LLVM adding this optimization: https://reviews.llvm.org/D85092

tesuji · 2020-08-02T11:46:35Z

Does anyone want to file a bug against GCC? I registered an account two days ago but they haven't accepted it yet.

This revision adds the following peephole optimization and its negation:

%a = urem i64 %x, %y
%b = icmp ule i64 %a, %x
====>
%b = true

With John Regehr's help this optimization was checked with Alive2, which suggests it should be valid.

This pattern occurs in the bounds checks of Rust code. Consider the program

const N: usize = 3;
type T = u8;

pub fn split_mutiple(slice: &[T]) -> (&[T], &[T]) {
    let len = slice.len() / N;
    slice.split_at(len * N)
}

The method call slice.split_at will check that len * N is within the bounds of slice. After some transformations this bounds check is turned into the urem seen above, and LLVM then fails to optimize it any further. Adding this optimization would cause this bounds check to be fully optimized away.

ref: rust-lang/rust#74938

Differential Revision: https://reviews.llvm.org/D85092

xldenis · 2020-08-05T09:22:22Z

^ as you can see above this optimization has landed in LLVM master, so whenever LLVM is bumped we should get this for free.

tesuji · 2020-08-05T09:32:14Z

Does it land with LLVM 11?

xldenis · 2020-08-05T09:39:21Z

it's not in RC-1 which was released a week ago, I don't know if they will include it in a future release candidate or not.

xldenis · 2020-08-07T12:38:31Z

This will be solved in LLVM 12 as my commit missed the cutoff (it was before the issue was even filed)

jrmuizel · 2020-10-22T14:51:19Z

I filed the gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97529

jrmuizel · 2021-03-02T03:47:46Z

I confirmed that with #81451 the bounds check is removed.

xldenis · 2021-06-25T12:05:52Z

Can we close this issue?

mati865 · 2022-01-14T00:53:38Z

@lcnr could you test again and close this issue?

lcnr · 2022-01-14T11:56:48Z

this has been fixed, would be nice to add a codegen test for this, asserting that split_multiple can't panic
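A real codegen test would inspect the emitted IR for the absence of a panic path. As a behavioural complement, the sketch below only checks at run time that the function splits where expected for a range of lengths. The helper `check_split_lengths` is hypothetical and not part of any existing test suite.

```rust
const N: usize = 3;
type T = u8;

pub fn split_multiple(slice: &[T]) -> (&[T], &[T]) {
    let len = slice.len() / N;
    slice.split_at(len * N)
}

/// Hypothetical helper: for every length up to `max_len`, the head must be a
/// multiple of N, the tail shorter than N, and both must add up to the input.
pub fn check_split_lengths(max_len: usize) -> bool {
    let data = vec![0u8; max_len];
    (0..=max_len).all(|len| {
        let (head, tail) = split_multiple(&data[..len]);
        head.len() % N == 0 && tail.len() < N && head.len() + tail.len() == len
    })
}
```

This cannot show that the bounds check was removed, only that it never fires for the tested inputs.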

Add codegen tests for E-needs-test

close rust-lang#36010
close rust-lang#68667
close rust-lang#74938
close rust-lang#83585
close rust-lang#93036
close rust-lang#109328
close rust-lang#110797
close rust-lang#111508
close rust-lang#112509
close rust-lang#113757
close rust-lang#120440
close rust-lang#118392
close rust-lang#71096

r? nikic

jonas-schievink added the C-enhancement and I-slow labels Jul 30, 2020

rustbot added the A-LLVM label Jul 31, 2020

rustbot added the A-mir-opt label Jul 31, 2020

tesuji mentioned this issue Aug 26, 2020

slice::iter() does not preserve number of iterations information for optimizer causing unneeded bounds checks #75935

Open

sdroege mentioned this issue Aug 26, 2020

Get rid of bounds check in slice::chunks_exact() and related function… #75936

Merged

Stargateur mentioned this issue Oct 29, 2021

split_at family should return an Result #90410

Closed

lcnr added the E-needs-test label Jan 14, 2022

Noratrieb added the T-compiler label Apr 5, 2023

workingjubilee added the C-optimization label Oct 8, 2023

tesuji added a commit to tesuji/rustc that referenced this issue May 20, 2024

add codegen test for rust-lang#74938

ea02967

tesuji mentioned this issue May 20, 2024

Add codegen tests for E-needs-test #125347

Merged

tesuji added a commit to tesuji/rustc that referenced this issue May 20, 2024

add codegen test for rust-lang#74938

ca3f53d

tesuji added a commit to tesuji/rustc that referenced this issue Jun 8, 2024

add codegen test for rust-lang#74938

320dcf4

tesuji added a commit to tesuji/rustc that referenced this issue Jun 9, 2024

add codegen test for rust-lang#74938

bf17818

bors closed this as completed in 7ac6c2f Jun 14, 2024
