Incomplete optimization with opt-level=z compared to clang for possible pre-compiled expressions #102312

arctic-penguin · 2022-09-26T13:43:58Z

I noticed that Rust does not apply some size optimizations when opt-level=z is supplied, whereas clang applies them to the equivalent C code.

See here: https://godbolt.org/z/1955WjcT8

I tried this code:

#[no_mangle]
fn iterate() -> i32 {
    let mut result = 0;
    for i in 0..=100 {
        result += i;
    }
    result
}

With opt-level=3

iterate:
        mov     eax, 5050
        ret

With opt-level=z

iterate:
        xor     ecx, ecx
        xor     edx, edx
        xor     eax, eax
.LBB0_1:
        test    dl, dl
        jne     .LBB0_3
        lea     esi, [rcx + 1]
        cmp     ecx, 100
        sete    dl
        cmove   esi, ecx
        add     eax, ecx
        mov     ecx, esi
        jmp     .LBB0_1
.LBB0_3:
        ret

I would expect opt-level=z and opt-level=3 to have the same output for this fairly simple case.

In contrast, clang 15.0.0 does this:

int something() {
    int result = 0;
    for (int i=0; i<=100; i++) {
        result += i;
    }
    return result;
}

With -O3

something:                              # @something
        mov     eax, 5050
        ret

With -Oz

something:                              # @something
        mov     eax, 5050
        ret

Meta

rustc --version --verbose:

1.64.0 (godbolt.org), I assume that's a55dd71d5

I understand that the C code is far easier to optimize, but the Rust-produced assembly is nevertheless about 7x as long.


Rageking8 · 2022-09-26T14:54:48Z

@rustbot label +T-compiler +A-codegen +I-slow

the8472 · 2022-09-26T17:47:54Z

This is a known issue with RangeInclusive. Either use a regular Range or iterate via iter.for_each() instead of a for _ in iter loop.

#45222
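
For illustration, the two workarounds mentioned above could look roughly like this. This is a minimal sketch: the function names `iterate_range` and `iterate_for_each` are made up for this example, and whether either variant folds to a constant at opt-level=z depends on the compiler version, which has not been verified here.

```rust
// Workaround 1: use a half-open Range (0..101) instead of the
// RangeInclusive (0..=100) from the original report.
pub fn iterate_range() -> i32 {
    let mut result = 0;
    for i in 0..101 {
        result += i;
    }
    result
}

// Workaround 2: keep the RangeInclusive, but drive it with internal
// iteration (for_each) instead of a `for` loop.
pub fn iterate_for_each() -> i32 {
    let mut result = 0;
    (0..=100).for_each(|i| result += i);
    result
}
```

Both functions compute the same sum as the original `iterate`. To compare the generated assembly on godbolt, add `#[no_mangle]` as in the original report.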

the8472 · 2022-09-26T17:56:59Z

It's a bit surprising that 1.64 did manage to optimize it at O3 (but not O2), while nightly and beta again fail to do so even at O3.

arctic-penguin added the C-bug (Category: This is a bug.) label Sep 26, 2022

rustbot added the A-codegen (Area: Code generation), I-slow (Issue: Problems and improvements with respect to performance of generated code.), and T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) labels Sep 26, 2022
