Suboptimal codegen when using unwrap_or_else with unreachable_unchecked #98468

Closed
james7132 opened this issue Jun 24, 2022 · 6 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

james7132 commented Jun 24, 2022

I tried this code:

pub struct Foo {
    i: Vec<usize>,
    v: Vec<usize>,
}

impl Foo {
    fn get(&self, i: usize) -> Option<&usize> {
        let index = self.i.get(i)?;
        self.v.get(*index)
    }

    unsafe fn get_unchecked(&self, i: usize) -> &usize {
        let index = self.i.get_unchecked(i);
        self.v.get_unchecked(*index)
    }
}

pub unsafe fn get_unchecked(f: &Foo, idx: usize) -> &usize {
    f.get_unchecked(idx)
}

pub unsafe fn get_unwrap_checked_unreachable(f: &Foo, idx: usize) -> &usize {
    f.get(idx).unwrap_or_else(|| std::hint::unreachable_unchecked())
}

pub fn get(f: &Foo, idx: usize) -> Option<&usize> {
    f.get(idx)
}

I expected to see this happen: The unwrap_or_else case generates branch-free assembly.

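Instead, this happened: get_unwrap_checked_unreachable keeps both bounds checks and compiles to the same branching code as the checked get.

This was the generated assembly (opt-level 3):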

example::get_unchecked:
        mov     rax, qword ptr [rdi]
        mov     rax, qword ptr [rax + 8*rsi]
        shl     rax, 3
        add     rax, qword ptr [rdi + 24]
        ret

example::get_unwrap_checked_unreachable:
        xor     eax, eax
        cmp     qword ptr [rdi + 16], rsi
        jbe     .LBB1_3
        mov     rcx, qword ptr [rdi]
        test    rcx, rcx
        je      .LBB1_3
        mov     rcx, qword ptr [rcx + 8*rsi]
        lea     rdx, [8*rcx]
        add     rdx, qword ptr [rdi + 24]
        xor     eax, eax
        cmp     rcx, qword ptr [rdi + 40]
        cmovb   rax, rdx
.LBB1_3:
        ret

example::get:
        xor     eax, eax
        cmp     qword ptr [rdi + 16], rsi
        jbe     .LBB2_3
        mov     rcx, qword ptr [rdi]
        test    rcx, rcx
        je      .LBB2_3
        mov     rcx, qword ptr [rcx + 8*rsi]
        lea     rdx, [8*rcx]
        add     rdx, qword ptr [rdi + 24]
        xor     eax, eax
        cmp     rcx, qword ptr [rdi + 40]
        cmovb   rax, rdx
.LBB2_3:
        ret

Meta

Rust version rustc 1.61.0 (fe5b13d 2022-05-18) (on Compiler Explorer)

rustc --version --verbose:

TODO

(not at my home machine right now, will fill in these details soon)

Context

The primary use case is to have an alternative version of std::hint::unreachable_unchecked that is checked in debug builds via unreachable!(), but unchecked in release builds.
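A helper of the kind described above could be sketched roughly as follows. The names `debug_checked_unreachable` and `first_unchecked` are hypothetical and do not appear in the issue; the sketch only assumes that `cfg!(debug_assertions)` is used to pick between `unreachable!()` and `std::hint::unreachable_unchecked()`.

```rust
/// Hypothetical helper (not from the issue): panics in debug builds,
/// and is an unchecked optimizer hint in release builds.
///
/// # Safety
/// The caller must guarantee that this call is never actually reached.
#[inline(always)]
pub unsafe fn debug_checked_unreachable() -> ! {
    if cfg!(debug_assertions) {
        // Debug builds: fail loudly if the "impossible" path is taken.
        unreachable!("debug_checked_unreachable was reached")
    } else {
        // SAFETY: the caller promised this point is never reached.
        unsafe { std::hint::unreachable_unchecked() }
    }
}

/// Hypothetical usage, mirroring the unwrap_or_else pattern in the report.
///
/// # Safety
/// `values` must be non-empty.
pub unsafe fn first_unchecked(values: &[usize]) -> &usize {
    values
        .first()
        .unwrap_or_else(|| unsafe { debug_checked_unreachable() })
}
```

Whether the release-mode branch is actually optimized away depends on the compiler version, which is what this issue is about.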

james7132 added the C-bug label Jun 24, 2022

thomcc (Member) commented Jun 24, 2022

It's worth noting that this repros even with -Ztrap-unreachable=no: https://godbolt.org/z/6eWGWjebM, but IIUC LLVM has a hard time making use of these hints, so this isn't that surprising. Still, it would be nice to fix if there's anything that can be done.

thomcc added the I-slow and T-compiler labels Jun 24, 2022

the8472 (Member) commented Jun 24, 2022

option.unwrap_unchecked() doesn't optimize either
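For reference, the `unwrap_unchecked` variant mentioned here might look like the sketch below, reusing the `Foo` type from the original report. The function name `get_unwrap_unchecked` and the constructor `Foo::new` are hypothetical additions for illustration; the comment above does not include the exact code that was tested.

```rust
pub struct Foo {
    i: Vec<usize>,
    v: Vec<usize>,
}

impl Foo {
    // Hypothetical constructor, added only so the sketch can be exercised.
    pub fn new(i: Vec<usize>, v: Vec<usize>) -> Self {
        Foo { i, v }
    }

    // Checked double lookup, as in the original report.
    fn get(&self, i: usize) -> Option<&usize> {
        let index = self.i.get(i)?;
        self.v.get(*index)
    }
}

/// Hypothetical variant using Option::unwrap_unchecked instead of
/// unwrap_or_else with unreachable_unchecked.
///
/// # Safety
/// `idx` must be in bounds for `f.i`, and `f.i[idx]` must be in bounds for `f.v`.
pub unsafe fn get_unwrap_unchecked(f: &Foo, idx: usize) -> &usize {
    // SAFETY: the caller guarantees both lookups succeed.
    unsafe { f.get(idx).unwrap_unchecked() }
}
```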

nikic added the A-LLVM label Jun 24, 2022

nikic (Contributor) commented Jun 24, 2022

Should be possible to fix this by teaching passingValueIsAlwaysUndefined() about the assume(x != null) pattern.

andjo403 (Contributor) commented Aug 1, 2022

The examples in this issue seem to be fixed by the enabling of MIR inlining in #91743.
See the generated code at https://godbolt.org/z/vYr3ab96Y with and without MIR inlining enabled.

james7132 (Author) commented

This seems to be fixed in nightly (at least with the version Godbolt is currently running): https://godbolt.org/z/9PzKWfY1K.

james7132 (Author) commented

The example in the original issue seems to have been addressed as of 1.64's release. Closing.

5 participants