Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed optimization on value range of slice::len #67186

Closed
HeroicKatora opened this issue Dec 9, 2019 · 2 comments · Fixed by #77023
Closed

Missed optimization on value range of slice::len #67186

HeroicKatora opened this issue Dec 9, 2019 · 2 comments · Fixed by #77023
Labels
A-codegen Area: Code generation C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@HeroicKatora
Copy link
Contributor

HeroicKatora commented Dec 9, 2019

The internals of llvm do not permit creation of slices spanning more
than isize::MAX bytes, as a limit resulting from pointer offsetting
inbounds. It must be both possible to represent arbitrary pointer
differences of the same allocation within an isize and there must be
no wrapping in the ptr::offset and array indexing. This has influenced
the surface documentation for slice::from_raw_parts:

The total size of the slice must be no larger than isize::MAX bytes in
memory. See the safety documentation of pointer::offset.

However, the implementation in the core library can not leverage this
property for optimization. To illustrate this, the following code should never
require an overflow check and never panic but the compiler is unable to
remove the panic path.

pub fn len_add(a: &[u8], b: &[u8]) -> usize {
    a.len().checked_add(b.len()).unwrap()
}

play

The core library could add an optimization hint in its implementation
of slice::len asserting that the value range is in fact at most as large as
the maximum possible values of isize. This information should then be
propagated automatically at each call site in optimizer passes.

// This is libcore, referring to the stabilized `unreachable_unchecked`.
use crate::intrinsics::unreachable;

// Alternative of `[T]::len` in libcore
pub const fn len(&self) -> usize {
    let raw_len = unsafe {
        crate::ptr::Repr { rust: self }.raw.len
    };
    
    if mem::size_of::<T>() > 0 {
        if raw_len > isize::max_value() as usize / mem::size_of::<T>() {
            // SAFETY: allocations can not be larger than `isize::MAX`
            // bytes. Since slices refer to a single allocation, this extends
            // to slice lengths of sized types. At this point we would have
            // however:
            //
            // > `len() * mem::size_of::<T>() > isize::MAX`.
            //
            // Since this path can be deduced as unreachable due to the 
            // below hint, the optimizer can restrict the value range of
            // return values for the purpose of optimization at each
            // call site.
            unsafe { unreachable() }
        }
    }
    
    raw_len
}

This currently allows the optimizer to remove the bounds check, as can be seen in this implementation: https://play.rust-lang.org/?version=nightly&mode=release&edition=2018&gist=6e46ce72b9affea080a33eee72b53267

I don't see a way to enable this on current nightly as slice::len is a const fn and several parts of the above implementation will consequently not yet work.

@jonas-schievink jonas-schievink added A-codegen Area: Code generation C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 9, 2019
@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Dec 10, 2019

Without weighing on the merits of the proposed change, I will note that the only thing preventing the implementation above is lack of support for the unreachable intrinsic in the Miri engine. I don't see a reason adding it would be difficult. Everything else can, AFAIK, be enabled via feature flags (min_const_unsafe_fn and const_if_match) .

@HeroicKatora
Copy link
Contributor Author

The availability of unreachable has been resolved so I went ahead and implemented it.

HeroicKatora added a commit to HeroicKatora/rust that referenced this issue Oct 4, 2020
Uses assume to check the length against a constant upper bound. The
inlined result then informs the optimizer of the sound value range.

This was tried with unreachable_unchecked before which introduces a
branch. This has the advantage of not being executed in sound code but
complicates basic blocks. It resulted in ~2% increased compile time in
some worst cases.

Add a codegen test for the assumption, testing the issue from rust-lang#67186
@bors bors closed this as completed in beb5ae4 Oct 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants