Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The NonZero types don't tell LLVM that they're non-zero on get #49572

Open
scottmcm opened this issue Apr 2, 2018 · 22 comments · May be fixed by #134626
Open

The NonZero types don't tell LLVM that they're non-zero on get #49572

scottmcm opened this issue Apr 2, 2018 · 22 comments · May be fixed by #134626
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-llvm Working group: LLVM backend code generation

Comments

@scottmcm
Copy link
Member

scottmcm commented Apr 2, 2018

Repro: https://play.rust-lang.org/?gist=0981601dda9c5800e353e31b22682bb5&version=nightly&mode=release

pub fn foo(x: std::num::NonZeroU32) -> bool {
    x.get() != 0
}

Actual:

  %0 = icmp ne i32 %x, 0
  ret i1 %0

Expected:

  ret i1 true

cc #49137

@scottmcm scottmcm added I-slow Issue: Problems and improvements with respect to performance of generated code. WG-llvm Working group: LLVM backend code generation labels Apr 3, 2018
sophiebits added a commit to sophiebits/wasm-bindgen that referenced this issue Apr 21, 2018
I noticed in the generated wasm for examples/dom that `JsStatic` takes two words of storage (for its `Option<JsValue>`) when it would seem to only need one, if we mandate that `0` is not a valid heap index. Changing the `.idx` type to `NonZeroU32` fixes this.

However, LLVM doesn't know that these values are non-zero yet (rust-lang/rust#49572) which means in the implementation of Option's `get_or_insert_with` (https://doc.rust-lang.org/nightly/src/core/option.rs.html#774) LLVM is no longer able to prove that the `unreachable!()` branch is never hit and thus JsStatic ends up pulling in all the panic infrastructure where it previously didn't. This results in the examples/dom output being much larger, rather than smaller as I had hoped. Probably makes sense to wait for the upstream issue to be fixed so that this can be landed without regressing anything.
@hanna-kruppe
Copy link
Contributor

In writing #50156 I found out that LLVM supports range metadata on call instructions. So you can have %x.2 = call i32 @NonZeroU32_get(i32 %x.1), !range !33 and it will use the range metadata to reason about the return value. If I add the range metadata manually to the IR for the above test case, the optimization indeed fires. So we can probably fix some of this (maybe it falls apart in more complex cases) by emitting range metadata for return values, which we currently don't. cc @eddyb

@hanna-kruppe
Copy link
Contributor

Filed #50157 for adding metadata to calls, but this issue is more general -- once the get() gets inlined (or if it never gets called to begin with, e.g., because the u32 is accessed by type punning), we'd lose the information from the metadata on the call. The code in the OP is optimized because instcombine turns the compare into true before inlining happens, but in more complex cases #50156 might also help (though that only helps if the NonZero* is passed in as argument, it doesn't help with locals or if foo is inlined).

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Apr 22, 2018

Scratch that, range metadata on return values doesn't help with this issue since get returns a plain old u32. (thanks @nox)

@XAMPPRocky XAMPPRocky added C-enhancement Category: An issue proposing an enhancement or a PR with one. A-codegen Area: Code generation T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 29, 2018
@cuviper
Copy link
Member

cuviper commented Sep 21, 2018

With debuginfo, we emit an alloca for %x that does have range metadata when loaded from. But it appears that is still not sufficiently propagated through optimization. https://rust.godbolt.org/z/sYjptQ

define internal i32 @_ZN4core3num10NonZeroU323get17h3d02812f86f2e907E(i32) unnamed_addr #0 !dbg !4 {
  %self = alloca i32, align 4
  store i32 %0, i32* %self, align 4
  call void @llvm.dbg.declare(metadata i32* %self, metadata !19, metadata !DIExpression()), !dbg !21
  %1 = load i32, i32* %self, align 4, !dbg !22
  ret i32 %1, !dbg !23
}

define zeroext i1 @_ZN7example3foo17h7a9a6aa0879e3714E(i32) unnamed_addr #1 !dbg !24 {
  %x = alloca i32, align 4
  store i32 %0, i32* %x, align 4
  call void @llvm.dbg.declare(metadata i32* %x, metadata !29, metadata !DIExpression()), !dbg !30
  %1 = load i32, i32* %x, align 4, !dbg !31, !range !32
  %2 = call i32 @_ZN4core3num10NonZeroU323get17h3d02812f86f2e907E(i32 %1), !dbg !31
  br label %bb1, !dbg !31

bb1:                                              ; preds = %start
  %3 = icmp ne i32 %2, 0, !dbg !31
  ret i1 %3, !dbg !33
}

[...]
!32 = !{i32 1, i32 0}

optimized:

define zeroext i1 @_ZN7example3foo17h7a9a6aa0879e3714E(i32) unnamed_addr #0 !dbg !4 {
  call void @llvm.dbg.value(metadata i32 %0, metadata !21, metadata !DIExpression()), !dbg !22
  %1 = icmp ne i32 %0, 0, !dbg !23
  ret i1 %1, !dbg !24
}

@scottmcm
Copy link
Member Author

scottmcm commented Dec 6, 2018

Copying another example of an unnecessary check from #54868:

use std::num::NonZeroU64;
pub fn foo(x: u64, y: NonZeroU64) -> u64 {
    x / y.get()
}

https://play.rust-lang.org/?version=nightly&mode=release&gist=432cdd4e395ad81d031858ce7b5e3980

@nikic
Copy link
Contributor

nikic commented Dec 6, 2018

Would it be okay to special-case NonZero*::get() in the compiler to add the range metadata to the call?

@nikic nikic added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Dec 6, 2018
@eddyb
Copy link
Member

eddyb commented Dec 6, 2018

Maybe we can start adding range metadata on calls and change NonZero::get to do id(self).0? Will LLVM preserve that information?

@hanna-kruppe
Copy link
Contributor

The id(self) part will have range metadata but it won't be visible to callers of NonZero::get -- at least until after inlining, but inlining should/will remove the id call too.

@eddyb
Copy link
Member

eddyb commented Dec 6, 2018

@rkruppe I was hoping inlining would put the range metadata on the calls of NonZero::get, but I guess there's no place to keep it around in the meanwhile.

Why can't LLVM just come up with a design for "value restrictions" (e.g. range, nonnull, align, dereferenceable) that are reused between loads, calls and function signature argument/returns?

Is it really that hard, compared to the mess it has now?

@nikic
Copy link
Contributor

nikic commented Dec 6, 2018

The three possibilities I see are:

  • Add support to LLVM to propagate range metadata from a return value to callsites. I think this will not be particularly fruitful, because it would likely happen too late anyway. E.g. current func attr inference passes run post-inlining.
  • Add support for metadata on function parameters, as suggested in Implement & use metadata on arguments in LLVM #50156. This will at least cover the cases discussed here, where the NonZeroN is passed as an argument.
  • Special-case NonZeroN::get() calls on our side to also include range metadata. This will also get lost after inlining, but it will be usable for early optimization.

@hanna-kruppe
Copy link
Contributor

@eddyb

I was hoping inlining would put the range metadata on the calls of NonZero::get, but I guess there's no place to keep it around in the meanwhile.

The NonZero::get calls should/will also get inlined, so after inlining runs there are no calls left to put metadata on.

Why can't LLVM just come up with a design for "value restrictions" (e.g. range, nonnull, align, dereferenceable) that are reused between loads, calls and function signature argument/returns?

Metadata and attributes mostly cover all of these locations (AFAIK the only combination missing is metadata on parameters and returns), what's really lacking is keeping this information around when the calls and loads that have these annotations are eliminated (by SROA and inlining). Why that is difficult is a big topic and I'm not even sure what factors dominate, but for example we both know the downsides of putting lots of explicit instructions asserting these fact into the instruction stream (e.g. via calls to llvm.assume).

@eddyb
Copy link
Member

eddyb commented Dec 6, 2018

Metadata and attributes mostly cover all of these locations

IMO a lot of the difficulty in getting everything working is the lack of an unified system and representation which means you have to handle 2-3 different systems manually.

@hanna-kruppe
Copy link
Contributor

I don't really follow. The work of emitting all the relevant attributes and metadata that exist today has already been done and we still have this issue, and when metadata on parameters are added, we'll still have serious issues (e.g., when a NonZero is constructed locally -- optimizing checks there requires flow-sensitive analysis which LLVM isn't always very good at, and frankly it's a hard problem).

@tstellar
Copy link

You may want to experiment using the llvm.assume intrinsic: https://llvm.org/docs/LangRef.html#llvm-assume-intrinsic

@crlf0710
Copy link
Member

Sorry if i'm being too naive, but isn't adding a core::intrinsics::assume() call within the get() function implementation enough to get it right?

@eddyb
Copy link
Member

eddyb commented Feb 21, 2020

assume tends to slow down LLVM a lot and even (ironically) block optimizations. I wonder if that got better recently.

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
AngelicosPhosphoros added a commit to AngelicosPhosphoros/rust that referenced this issue Dec 30, 2023
LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang#119422
Related to llvm/llvm-project#76628
Related to rust-lang#49572
@AngelicosPhosphoros
Copy link
Contributor

Made a LLVM issue for adding niche info to function parameters.
llvm/llvm-project#76628

bors added a commit to rust-lang-ci/rust that referenced this issue Dec 30, 2023
…get_assume_nonzero, r=<try>

Add assume into `NonZeroIntX::get`

LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang#119422
Related to llvm/llvm-project#76628
Related to rust-lang#49572
AngelicosPhosphoros added a commit to AngelicosPhosphoros/rust that referenced this issue Jan 2, 2024
LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang#119422
Related to llvm/llvm-project#76628
Related to rust-lang#49572
AngelicosPhosphoros added a commit to AngelicosPhosphoros/rust that referenced this issue Jan 6, 2024
LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang#119422
Related to llvm/llvm-project#76628
Related to rust-lang#49572
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 12, 2024
…get_assume_nonzero, r=<try>

Add assume into `NonZeroIntX::get`

LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang#119422
Related to llvm/llvm-project#76628
Related to rust-lang#49572
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 12, 2024
…get_assume_nonzero, r=scottmcm

Add assume into `NonZeroIntX::get`

LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang#119422
Related to llvm/llvm-project#76628
Related to rust-lang#49572
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Jan 13, 2024
…e_nonzero, r=scottmcm

Add assume into `NonZeroIntX::get`

LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang/rust#119422
Related to llvm/llvm-project#76628
Related to rust-lang/rust#49572
@mvelbaum
Copy link

mvelbaum commented Mar 8, 2024

Made a LLVM issue for adding niche info to function parameters. llvm/llvm-project#76628

Looks like this was done in llvm/llvm-project#83171

@mickvangelderen
Copy link
Contributor

Made a LLVM issue for adding niche info to function parameters. llvm/llvm-project#76628

Looks like this was done in llvm/llvm-project#83171

And now it's being reverted due to a memory leak 😅

@scottmcm
Copy link
Member Author

...but it's back in llvm/llvm-project#84617

lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Apr 7, 2024
…e_nonzero, r=scottmcm

Add assume into `NonZeroIntX::get`

LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang/rust#119422
Related to llvm/llvm-project#76628
Related to rust-lang/rust#49572
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 27, 2024
…e_nonzero, r=scottmcm

Add assume into `NonZeroIntX::get`

LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang/rust#119422
Related to llvm/llvm-project#76628
Related to rust-lang/rust#49572
bors added a commit to rust-lang-ci/rust that referenced this issue Jul 31, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.

r​? `@nikic`
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 7, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.

r​? `@nikic`
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 9, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.

r​? `@nikic`
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 9, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.

try-job: i686-gnu
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 10, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 12, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 12, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.
lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Aug 13, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang/rust#127513

closes rust-lang/rust#50156
cc rust-lang/rust#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.
RalfJung pushed a commit to RalfJung/miri that referenced this issue Aug 14, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang/rust#127513

closes rust-lang/rust#50156
cc rust-lang/rust#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.
@veera-sivarajan veera-sivarajan linked a pull request Dec 21, 2024 that will close this issue
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 27, 2024
…r=Mark-Simulacrum

Add Four Codegen Tests

Closes rust-lang#74615
Closes rust-lang#123216
Closes rust-lang#49572
Closes rust-lang#93514

This PR adds four codegen tests. The FileCheck assertions were generated with the help of `update_test_checks.py` and `update_llc_test_checks.py` from the LLVM project.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-llvm Working group: LLVM backend code generation
Projects
None yet
Development

Successfully merging a pull request may close this issue.