Introduce deduced parameter attributes, and use them for deducing readonly
on indirect immutable freeze by-value function parameters.
#103172
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt
(rust-highfive has picked a reviewer for you, use r? to override)
Force-pushed from 9cdbc31 to 2d670be.
Force-pushed from 2d670be to 9603c77.
The patch is updated to use the Visitor to detect mutations of parameters. I'll mark it as non-draft if there are no more comments once the tests pass locally.
Force-pushed from 9603c77 to 0e8a4e6.
This seems ready. Those two failures confuse me: I don't mutate the MIR at all, and these aren't codegen tests…
Force-pushed from 0e8a4e6 to bf18d56.
@bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit bf18d564e3d54e08ba4372e1d4b20ef1b6e3afaa with merge 9077e397fbfc2a1a5945228575b6ce77985fd1f8...
Force-pushed from bf18d56 to 11fc0a7.
Updated the PR to address comments. I added a new test to ensure that we don't mark non-freeze types as `readonly`. I also added a comment explaining why I believe the optimization remains valid even though a move semantically stores `undef` to the moved-from value.
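As an illustration of why non-freeze types have to be excluded, here is a minimal sketch. It is not the test added in this PR; the `Counter` type and `bump` function are hypothetical, and whether such a struct is actually passed indirectly depends on the target ABI. The point is only that a type containing a `Cell` can change through a shared reference, so an immutable by-value binding tells us nothing about whether its memory is written.

```rust
use std::cell::Cell;

// Hypothetical example type. `Cell` wraps an `UnsafeCell`, so `Counter`
// is not freeze: its bytes may change behind a shared reference.
struct Counter {
    hits: Cell<u32>,
    payload: [u64; 8],
}

// `c` is an immutable by-value parameter and is never assigned to,
// yet `Cell::set` writes into the parameter's memory through `&c.hits`.
fn bump(c: Counter) -> u32 {
    c.hits.set(c.hits.get() + 1);
    c.hits.get() + c.payload[0] as u32
}

fn main() {
    let c = Counter { hits: Cell::new(41), payload: [0; 8] };
    println!("{}", bump(c));
}
```

A pass that only looked for direct assignments to the parameter would see no mutation in `bump`, which is why freeze-ness has to be checked separately.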
☀️ Try build successful - checks-actions
Queued a5cf94e7f6c6d3272682f3eeeb831ec529decd2f with parent dcb3761, future comparison URL.
Finished benchmarking commit (a5cf94e7f6c6d3272682f3eeeb831ec529decd2f): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This benchmark run did not return any relevant results for this metric.
Cycles: This benchmark run did not return any relevant results for this metric.
@bors delegate+ code and perf lgtm now. r=me with commits squashed
✌️ @pcwalton can now approve this pull request
…adonly` on indirect immutable freeze by-value function parameters.

Right now, `rustc` only examines function signatures and the platform ABI when determining the LLVM attributes to apply to parameters. This results in missed optimizations, because there are some attributes that can be determined via analysis of the MIR making up the function body. In particular, `readonly` could be applied to most indirectly-passed by-value function arguments (specifically, those that are freeze and are observed not to be mutated), but it currently is not.

This patch introduces the machinery that allows `rustc` to determine those attributes. It consists of a query, `deduced_param_attrs`, that, when evaluated, analyzes the MIR of the function to determine supplementary attributes. The results of this query for each function are written into the crate metadata so that the deduced parameter attributes can be applied to cross-crate functions. In this patch, we simply check the parameter for mutations to determine whether the `readonly` attribute should be applied to parameters that are indirect immutable freeze by-value. More attributes could conceivably be deduced in the future: `nocapture` and `noalias` come to mind.

Adding `readonly` to indirect function parameters where applicable enables some potential optimizations in LLVM that are discussed in [issue 103103] and [PR 103070] around avoiding stack-to-stack memory copies that appear in functions like `core::fmt::Write::write_fmt` and `core::panicking::assert_failed`. These functions pass a large structure unchanged by value to a subfunction that also doesn't mutate it. Since the structure in this case is passed as an indirect parameter, it's a pointer from LLVM's perspective. As a result, the intermediate copy of the structure that our codegen emits could be optimized away by LLVM's MemCpyOptimizer if it knew that the pointer is `readonly nocapture noalias` in both the caller and callee. We already pass `nocapture noalias`, but we're missing `readonly`, as we can't determine whether a by-value parameter is mutated by examining the signature in Rust. I didn't have much success with having LLVM infer the `readonly` attribute, even with fat LTO; it seems that deducing it at the MIR level is necessary.

No large benefits should be expected from this optimization *now*; LLVM needs some changes (discussed in [PR 103070]) to more aggressively use the `noalias nocapture readonly` combination in its alias analysis. I have some LLVM patches for these optimizations and have had them looked over. With all the patches applied locally, I enabled LLVM to remove all the `memcpy`s from the following code:

```rust
fn main() {
    println!("Hello {}", 3);
}
```

which is a significant codegen improvement over the status quo. I expect that if this optimization kicks in in multiple places even for such a simple program, then it will apply to Rust code all over the place.

[issue 103103]: rust-lang#103103
[PR 103070]: rust-lang#103070
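To make the shape described in the commit message concrete, here is a small sketch of the three kinds of by-value parameters involved. The `Big` type and all function names are hypothetical, and the sketch assumes the struct is large enough that the ABI passes it via a pointer; it makes no claim about which attributes `rustc` actually emits for any of these functions.

```rust
// Hypothetical type, assumed large enough to be passed indirectly
// (as a pointer to a stack copy) on typical ABIs.
#[derive(Clone, Copy)]
struct Big {
    data: [u64; 16],
}

// Only reads from `b`: the kind of freeze, never-mutated by-value
// parameter the commit message is concerned with.
fn sum(b: Big) -> u64 {
    b.data.iter().sum()
}

// Passes a large structure unchanged to a subfunction that also does not
// mutate it, the pattern where an intermediate stack copy may be emitted.
fn forward(b: Big) -> u64 {
    sum(b)
}

// Writes to its parameter in place, so a mutation is observable in the body.
fn bump_first(mut b: Big) -> u64 {
    b.data[0] += 1;
    b.data[0]
}

fn main() {
    let b = Big { data: [1; 16] };
    println!("{} {}", forward(b), bump_first(b));
}
```

None of this is visible from the signatures alone: `sum` and `bump_first` differ only in their bodies, which is why the deduction has to inspect the MIR.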
Force-pushed from e4e37f0 to da630ac.
cc @rust-lang/wg-unsafe-code-guidelines From my understanding, this optimization won't change the behavior of any sound programs. If we create a pointer/reference to a function argument (e.g. However, I haven't seen any mention of |
☀️ Test successful - checks-actions |
Finished benchmarking commit (eecde58): comparison URL.
Overall result: ❌✅ regressions and improvements - no action needed
@rustbot label: -perf-regression
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
@Aaron1011 is there a summary of what happens here for someone who doesn't live and breathe LLVM IR?^^ My question is basically the same as in #103103: What requirements do we need to impose on the MIR level to justify this attribute? "indirect immutable freeze by-value function parameter" uses a lot of terms that don't exist in MIR, so I don't understand what it means. How can a parameter be both indirect and by-value?!?
    PlaceContext::MutatingUse(..)
    | PlaceContext::NonMutatingUse(NonMutatingUseContext::Move) => {
        // This is a mutation, so mark it as such.
        self.mutable_args.insert(local.index() - 1);
`NonMutatingUseContext::Move` is a mutation? Looks like naming went wrong somewhere...?
It's not a mutation for borrowck purposes, since you don't need `let mut` to move out. The opsem might disagree with that, but we should decide the opsem first and then consider renaming something here.
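For readers unfamiliar with the MIR visitor, the logic under discussion can be modelled in a few lines. This is a simplified sketch, not the code from the patch: `Use` and `DeduceReadOnly` below are hypothetical stand-ins for rustc's `PlaceContext` and the visitor type, and plain `usize` indices stand in for MIR locals. The only facts it relies on are that local `_0` is the return place and the arguments occupy locals `_1..=_arg_count`, which is why the snippet under review subtracts 1.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified stand-in for rustc's `PlaceContext`.
#[derive(Clone, Copy)]
enum Use {
    Copy,    // non-mutating: read the value
    Inspect, // non-mutating: e.g. read a discriminant or length
    Move,    // classified as non-mutating by borrowck, treated as a write here
    Store,   // mutating: assignment to the place
}

// Hypothetical model of the visitor: records which arguments are written.
struct DeduceReadOnly {
    arg_count: usize,
    mutable_args: BTreeSet<usize>,
}

impl DeduceReadOnly {
    fn visit_local(&mut self, local: usize, context: Use) {
        // Local 0 is the return place; arguments are locals 1..=arg_count.
        if local == 0 || local > self.arg_count {
            return;
        }
        match context {
            // Both a store and a move count as a mutation of the argument.
            Use::Store | Use::Move => {
                self.mutable_args.insert(local - 1);
            }
            Use::Copy | Use::Inspect => {}
        }
    }
}

fn main() {
    let mut v = DeduceReadOnly { arg_count: 3, mutable_args: BTreeSet::new() };
    v.visit_local(1, Use::Copy);    // argument 0: only read
    v.visit_local(2, Use::Move);    // argument 1: moved out
    v.visit_local(3, Use::Store);   // argument 2: assigned to
    v.visit_local(3, Use::Inspect); // a later read does not undo the mark
    v.visit_local(4, Use::Store);   // not an argument, ignored
    println!("{:?}", v.mutable_args);
}
```

Grouping `Move` with the mutating uses is the conservative choice the review comment questions: it is named as non-mutating, but the analysis treats the moved-from argument as written.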
What is |
@RalfJung as far as I can tell, this only affects |