-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suboptimal code generation for omittable bound checks #91109
Comments
To get memset as in the clang version you can replace for j in 0..n {
d[i][j] = 0;
} with d[i][0..n].fill(0); |
The code example is pretty synthetic and doesn't make much sense on its own; however, similar effects can be observed in slightly more complicated programs, e.g. Floyd—Warshall algorithm. My point is that the missed optimization is pretty important in real-life programs, as it's not uncommon to be unable to avoid array indexing, and bounds-checking overhead is often claimed to be negligible as the checks are likely to be optimized out. And of course I expect them to be optimized out when they are optimized out by clang in an equivalent C++ program. |
There are multiple ways to solve the problem, like: pub fn zero(d: &mut [Vec<i32>]) {
let n = d.len();
for i in 0 .. n {
let row = &mut d[i][.. n];
for j in 0 .. n {
row[j] = 0;
}
}
} What's the best way to solve this problem depends on the specific situation. |
True. However, the issue is about poor codegen for this particular piece of code, not about looking for a way to rewrite it. |
Your original code contains an assert, its purpose is to tell something to the optimizer. Often that helps, but you can't be sure the compiler will use the information given by the assert to its fullest. It's a "best effort" for the compiler. So you are waiting for the compiler to do what you want based on "best efforts". |
Thanks for including the I suggest that you prefer re-slicing to pub fn zero(d: &mut [Vec<i32>]) {
let n = d.len();
for i in 0..n {
let di = &mut d[i][..n];
for j in 0..n {
di[j] = 0;
}
}
} https://rust.godbolt.org/z/f19nasGPh And that's a general strategy for improving bound-check-elimination. It's how the the new |
Compiler Explorer link: https://godbolt.org/z/KMTfY18Wf
I tried this code:
With
-C llvm-args=-enable-constraint-elimination
, I expected the bound checks to be optimized out as they're performed manually byassert!
. However, both assert and bound checks were present in the compiled code. The inner loop looks really dumb in terms of code generation:rsi != rdx
is the loop invariant, but it's verified on each iteration.At first I suspected the issue to be entirely on the LLVM side, but bound checks are omitted in equivalent C++ code (which is also present at the Compiler Explorer link), and the whole inner loop is replaced with a single
memset
call.Meta
rustc --version --verbose
:The text was updated successfully, but these errors were encountered: