Fixed worst-case miri performance with lossy string decoding #98592
Why does this help Miri performance? |
@saethlin knows the "why" better than I do, but could you start a perf run to see if there's any performance impact? |
Phew, that's a bit bigger of a diff than I was hoping for. Since this is basically optimizing the debug codegen (but almost a more dramatic version of that), I think it would be best if this PR were very small and had a clear comment explaining exactly why we want to write this code in a particular way. For example, I wonder if simply removing the

But the speedup is valuable as confirmation that we correctly identified the cause of the problem. I pointed out this pattern:

    while i < self.source.len() {
        let byte = unsafe { *self.source.get_unchecked(i) };

I long suspected that something like this could cause Stacked Borrows blowup, but so far I have failed to build a tight reproducer of the blowup. Perhaps one can be extracted based on this. I think this ends up with a large allocation where the stack-merging code totally fails because each byte has a different pattern of tags. Since this is a slice, each call to |
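The loop shape quoted in this comment can be isolated as a small sketch. The function names `sum_indexed` and `sum_iter` are hypothetical and do not appear in the PR; the first mirrors the `while i < len` plus `get_unchecked(i)` pattern, and the second is one safe formulation that produces the same result. This is only meant to show the pattern in a self-contained form, not to serve as a confirmed reproducer of the slowdown.

```rust
/// Hypothetical helper: walks the slice with an index and `get_unchecked`,
/// mirroring the loop shape discussed in this thread.
fn sum_indexed(x: &[u8]) -> u64 {
    let mut total = 0u64;
    let mut i = 0;
    while i < x.len() {
        // SAFETY: `i < x.len()` per the loop condition.
        let byte = unsafe { *x.get_unchecked(i) };
        total += u64::from(byte);
        i += 1;
    }
    total
}

/// Hypothetical helper: the same computation written with a safe iterator.
fn sum_iter(x: &[u8]) -> u64 {
    x.iter().map(|&b| u64::from(b)).sum()
}
```

Calling both functions on the same buffer (for example `vec![1u8; 4096]`) and comparing their running time under Miri would be one way to check whether the indexed form is the expensive one.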
Sure can, but that will just test perf of rustc itself, which I doubt is a heavy user of this code. @bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 0633c81 with merge dc2897de64ccc546d1d9a10d8d93ac2bd641ef30... |
I don't think this will show on a compiler perf run. Imo we should be benchmarking the implementation directly. Edit: Agh, didn't see Ralf's message |
|
Ah, that makes a lot of sense. |
I ran some benches and it looks like there's little to no difference between the two Current:
This PR:
|
Did you try running the benchmarks in |
☀️ Try build successful - checks-actions |
Queued dc2897de64ccc546d1d9a10d8d93ac2bd641ef30 with parent 8e52fa8, future comparison URL. |
Finished benchmarking commit (dc2897de64ccc546d1d9a10d8d93ac2bd641ef30): comparison url.
Instruction count
Max RSS (memory usage)
Cycles
This benchmark run did not return any relevant results for this metric.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never |
Ran the alloc
This PR:
|
@bors delegate=saethlin |
✌️ @saethlin can now approve this pull request |
while i < self.source.len() {
// SAFETY: `i < self.source.len()` per previous line.
let length = self.source.len();
let mut current = self.source.as_ptr();
// SAFETY: current + length is one past the end of the allocation
let (start, end, mut valid_up_to) = unsafe { (current, current.add(length), current) };

while current < end {
// SAFETY: `current < end` per previous line.
// For some reason the following are both significantly slower:
// while let Some(&byte) = self.source.get(i) {
// while let Some(byte) = self.source.get(i).copied() {
let byte = unsafe { *self.source.get_unchecked(i) };
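For readers who want the pointer-based loop shape from this hunk in isolation, here is a minimal sketch. The function `ascii_prefix_len` is hypothetical and is not the PR's code: it only borrows the idea of computing `start` and `end` pointers once and then walking `current < end`, applied here to the simpler task of measuring the leading ASCII run of a byte slice.

```rust
/// Hypothetical helper (not from the PR): returns the number of leading
/// ASCII bytes in `source`, using a start/end raw-pointer loop.
fn ascii_prefix_len(source: &[u8]) -> usize {
    let length = source.len();
    let start = source.as_ptr();
    // SAFETY: `start + length` is one past the end of the slice, which is
    // valid to compute as long as it is never dereferenced.
    let end = unsafe { start.add(length) };
    let mut current = start;

    while current < end {
        // SAFETY: `start <= current < end`, so `current` points at a byte
        // inside the slice.
        let byte = unsafe { *current };
        if byte >= 128 {
            break;
        }
        // SAFETY: `current < end`, so advancing by one stays within the
        // slice or lands exactly one past its end.
        current = unsafe { current.add(1) };
    }

    // SAFETY: both pointers are derived from the same slice, and
    // `current >= start`, so the offset is non-negative and in bounds.
    unsafe { current.offset_from(start) as usize }
}
```

The slice is borrowed once via `as_ptr()`, so the loop body never calls a slice method per byte; whether that is what makes the difference under Miri is the question this review thread is trying to answer.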
This PR as-written adds a lot of `unsafe`. I do not think that we should do that without sufficient justification.
How much of a perf impact do we see on Miri backtrace printing and the standard library benchmarks from only adjusting these lines? The specific pattern we want to avoid is this: https://github.com/rust-lang/miri/blob/8fdb720329d7674a878a8252fe4b79ef93d6ffec/bench-cargo-miri/slice-get-unchecked/src/main.rs#L8-L9
while i < x.len() {
let _element = unsafe { *x.get_unchecked(i) };
I'm not completely opposed to the pervasive sort of changes that you've implemented in the rest of this function, but we need benchmarking that makes the case for those changes, as opposed to just tweaking the ASCII fast path.
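One way to gather the numbers asked for here is a small driver around `String::from_utf8_lossy`, run once natively and once under Miri. The sketch below is an assumption about what such a driver could look like; it is not the test case from the linked Miri issue, and the helper names `decode_lossy` and `time_decode` are hypothetical.

```rust
use std::time::{Duration, Instant};

/// Hypothetical workload: `len` bytes of ASCII with one invalid byte at the
/// end, decoded lossily so the ASCII fast path dominates the work.
fn decode_lossy(len: usize) -> String {
    let mut bytes = vec![b'a'; len];
    if let Some(last) = bytes.last_mut() {
        // 0xFF never appears in valid UTF-8, so it is replaced by U+FFFD.
        *last = 0xFF;
    }
    String::from_utf8_lossy(&bytes).into_owned()
}

/// Hypothetical helper: wall-clock time for one lossy decode of `len` bytes.
fn time_decode(len: usize) -> Duration {
    let begin = Instant::now();
    let decoded = decode_lossy(len);
    let elapsed = begin.elapsed();
    // Use the result so the decode cannot be optimized away.
    assert_eq!(decoded.chars().count(), len);
    elapsed
}
```

Comparing `time_decode` for a few input sizes before and after changing only the ASCII fast path would show how much of the speedup comes from that one loop.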
Addresses miri/#2273 (I'm not sure I'd call it a fix since this situation could conceivably happen in other code, so it's more of a bandaid)
Reduces the runtime of the test case from miri/#2273 (comment) from ~forever to a little over 5 seconds on my machine (Windows; it could possibly be faster on Linux)
cc @RalfJung
r? @saethlin