Revert #83357 and achieve the same impact on serde-json by adding a check in Vec::spec_extend #83797

saethlin · 2021-04-03T00:12:46Z

No description provided.

rust-highfive · 2021-04-03T00:12:48Z

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

saethlin · 2021-04-03T00:13:29Z

r? @dtolnay
@rylev @klensy per #83357 (comment)

jyn514 · 2021-04-03T02:04:55Z

@bors try @rust-timer queue

rust-timer · 2021-04-03T02:04:57Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-04-03T02:05:04Z

⌛ Trying commit 0dc772e with merge 7e666f9e8f9482542bd6b4dd7e39126881298982...

bors · 2021-04-03T02:54:01Z

☀️ Try build successful - checks-actions
Build commit: 7e666f9e8f9482542bd6b4dd7e39126881298982 (7e666f9e8f9482542bd6b4dd7e39126881298982)

rust-timer · 2021-04-03T02:54:02Z

Queued 7e666f9e8f9482542bd6b4dd7e39126881298982 with parent 9b6c9b6, future comparison URL.

rust-timer · 2021-04-03T05:22:09Z

Finished benchmarking try commit (7e666f9e8f9482542bd6b4dd7e39126881298982): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

saethlin · 2021-04-03T16:31:24Z

I do not know why this is slower. I'm going to stand up the rustc-perf benchmarking locally and try to figure this out.

the8472 · 2021-04-03T16:59:08Z

Note that there have been some changes to the spec_extend code path (see #83726) since #83357 landed so it may optimize differently now compared to when the last perf run was made. Maybe measure the revert and the new check separately.

bjorn3 · 2021-04-04T07:55:36Z

@bors try @rust-timer queue

rust-timer · 2021-04-04T07:55:37Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-04-04T07:55:47Z

⌛ Trying commit 85bd865 with merge 13e6b40edb3de60b505f197a5a470cdd98f8fa74...

bors · 2021-04-04T08:56:19Z

☀️ Try build successful - checks-actions
Build commit: 13e6b40edb3de60b505f197a5a470cdd98f8fa74 (13e6b40edb3de60b505f197a5a470cdd98f8fa74)

rust-timer · 2021-04-04T08:56:21Z

Queued 13e6b40edb3de60b505f197a5a470cdd98f8fa74 with parent 0850c37, future comparison URL.

rust-timer · 2021-04-04T12:40:32Z

Finished benchmarking try commit (13e6b40edb3de60b505f197a5a470cdd98f8fa74): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

saethlin · 2021-04-04T17:20:59Z

I'm just more confused now. There's evidence posted in #83357 that the inlining tweaks were a regression, but now undoing them is also a regression??

the8472 · 2021-04-04T17:30:31Z

Oh, the link in #83357 (comment) points to the max-rss report, not instructions (the default).

max-rss tends to be noisy, but the reverts seem to be biased towards an improvement, i.e. it recovers the max-rss regression but also undoes the compile time improvements.

saethlin · 2021-04-04T18:02:20Z

Wow I was really looking in the wrong direction.

I have no idea why max-rss would be impacted by these changes. My instinct is that blaming a max-rss regression on #83357 is a mistake; it doesn't change when and with what arguments any allocator is invoked. If it changes the size of anything, it should reduce code size.

the8472 · 2021-04-04T18:58:11Z

> Wow I was really looking in the wrong direction.
>
> I have no idea why max-rss would be impacted by these changes. My instinct is that blaming a max-rss regression on #83357 is a mistake; it doesn't change when and with what arguments any allocator is invoked. If it changes the size of anything, it should reduce code size.

The PR was not part of a rollup, so the perf runs are a direct comparison with its parent commit. And we have three comparisons now and all point in the same direction so that makes it less likely to be noise.

Vec code is a significant fraction of all code, vec iterator code doubly so, due to monomorphization. So things are really sensitive to changes there.

So, reverting fixes the memory footprint issue. But it also undoes the instruction count improvements on some of the parts that involve little to no heavy lifting on the LLVM side (e.g. -check runs), so they're probably not due to more IR being generated but due to something else in the compiler returning to less efficient code.

So perhaps run one of the non-opt benchmarks under perf and see what code in the compiler is affected and whether we could improve the performance without the max-rss increase.

One possibility might be that you focused only on extend while the reserve changes had a broader impact.

the8472 · 2021-04-07T01:57:39Z

@saethlin do the serde speedups mostly come from Extend or FromIterator improvements? If it's about the latter then I think we can get reserve out of some FromIterator code paths.

saethlin · 2021-04-07T02:02:08Z

@the8472 The serde improvement(s) come from making std::io::Write::write_all into a Vec<u8> cheaper. I believe that forwards down to the SpecExtend code path I was poking at in this PR, or at least that's what benchmarking indicates.
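To make the claim above concrete, here is a minimal sketch showing that `write_all` on a `Vec<u8>` behaves like a plain append. The helper names `via_write_all` and `via_extend_from_slice` are made up for illustration and do not appear in serde-json or std; the sketch only checks that both routes produce the same bytes, not how either is compiled.

```rust
use std::io::Write;

/// Hypothetical helper: appends each chunk through the `io::Write` impl for `Vec<u8>`.
fn via_write_all(chunks: &[&[u8]]) -> Vec<u8> {
    let mut out: Vec<u8> = Vec::new();
    for chunk in chunks {
        // The trait returns io::Result even though appending to a Vec<u8> does not fail.
        out.write_all(chunk).expect("writing to a Vec<u8> should not fail");
    }
    out
}

/// Hypothetical helper: appends each chunk directly with `extend_from_slice`.
fn via_extend_from_slice(chunks: &[&[u8]]) -> Vec<u8> {
    let mut out: Vec<u8> = Vec::new();
    for chunk in chunks {
        out.extend_from_slice(chunk);
    }
    out
}

fn main() {
    // A serializer typically emits many small writes like these.
    let chunks: [&[u8]; 3] = [b"{\"a\":", b"1", b"}"];
    assert_eq!(via_write_all(&chunks), via_extend_from_slice(&chunks));
}
```

Because a serializer issues many small writes of this kind, any fixed per-call cost in the append path is multiplied, which is presumably why the benchmark is sensitive to it.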

the8472 · 2021-04-07T02:21:38Z

rust/library/std/src/io/impls.rs

Lines 384 to 387 in c051c5d

    fn write_all(&mut self, buf: &[u8]) -> io::Result<()> {
        self.extend_from_slice(buf);
        Ok(())
    }

calls

rust/library/alloc/src/vec/mod.rs

Lines 2118 to 2120 in c051c5d

    pub fn extend_from_slice(&mut self, other: &[T]) {
        self.spec_extend(other.iter())
    }

which should specialize to

rust/library/alloc/src/vec/spec_extend.rs

Lines 83 to 86 in c051c5d

    fn spec_extend(&mut self, iterator: slice::Iter<'a, T>) {
        let slice = iterator.as_slice();
        unsafe { self.append_elements(slice) };
    }

calls

rust/library/alloc/src/vec/mod.rs

Lines 1693 to 1700 in c051c5d

    #[inline]
    unsafe fn append_elements(&mut self, other: *const [T]) {
        let count = unsafe { (*other).len() };
        self.reserve(count);
        let len = self.len();
        unsafe { ptr::copy_nonoverlapping(other as *const T, self.as_mut_ptr().add(len), count) };
        self.len += count;
    }

So there's a reserve, but it's not the one in spec_extend that's relevant if you want to optimize serde.
And we should collapse some of that call-chain.

I think that warrants a few separate PRs: one for the serde improvements, and one purely for the revert, assuming the libs team prefers to keep max-rss low at the expense of the instructions:u losses (that pretty much seems to be the trade-off here).
And then I have an idea how to avoid one or two reserve calls which might claw back some of the instruction count losses.
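For readers following along, the "check before calling reserve" idea from the PR title could look roughly like the sketch below when written against the public `Vec` API. The function `append_with_capacity_check` is a hypothetical name and is not the code from this PR or from std; it mirrors the shape of `append_elements` above, with an explicit spare-capacity test in front of `reserve`. Whether such a check helps in practice depends on how `reserve` is inlined, which is the very thing the perf runs in this thread were measuring.

```rust
use std::ptr;

/// Hypothetical helper, not the std implementation: appends `src` to `dst`,
/// calling `reserve` only when the spare capacity is too small.
fn append_with_capacity_check<T: Copy>(dst: &mut Vec<T>, src: &[T]) {
    let count = src.len();
    let len = dst.len();
    // capacity() is never smaller than len(), so this subtraction cannot underflow.
    if dst.capacity() - len < count {
        dst.reserve(count);
    }
    // SAFETY: after the check above the buffer has room for `len + count` elements;
    // `src` cannot overlap `dst` because `dst` is borrowed mutably; and `T: Copy`
    // means a bitwise copy yields valid, independently owned values.
    unsafe {
        ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr().add(len), count);
        dst.set_len(len + count);
    }
}

fn main() {
    let mut buf: Vec<u8> = Vec::with_capacity(16);
    let before = buf.as_ptr();

    // Fits in the spare capacity: `reserve` is skipped and the buffer does not move.
    append_with_capacity_check(&mut buf, b"hello");
    assert_eq!(buf.as_ptr(), before);

    // Does not fit: `reserve` runs and the vector grows.
    append_with_capacity_check(&mut buf, &[b'!'; 32]);
    assert_eq!(buf.len(), 37);
}
```

The `T: Copy` bound is a simplification chosen to keep the unsafe block easy to justify; the real `append_elements` has no such bound.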

nnethercote · 2021-04-09T06:14:50Z

FWIW, I made a bunch of improvements to these code paths last year to reduce code size. At first the changes gave obvious improvements, but by the end things were getting unpredictable and it was hard to improve some things without regressing other things.

saethlin · 2021-05-22T17:13:40Z

I was never sure what the next steps were on here, and the codebase seems to have moved on a bit. I'm still here, but this PR isn't really.

saethlin added 2 commits April 2, 2021 19:49

Revert rust-lang#83357

42f909a

Check before calling Vec::reserve in spec_extend

0dc772e

rust-highfive assigned m-ou-se Apr 3, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 3, 2021

rust-highfive assigned dtolnay and unassigned m-ou-se Apr 3, 2021

saethlin marked this pull request as draft April 3, 2021 00:24

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 3, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 3, 2021

Revert "Check before calling Vec::reserve in spec_extend"

85bd865

This reverts commit 0dc772e.

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 4, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 4, 2021

the8472 mentioned this pull request Apr 16, 2021

extract code path shared between FromIterator and Extend #84255

Closed

saethlin closed this May 22, 2021

saethlin deleted the check-before-vec-reserve branch May 16, 2022 04:36
