Speed up leb128 encoding and decoding for unsigned values. #46919
Conversation
r? @aidanhs (rust_highfive has picked a reviewer for you, use r? to override)
@bors try
⌛ Trying commit 43ad4fd with merge 2d54fb61881368133d872d8f878adcde8621da7f...
☀️ Test successful - status-travis
r? @sfackler
src/libserialize/leb128.rs
macro_rules! impl_read_unsigned_leb128 {
    ($fn_name:ident, $int_ty:ident) => (
        #[inline]
        pub fn $fn_name(data: &[u8], start_position: usize) -> ($int_ty, usize) {
This was here before, but why does this take a slice and an offset instead of just a slice?
Good question. Maybe there was a reason in the past. I'll change it.
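For reference, a reader that takes just a slice could return the decoded value together with the number of bytes consumed, so the caller advances its own cursor. A minimal sketch of that shape for u32 follows; the name read_leb128_u32 and the exact signature are illustrative, not the PR's actual code.

// Decode an unsigned LEB128 value from the start of `data`.
// Returns the value and the number of bytes consumed.
// Sketch only: assumes well-formed, non-truncated input.
#[inline]
pub fn read_leb128_u32(data: &[u8]) -> (u32, usize) {
    let mut result: u32 = 0;
    let mut shift = 0;
    let mut position = 0;
    loop {
        let byte = data[position];
        position += 1;
        result |= ((byte & 0x7F) as u32) << shift;
        if byte & 0x80 == 0 {
            return (result, position);
        }
        shift += 7;
    }
}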
Perf run queued.
Huh, that's very different from what the microbenchmarks showed. Seems like I need to iterate some more on this.
ping @michaelwoerister, just wanna make sure this doesn't fall off your radar!
I'm still in the process of setting up some good benchmarks that work with real-world data: https://github.com/michaelwoerister/encoding-bench. It's a bit of a side-project, so it will take a while.
Force-pushed from 43ad4fd to 53c2f44.
@bors try
Speed up leb128 encoding and decoding for unsigned values. Make the implementation for some leb128 functions potentially faster. @Mark-Simulacrum, could you please trigger a perf.rlo run?
☀️ Test successful - status-travis
@Mark-Simulacrum, could you do another perf run please?
The try commit is done, we're waiting for perf to collect data for the previous auto branch commit -- should be next in queue.
So ... this looks very good in all cases, except for
@bors try
@bors try
@bors retry
☀️ Test successful - status-travis
OK, success. Let's see how it does now.
Alright, queued. Should be a couple hours.
Posting the link for later. Doesn't work yet.
@Mark-Simulacrum Hm, the link doesn't seem to be working. Did I do something wrong?
The perf.rlo link works now. Numbers look good, I think. re-r? @sfackler
macro_rules! impl_write_unsigned_leb128 {
    ($fn_name:ident, $int_ty:ident) => (
        #[inline]
        pub fn $fn_name(out: &mut Vec<u8>, start_position: usize, mut value: $int_ty) -> usize {
It would be more verbose, but another strategy I've seen for this is just branching on the size of the value and avoiding the loop. Not sure which would be faster in rustc though.
So, running some tests shows that the following implementation for u32 is 10% faster when encoding metadata (while showing no improvement for the query-cache and the dep-graph):

// Branch on the number of LEB128 bytes needed instead of looping.
#[inline]
pub fn write_leb128_u32(out: &mut Vec<u8>, start_position: usize, value: u32) -> usize {
    if value < (1 << 7) {
        write_to_vec(out, start_position, value as u8);
        1
    } else if value < (1 << 14) {
        write_to_vec(out, start_position, (value as u8) | 0x80);
        write_to_vec(out, start_position + 1, (value >> 7) as u8);
        2
    } else if value < (1 << 21) {
        write_to_vec(out, start_position, (value as u8) | 0x80);
        write_to_vec(out, start_position + 1, ((value >> 7) as u8) | 0x80);
        write_to_vec(out, start_position + 2, (value >> 14) as u8);
        3
    } else if value < (1 << 28) {
        write_to_vec(out, start_position, (value as u8) | 0x80);
        write_to_vec(out, start_position + 1, ((value >> 7) as u8) | 0x80);
        write_to_vec(out, start_position + 2, ((value >> 14) as u8) | 0x80);
        write_to_vec(out, start_position + 3, (value >> 21) as u8);
        4
    } else {
        write_to_vec(out, start_position, (value as u8) | 0x80);
        write_to_vec(out, start_position + 1, ((value >> 7) as u8) | 0x80);
        write_to_vec(out, start_position + 2, ((value >> 14) as u8) | 0x80);
        write_to_vec(out, start_position + 3, ((value >> 21) as u8) | 0x80);
        write_to_vec(out, start_position + 4, (value >> 28) as u8);
        5
    }
}
A similar implementation for usize does a lot worse than the one from the PR. Not sure if it's worth the trouble since my test data is only from one crate.
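For comparison, a loop-based encoder in the spirit of what the generic macro produces could look like the sketch below. The write_to_vec helper here is an assumed stand-in for the one in src/libserialize/leb128.rs (append at the end, or overwrite in place), not a copy of it, and the name write_leb128_u32_loop is likewise just for illustration.

// Assumed stand-in for the write_to_vec helper used above: write `byte`
// at `position`, growing the vector by one when writing just past the end.
#[inline]
fn write_to_vec(out: &mut Vec<u8>, position: usize, byte: u8) {
    if position == out.len() {
        out.push(byte);
    } else {
        out[position] = byte;
    }
}

// Loop-based unsigned LEB128 encoder for u32, for comparison with the
// branching variant above. Emits 7 bits per byte, setting the high bit
// on every byte except the last; returns the number of bytes written.
#[inline]
pub fn write_leb128_u32_loop(out: &mut Vec<u8>, start_position: usize, mut value: u32) -> usize {
    let mut position = start_position;
    loop {
        let mut byte = (value & 0x7F) as u8;
        value >>= 7;
        if value != 0 {
            byte |= 0x80; // more bytes follow
        }
        write_to_vec(out, position, byte);
        position += 1;
        if value == 0 {
            return position - start_position;
        }
    }
}

The branching version trades code size for the absence of a loop-carried dependency, which is presumably where the metadata-encoding win comes from.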
Cool, thanks for checking it out!
@bors r+
📌 Commit 53c2f44 has been approved by
☀️ Test successful - status-appveyor, status-travis