Consider performance improvements #50

ndmitchell · 2019-04-15T14:27:22Z

We at Digital Asset took your work and made a bunch of performance improvements to the Decode module. The code is at https://github.com/digital-asset/daml/blob/master/nix/third-party/proto3-wire/src/Proto3/Wire/Decode.hs and the approximate list of changes I've been able to find are:

Switch to the Applicative instance for proto3-wire.
Use List instead of Map for RawMessage decoding
More INLINE instances.
Remove cereal dependency, do things more directly
Leave RawMessage a HashMap instead of converting
Get rid of Seq

Are you interested in taking changes that optimise the performance?

ixmatus · 2019-04-15T14:46:49Z

@ndmitchell yes!

gbaz · 2019-04-15T14:49:20Z

Absolutely!

Note that I think we have some divergence because we've made performance improvements to Decode as well in the meantime, some of which are the same, some of which are different. (E.g. Seq is gone too, but also we have a manually unrolled varint encoding, etc.)

C.f. #46 for the changes.

So it would require thinking a bit to figure out which changes in your branch can port over and make a difference in the code as it currently stands.

ndmitchell · 2019-04-15T14:52:22Z

Did you write any benchmarks? Our benchmarks were in the context of our larger project, so probably don't reproduce easily.

gbaz · 2019-04-15T16:47:14Z

Same for us, unfortunately.

ndmitchell · 2019-04-16T20:48:20Z

FWIW I ran the encoding benchmark I had and your version is 20% faster than ours. Yay! If you can't see anything in our patch set that looks valuable I'll close this ticket.

I didn't really measure decoding performance. That used to be the bottleneck for us, and where we spent most energy optimising. Now it's a code path we don't use that often.

gbaz · 2019-04-16T20:53:30Z

Good to know that all that performance work paid off! (It was rather exhausting). Both encoding and decoding are still pretty costly to us given the volume of data we work with, but we didn't have any great ideas for further improvements. If your team has any more bright ideas, please chime in!

As for the current patch set, the one thing we didn't really do or consider at all was the INLINE annotations. If you have benches that show that they bring improvement on the current codebase, that would be great.

ndmitchell · 2019-04-16T21:00:29Z

At some point in time in the past the INLINE pragmas helped. I can't say much about now.

I also note we switched to unchecked shifts. Not sure if GHC automatically optimises around that?

We also used HashMap rather than IntMap. Although that may have been because the rest of our stuff was using that.

There is also a BS.copy in the bytes that I was surprised to see surviving optimisation work.

gbaz · 2019-04-17T14:27:01Z

Maybe @j6carey can speak to the copy. I think we want it for our purposes to reduce overall retention.

Also not sure if the unchecked shifts are an improvement or not -- we might get equivalent results by simply having specialized to a single dictionary. Both are certainly worth considering...

j6carey · 2019-04-17T17:31:29Z

@gbaz , @ndmitchell , do you mean the copy during decoding? I have not studied decoding very much.

I think checked shifts whose shift distance is a compile-time constant get optimized to the same thing as unchecked shifts.

ndmitchell · 2019-04-17T18:02:44Z

@j6carey yes, the copy during decoding. We omitted that in our fork.

neil-da mentioned this issue Apr 17, 2019

Move to upstream versions of Haskell grpc libraries digital-asset/daml#501

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider performance improvements #50

Consider performance improvements #50

ndmitchell commented Apr 15, 2019

ixmatus commented Apr 15, 2019

gbaz commented Apr 15, 2019

ndmitchell commented Apr 15, 2019

gbaz commented Apr 15, 2019

ndmitchell commented Apr 16, 2019

gbaz commented Apr 16, 2019

ndmitchell commented Apr 16, 2019

gbaz commented Apr 17, 2019

j6carey commented Apr 17, 2019 •

edited

Loading

ndmitchell commented Apr 17, 2019

Consider performance improvements #50

Consider performance improvements #50

Comments

ndmitchell commented Apr 15, 2019

ixmatus commented Apr 15, 2019

gbaz commented Apr 15, 2019

ndmitchell commented Apr 15, 2019

gbaz commented Apr 15, 2019

ndmitchell commented Apr 16, 2019

gbaz commented Apr 16, 2019

ndmitchell commented Apr 16, 2019

gbaz commented Apr 17, 2019

j6carey commented Apr 17, 2019 • edited Loading

ndmitchell commented Apr 17, 2019

j6carey commented Apr 17, 2019 •

edited

Loading