bincode is not optimal #26406

Closed
4 tasks
sakridge opened this issue Jul 5, 2022 · 4 comments
Labels
stale [bot only] Added to stale content; results in auto-close after a week.

Comments

@sakridge
Member

sakridge commented Jul 5, 2022

Problem

The validator uses bincode for serialization and deserialization in many places, but bincode is known to be suboptimal.

Proposed Solution

Convert bincode use cases to faster serialization libraries such as speedy, borsh, or capnproto. A more universal format would also make it easier to interoperate with other languages and environments.

  • Transactions
  • Repair
  • Gossip
  • Shred payload
@bw-solana
Contributor

The major hangup I've hit while quickly hacking at this is deriving traits for the GenericArray struct (part of the Signature struct).

@bw-solana
Contributor

bw-solana commented Sep 2, 2022

Quick comparison of SerDes methods, obtained by running cargo test --release --package solana-ledger --lib -- shredder::tests::test_data_shredder --exact --nocapture
In all cases, 1000 entries of 64 B each were used to generate the shreds.

Bincode (Default)

  • serialize_time = 209us
  • 251 shreds with size 184
  • deserialize_time = 845us

Borsh

  • serialize_time = 169us
  • 262 shreds with size 184
  • deserialize_time = 311us

speedy

  • serialize_time = 60us
  • 288 shreds with size 184
  • deserialize_time = 460us

postcard

  • serialize_time = 193us
  • 237 shreds with size 184
  • deserialize_time = 383us

Code can be found on this branch. Note that the SerDes library must be selected via a feature (sdbincode, sdborsh, sdspeedy, sdpostcard), and that the branch does not include the changes made to a local version of the GenericArray crate (trivial trait derivations).
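For readers who want to reproduce this kind of measurement outside the Solana tree, here is a minimal, std-only sketch of timing a serializer and counting how many fixed-capacity chunks its output needs. Everything in it is an assumption for illustration: `serialize_entries` is a stand-in for whichever library is under test, and `chunks_needed` with an arbitrary capacity is not the shredder's actual sizing logic.

```rust
use std::time::{Duration, Instant};

/// Stand-in serializer (hypothetical): writes an 8-byte little-endian
/// entry count, then each entry as an 8-byte length followed by its bytes.
/// Replace the body with the library being compared.
fn serialize_entries(entries: &[Vec<u8>]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(entries.len() as u64).to_le_bytes());
    for entry in entries {
        out.extend_from_slice(&(entry.len() as u64).to_le_bytes());
        out.extend_from_slice(entry);
    }
    out
}

/// Number of chunks of `capacity` bytes needed to hold `total_len` bytes
/// (ceiling division). `capacity` is a hypothetical payload size.
fn chunks_needed(total_len: usize, capacity: usize) -> usize {
    assert!(capacity > 0, "capacity must be non-zero");
    (total_len + capacity - 1) / capacity
}

/// Runs `f` once and returns its result together with the elapsed
/// wall-clock time.
fn timed<T, F: FnOnce() -> T>(f: F) -> (T, Duration) {
    let start = Instant::now();
    let result = f();
    (result, start.elapsed())
}
```

Calling `timed(|| serialize_entries(&entries))` with 1000 entries of 64 bytes gives the serialized bytes and a single-run duration; passing the byte length to `chunks_needed` shows how a more compact encoding translates into fewer chunks. A single run is noisy, so repeated runs would be needed for numbers comparable to the ones above.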

@bw-solana
Contributor

Given how sensitive our system is to shred volume, I'm wondering whether something like postcard, which produced the fewest shreds in the comparison above, is the best option.

@bw-solana
Contributor

bw-solana commented Sep 8, 2022

Some more challenges uncovered looking into converting Shreds to use a different SerDes library:

  1. There are assumptions around being able to use static offsets into the byte vector to extract fields. This doesn't work with postcard, as variables are sized dynamically (e.g. a small slot value can be stored in a single byte, whereas bincode will always use the full 8 bytes).
  2. There is heavy reliance on bincode functions such as serialize_into and serialized_size, which don't have direct equivalents in other libraries.
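To make the first point concrete, here is a small std-only sketch contrasting a fixed-width layout with a variable-length one. It is an illustration under assumptions: the two-field header is hypothetical, and the varint is a generic LEB128-style encoding (7 payload bits per byte), not a claim about the exact wire format of any of the libraries above.

```rust
/// Fixed-width little-endian encoding of a u64: always 8 bytes.
fn encode_fixed(value: u64) -> [u8; 8] {
    value.to_le_bytes()
}

/// LEB128-style variable-length encoding: 7 payload bits per byte,
/// with the high bit set on every byte except the last.
fn encode_varint(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Hypothetical header with two u64 fields, written fixed-width.
/// `index` always starts at byte 8.
fn header_fixed(slot: u64, index: u64) -> Vec<u8> {
    let mut buf = Vec::with_capacity(16);
    buf.extend_from_slice(&encode_fixed(slot));
    buf.extend_from_slice(&encode_fixed(index));
    buf
}

/// The same header written with varints. Where `index` starts now
/// depends on how many bytes `slot` needed.
fn header_varint(slot: u64, index: u64) -> Vec<u8> {
    let mut buf = encode_varint(slot);
    buf.extend_from_slice(&encode_varint(index));
    buf
}

/// Reads `index` from a constant offset, as code written against the
/// fixed-width layout would. Returns None if the buffer is too short.
fn index_at_static_offset(buf: &[u8]) -> Option<u64> {
    const INDEX_OFFSET: usize = 8;
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(buf.get(INDEX_OFFSET..INDEX_OFFSET + 8)?);
    Some(u64::from_le_bytes(bytes))
}
```

With this sketch, `header_fixed(5, 7)` is 16 bytes and `index_at_static_offset` recovers 7 from it, while `header_varint(5, 7)` is only 2 bytes, so the same static-offset read has nothing valid to look at. That is the trade-off in miniature: the compact form saves space but gives up constant field positions.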

I believe these things can be worked around, but it is messy. Item 1 in particular makes me concerned about the latency of extracting fields such as index, slot, and version under dynamic sizing (we would have to perform a full deserialization). However, dynamic sizing is also what allows tighter packing and a reduced number of shreds.
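One possible mitigation, sketched below under stated assumptions, is to decode only the leading variable-length fields instead of the whole structure. This assumes the fields of interest sit at the front of the layout and are encoded as LEB128-style varints; the `HeaderPrefix` struct and its field order are hypothetical and do not describe the actual shred format or any particular library's wire format.

```rust
/// LEB128-style encoder, included only so the sketch is self-contained.
fn encode_varint(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Decodes one varint from the front of `buf`. Returns the value and the
/// number of bytes consumed, or None if the input is truncated or does
/// not fit in a u64.
fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    // A u64 needs at most 10 bytes at 7 bits per byte.
    for (i, &byte) in buf.iter().enumerate().take(10) {
        let payload = (byte & 0x7f) as u64;
        // The 10th byte may only carry the single remaining top bit.
        if i == 9 && payload > 1 {
            return None;
        }
        value |= payload << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Hypothetical leading fields: three consecutive varints, followed by
/// payload bytes that are never touched here.
#[derive(Debug, PartialEq)]
struct HeaderPrefix {
    slot: u64,
    index: u64,
    version: u64,
}

/// Walks the three leading varints in order and stops, leaving the rest
/// of the buffer undecoded.
fn read_prefix(buf: &[u8]) -> Option<HeaderPrefix> {
    let (slot, n1) = decode_varint(buf)?;
    let (index, n2) = decode_varint(&buf[n1..])?;
    let (version, _) = decode_varint(&buf[n1 + n2..])?;
    Some(HeaderPrefix { slot, index, version })
}
```

This still costs more than a constant-offset read, since each field's position depends on the ones before it, but the work is bounded by the number of leading fields rather than the size of the payload. Whether that is fast enough for the hot paths mentioned above would need to be measured.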

@github-actions github-actions bot added the stale label Sep 11, 2023
@github-actions github-actions bot closed this as not planned Sep 19, 2023