removes repeated bincode serialization of gossip CrdsValues #3575
…_packet} (#3636)
As part of #3575 we need to implement bincode (de)serialization without deriving the serde Serialize/Deserialize traits. The Packet::{from_data,populate_packet} methods also only require the data to be bincode serializable and do not depend on serde either. The next version of bincode provides the necessary traits to achieve this: https://docs.rs/bincode/2.0.0-rc.3/bincode/enc/trait.Encode.html but it is not released yet. To avoid the serde::Serialize trait bound on Packet::{from_data,populate_packet}, the commit adds a new Encode trait which only requires implementing bincode serialization.
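A serde-free trait bound of the kind described above might look roughly like the sketch below. It is illustrative only: the trait shape, the `Ping` type, and the free function `populate_packet` are hypothetical stand-ins, not the code added by the commit. The byte layout assumes bincode 1.x default options (fixed-width, little-endian integers).

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for the trait described above: a type only has to
/// know how to write its own bincode-compatible bytes.
pub trait Encode {
    fn encode<W: Write>(&self, writer: W) -> io::Result<()>;
}

/// Illustrative payload; not a type from the code base.
pub struct Ping {
    pub token: u64,
    pub hops: u32,
}

impl Encode for Ping {
    fn encode<W: Write>(&self, mut writer: W) -> io::Result<()> {
        // Assumed layout: fields in declaration order, each integer written
        // as fixed-width little-endian bytes.
        writer.write_all(&self.token.to_le_bytes())?;
        writer.write_all(&self.hops.to_le_bytes())
    }
}

/// Hypothetical analogue of a populate_packet method: the bound is the
/// custom trait rather than serde::Serialize.
pub fn populate_packet<T: Encode>(buffer: &mut [u8], data: &T) -> io::Result<usize> {
    let capacity = buffer.len();
    let mut cursor = &mut buffer[..];
    data.encode(&mut cursor)?;
    // Writing into `&mut [u8]` advances the slice, so the remaining length
    // tells us how many bytes were written.
    Ok(capacity - cursor.len())
}
```

Calling `populate_packet(&mut buffer, &Ping { token: 1, hops: 2 })` with a 16-byte buffer writes 12 bytes; a buffer that is too small yields an I/O error instead of a partial packet being reported as success.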
#3575 reworks CrdsValue bincode (de)serialization. In order to maintain compatibility and correctness, the commit adds a round trip test for Vec<CrdsValue> (de)serialization and also verifies the serialized bytes against a hard-coded hash.
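The shape of such a test can be sketched as follows. This is a simplified illustration, not the test from the commit: `encode_items` and `decode_items` are hypothetical, the items are plain `u32`s rather than `CrdsValue`s, and the serialized output is compared against hard-coded bytes instead of a hash so that the sketch needs no hashing crate.

```rust
/// Hypothetical encoder: a u64 little-endian length prefix followed by each
/// item as a fixed-width little-endian u32.
pub fn encode_items(items: &[u32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(8 + 4 * items.len());
    bytes.extend_from_slice(&(items.len() as u64).to_le_bytes());
    for item in items {
        bytes.extend_from_slice(&item.to_le_bytes());
    }
    bytes
}

/// Hypothetical decoder: returns None if the input is truncated or its size
/// does not match the length prefix.
pub fn decode_items(bytes: &[u8]) -> Option<Vec<u32>> {
    let mut prefix = [0u8; 8];
    prefix.copy_from_slice(bytes.get(..8)?);
    let len = u64::from_le_bytes(prefix);
    let rest = bytes.get(8..)?;
    if rest.len() as u64 != len.checked_mul(4)? {
        return None;
    }
    Some(
        rest.chunks_exact(4)
            .map(|chunk| u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
            .collect(),
    )
}

/// Round trip plus a check against a fixed expected encoding, which guards
/// against accidental changes to the wire format.
pub fn round_trip_matches_golden_bytes() -> bool {
    let items = vec![1u32, 2];
    let bytes = encode_items(&items);
    let golden: Vec<u8> = vec![2, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0];
    bytes == golden && decode_items(&bytes) == Some(items)
}
```

Pinning the expected bytes (or a hash of them, as the commit does) catches a change in encoding even when encoder and decoder are changed together and still agree with each other.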
mostly looks good. just a couple questions and may need to clarify some uses of "raw" numbers
    bytes: &[u8],
) -> impl Iterator<Item = Result<Self, bincode::Error>> + '_ {
    // Decode number of items in the slice.
    let (size, mut bytes) = match convert_fixed_bytes::<[u8; 8], 8>(bytes) {
convert_fixed_bytes::<[u8; 8], 8>
What exactly is the last 8 doing here? Can we be more descriptive about what these 8s represent?
size is just the first 8 bytes of the serialized struct? and dictates how long the Vec<CrdsValue> is in terms of items (not bytes)?
Added a comment to clarify.
// bincode::serialize{,_into} serializes sequence length as
// a u64 (8 bytes) with fixint encoding in little endian.
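To make the role of the `8` concrete, here is a minimal sketch of reading that length prefix with a named constant. `split_fixed` is a hypothetical helper written for this illustration; it is not the PR's `convert_fixed_bytes`, whose exact signature is not shown in this thread.

```rust
/// Bytes used for a sequence length: a u64 with fixint encoding.
const SEQUENCE_LENGTH_BYTES: usize = 8;

/// Hypothetical helper: splits the first N bytes off the front of `bytes`
/// and returns them as a fixed-size array together with the remainder.
pub fn split_fixed<const N: usize>(bytes: &[u8]) -> Option<([u8; N], &[u8])> {
    if bytes.len() < N {
        return None;
    }
    let (head, rest) = bytes.split_at(N);
    let mut array = [0u8; N];
    array.copy_from_slice(head);
    Some((array, rest))
}

/// Reads the number of items (not bytes) in a serialized sequence and
/// returns it with the bytes that follow the prefix.
pub fn decode_sequence_length(bytes: &[u8]) -> Option<(u64, &[u8])> {
    let (prefix, rest) = split_fixed::<{ SEQUENCE_LENGTH_BYTES }>(bytes)?;
    Some((u64::from_le_bytes(prefix), rest))
}
```

For the input `[3, 0, 0, 0, 0, 0, 0, 0, 0xAA, 0xBB]` this yields a length of 3 items and leaves `[0xAA, 0xBB]` as the bytes still to be decoded.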
// Implements bincode::deserialize_from for CrdsValue.
pub(crate) fn bincode_deserialize(
    bytes: &[u8],
    allow_trailing_bytes: bool,
can you explain trailing bytes? Not sure I understand why we would allow these?
It is the same thing as in bincode:
https://docs.rs/bincode/1.3.3/bincode/config/trait.Options.html#method.allow_trailing_bytes
If you are only deserializing a single CrdsValue then generally you do not want to allow trailing bytes.
But you may be deserializing a struct Foo(CrdsValue, Bar) or a Vec<CrdsValue>, in which case the trailing bytes are the next value to be deserialized, so you would want to allow trailing bytes.
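The distinction can be illustrated with a small standalone decoder. Everything here is hypothetical and simplified: the values are plain `u32`s rather than `CrdsValue`s, and `decode_value` only mimics the meaning of the `allow_trailing_bytes` flag, not the PR's implementation.

```rust
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    UnexpectedEof,
    TrailingBytes,
}

/// Decodes one little-endian u32 and hands back whatever follows it.
pub fn decode_u32(bytes: &[u8]) -> Result<(u32, &[u8]), DecodeError> {
    if bytes.len() < 4 {
        return Err(DecodeError::UnexpectedEof);
    }
    let (head, rest) = bytes.split_at(4);
    let mut array = [0u8; 4];
    array.copy_from_slice(head);
    Ok((u32::from_le_bytes(array), rest))
}

/// Hypothetical analogue of the flag: leftover input is an error only when
/// the caller expects the value to be the last thing in the buffer.
pub fn decode_value(
    bytes: &[u8],
    allow_trailing_bytes: bool,
) -> Result<(u32, &[u8]), DecodeError> {
    let (value, rest) = decode_u32(bytes)?;
    if !allow_trailing_bytes && !rest.is_empty() {
        return Err(DecodeError::TrailingBytes);
    }
    Ok((value, rest))
}

/// Decoding a pair: the first value must allow trailing bytes, because those
/// bytes are the second value; the second value must consume the rest.
pub fn decode_pair(bytes: &[u8]) -> Result<(u32, u32), DecodeError> {
    let (first, rest) = decode_value(bytes, true)?;
    let (second, _) = decode_value(rest, false)?;
    Ok((first, second))
}
```

With the 8-byte input `[1, 0, 0, 0, 2, 0, 0, 0]`, `decode_pair` succeeds with `(1, 2)`, while `decode_value(&bytes, false)` on the same input fails with `TrailingBytes`.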
        .reject_trailing_bytes()
        .deserialize(bytes)
}
let (tag, bytes) = convert_fixed_bytes::<[u8; 4], 4>(bytes)?;
same here, can you elaborate on what the 4s are here?
oh so get the first 4 bytes of the serialized value and convert that into a "tag" which just represents the type of message received (Pull Request, Push Message, etc)? May be worth just clarifying the "raw" numbers here
Added a comment.
// bincode::serialize{,_into} serializes enum tag as a u32 (4 bytes)
// with fixint encoding in little endian.
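As an illustration of that layout, the sketch below decodes a 4-byte variant tag and then dispatches on it. The `Message` enum and its two variants are hypothetical and chosen only to keep the example short; they are not the enum being decoded in the PR.

```rust
/// Bytes used for an enum variant tag: a u32 with fixint encoding.
const ENUM_TAG_BYTES: usize = 4;

/// Illustrative enum; the variant index is its position in the declaration.
#[derive(Debug, PartialEq)]
pub enum Message {
    PullRequest(u64), // variant index 0
    PushMessage(u64), // variant index 1
}

/// Decodes the tag first, then the payload selected by that tag.
pub fn decode_message(bytes: &[u8]) -> Option<Message> {
    let mut tag = [0u8; ENUM_TAG_BYTES];
    tag.copy_from_slice(bytes.get(..ENUM_TAG_BYTES)?);
    let mut payload = [0u8; 8];
    payload.copy_from_slice(bytes.get(ENUM_TAG_BYTES..ENUM_TAG_BYTES + 8)?);
    let payload = u64::from_le_bytes(payload);
    match u32::from_le_bytes(tag) {
        0 => Some(Message::PullRequest(payload)),
        1 => Some(Message::PushMessage(payload)),
        // Unknown variant index: reject rather than guess.
        _ => None,
    }
}
```

Naming the constant documents where the `4` comes from, which is the concern raised in the review comments above.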
Gossip CrdsValues are deserialized when received as a gossip push message or pull response, but serialized again immediately to obtain a value hash during Crds::insert and then again serialized repeatedly every time the value is pushed to another node or returned as a response to a pull request. In order to avoid repeated serialization of a CrdsValue during its lifetime, the commit manually implements bincode (de)serialization of CrdsValue to hold on to bincode serialized bytes of CrdsData and reuses that for serializing CrdsValue.
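The general idea of holding on to the serialized bytes can be sketched as below. This is a toy model under loud assumptions: `CachedValue` is hypothetical, a `u64` stands in for `CrdsData`, and the standard library's `DefaultHasher` stands in for whatever hash the real code computes over the serialized bytes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical value that keeps its serialized form alongside the decoded
/// data, so later sends and hash lookups do not serialize again.
pub struct CachedValue {
    data: u64,           // stand-in for the decoded payload
    serialized: Vec<u8>, // encoded once, reused on every send
    hash: u64,           // computed once from the serialized bytes
}

fn hash_bytes(bytes: &[u8]) -> u64 {
    // Stand-in hash function, used only to keep the sketch dependency-free.
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

impl CachedValue {
    /// Locally created value: serialize exactly once.
    pub fn new(data: u64) -> Self {
        let serialized = data.to_le_bytes().to_vec();
        let hash = hash_bytes(&serialized);
        CachedValue { data, serialized, hash }
    }

    /// Received value: decode once and keep the received bytes as-is.
    pub fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 8 {
            return None;
        }
        let mut array = [0u8; 8];
        array.copy_from_slice(bytes);
        Some(CachedValue {
            data: u64::from_le_bytes(array),
            serialized: bytes.to_vec(),
            hash: hash_bytes(bytes),
        })
    }

    pub fn data(&self) -> u64 {
        self.data
    }

    pub fn hash(&self) -> u64 {
        self.hash
    }

    /// Sending is a plain copy of the cached bytes.
    pub fn write_to(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.serialized);
    }
}
```

The cost of this design is that each value is held in both decoded and encoded form, so the saved serialization work is paid for with extra memory per value.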
how worried are we about the increased memory usage here? almost 2Xing memory cost of a CrdsValue. I don't know the memory profile of agave. But looks like we're getting ~30% decrease in process_gossip_packets_time, which is great, so probably worth the increase in memory
yes, that is one of the concerns with this.
cool sounds good
Problem
Gossip CrdsValues are deserialized when received as a gossip push message or pull response, but serialized again immediately to obtain a value hash during Crds::insert and then again serialized repeatedly every time the value is pushed to another node or returned as a response to a pull request.

Summary of Changes
In order to avoid repeated serialization of a CrdsValue during its lifetime, the commit manually implements bincode (de)serialization of CrdsValue to hold on to bincode serialized bytes of CrdsData and reuses that for serializing CrdsValue.