Support #[serde(default)] for struct fields at the end of a struct #179

andrewjstone · 2017-04-30T06:34:56Z

Is it possible to support #[serde(default)] in structs? This would allow adding new fields to structs and still having the new code be able to read old structs missing those fields.

I'm currently getting the following error:

--- deserialize_old_version_bincode stdout ----
	thread 'deserialize_old_version_bincode' panicked at 'called `Result::unwrap()` on an `Err` value: IoError(Error { repr: Custom(Custom { kind: UnexpectedEof, error: StringError("failed to fill whole buffer") }) })', /buildslave/rust-buildbot/slave/nightly-dist-rustc-linux/build/src/libcore/result.rs:845
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Let me state up front that I'm not familiar with the bincode format, but using the following logic, I think it may be possible.

I understand why using #[serde(default)] may be a problem if re-ordering of struct fields was done, since the types could either be the same, in which case the wrong field gets deserialized, or different in which case an unexpected error is returned. However, I think if default fields are only allowed to be added at the end of structs, deserialization should work fine. If the input ends early, meaning the rest of the fields aren't there, the defaults are used. This allows backwards compatibility that is very useful when building distributed systems.

Conversely, for forwards compatibility, any new fields at the end of a struct that weren't understood could just be ignored.

The text was updated successfully, but these errors were encountered:

tikue · 2017-05-11T06:21:57Z

How would this work for nested structs?

TyOverby · 2017-05-11T14:23:25Z

Bincode doesn't serialize any information that would allow it to determine if a field has been serialized or not. It assumes that everything has been serialized.

andrewjstone · 2017-05-11T18:16:39Z

@tikue, I'm not familiar enough with the internals to know how nested structs would work. Presumably if the whole struct is missing a default would be used. If part of the nested struct is missing, that part would be defaulted as well.

andrewjstone · 2017-05-11T18:20:22Z

It's a shame that bincode can't support this. It makes it much less useful for building long running distributed systems where all nodes can't be taken offline and upgraded at once in order to add a new field. The only option it seems is to add a new message, and additionally use a header identifying the message globally with an integer.

tikue · 2017-05-11T19:59:22Z

Unfortunately I think supporting api evolution would require fundamental changes to the binary format. Fields would have to be tagged, which goes against the "compact as possible" goal. I'd say it's probably just not the snuggest fit for large distributed systems. There are other serde-compatible binary formats that support API evolution -- have you looked into serde_cbor?

andrewjstone · 2017-05-11T20:54:42Z

Yeah, that makes sense. I did look at serde_cbor, but from my benchmarks IIRC it was a lot slower. However, I'm not sure if that's inherent in the protocol or just if the code needs some optimization. Hopefully the latter. Thanks.

…

On Thu, May 11, 2017 at 3:59 PM, Tim ***@***.***> wrote: Unfortunately I think supporting api evolution would require fundamental changes to the binary format. Fields would have to be tagged, which goes against the "compact as possible" goal. I'd say it's probably just not the suggest for for large distributed systems. There are other serde-compatible binary formats that support API evolution -- have you looked into serde_cbor? — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#179 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAcf-RfShfa614qU2hySyn_Lgw1aN-POks5r42ibgaJpZM4NMfEd> .

TyOverby · 2017-05-12T20:49:06Z

I'm terribly sorry for what I consider to be a lack of good communication and documentation on my part.

Because bincode doesn't include any metadata about what is or isn't serialized, making backwards compatible structure changes is impossible. When I made bincode I designed it for use in cross-process communication from the same codebase, or for game communication where it is expected that everyone will be on the same version of the game.

andrewjstone · 2017-05-12T21:00:30Z

Gotcha. Thank you. No need to be sorry. I was just hoping it worked because my benchmarks showed bincode being so damn fast :) You made an excellent system with certain design constraints that fit your use cases. That's perfectly valid and rational. I do agree, though that adding a one liner/paragraph about this to the docs would prevent this from coming up later. Thanks Again for the help understanding.

…

On Fri, May 12, 2017 at 4:49 PM, Ty Overby ***@***.***> wrote: I'm terribly sorry for what I consider to be a lack of good communication and documentation on my part. Because bincode doesn't include any metadata about what is or isn't serialized, making backwards compatible structure changes is impossible. When I made bincode I designed it for use in cross-process communication from the same codebase, or for game communication where it is expected that everyone will be on the same version of the game. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#179 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAcf-fwyTBZIsBF4yYPdD_I2kbiQ3PZEks5r5MXCgaJpZM4NMfEd> .

dtolnay · 2017-05-12T22:42:29Z

making backwards compatible structure changes is impossible.

I don't think this is accurate. Here is one possible way to solve the more general problem of changing schema (not limited to adding new fields at the end) in Bincode.

#[macro_use]
extern crate serde_derive;

extern crate serde;
extern crate bincode;

use std::fmt;

use bincode::Infinite;
use serde::de::{self, Deserialize, Deserializer, Visitor, SeqAccess};
use serde::ser::{Serialize, Serializer, SerializeTuple};

macro_rules! versioned_schema {
    ($($version:tt => $structure:ident,)+) => {
        #[derive(Debug)]
        enum Schema {
            $(
                $structure($structure),
            )+
        }

        impl Serialize for Schema {
            fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
                where S: Serializer
            {
                let mut tuple = serializer.serialize_tuple(2)?;
                match *self {
                    $(
                        Schema::$structure(ref v) => {
                            // serialize the version as one byte
                            tuple.serialize_element(&($version as u8))?;
                            // serialize the content
                            tuple.serialize_element(v)?;
                        }
                    )+
                }
                tuple.end()
            }
        }

        impl<'de> Deserialize<'de> for Schema
            where $($structure: Deserialize<'de>),+
        {
            fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
                where D: Deserializer<'de>
            {
                struct MyVisitor;

                impl<'de> Visitor<'de> for MyVisitor {
                    type Value = Schema;

                    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                        formatter.write_str("versioned schema")
                    }

                    fn visit_seq<S>(self, mut seq: S) -> Result<Self::Value, S::Error>
                        where S: SeqAccess<'de>
                    {
                        // deserialize the version as one byte
                        match seq.next_element::<u8>()?
                                 .ok_or_else(|| {
                                     de::Error::custom("missing schema version")
                                 })?
                        {
                            $(
                                $version => {
                                    // deserialize the content
                                    seq.next_element()?
                                       .ok_or_else(|| de::Error::custom(
                                            concat!("missing data for version ", $version)
                                       ))
                                       .map(Schema::$structure)
                                }
                            )+
                            other => {
                                Err(de::Error::custom(format_args!("unknown version: {}", other)))
                            }
                        }
                    }
                }

                deserializer.deserialize_tuple(2, MyVisitor)
            }
        }
    };
}

versioned_schema! {
    0 => Original,
    1 => Split,
    2 => Bytes,
}

#[derive(Serialize, Deserialize, Debug)]
struct Original {
    andrewjstone: u64,
}

#[derive(Serialize, Deserialize, Debug)]
struct Split {
    andrew: u32,
    j: u16,
    stone: u32,
}

#[derive(Serialize, Deserialize, Debug)]
struct Bytes {
    andrewjstone: Vec<u8>,
}

fn main() {
    let data = Schema::Original(Original { andrewjstone: 1 });
    let bytes = bincode::serialize(&data, Infinite).unwrap();
    println!("bytes: {:?}", bytes);

    let data: Schema = bincode::deserialize(&bytes).unwrap();
    println!("data: {:#?}", data);
}

andrewjstone · 2017-05-14T21:05:18Z

Thanks @dtolnay. This does appear to solve the backwards compatibility at the cost of having to version each change. In that case, we could just have different messages as well. I was asking for a different goal of just adding new fields at the end of existing structs ala protobuf to make them extensible without versioning.

Even with the fixes for backwards compatibility here, they are not forward compatible because the new structures aren't known on older nodes and can't be decoded. This is fine for complete changes of protocol structures, when both sides have to negotiate versions and understand the changes. However, when you want to do something simple like add an extra metric or tag to a structure, the overhead of maintaining two separate messages and versioning the code becomes unreasonable.

andrewjstone mentioned this issue May 1, 2017

Use serde instead of rustc_serialize vmware-archive/haret#94

Closed

tikue mentioned this issue May 9, 2017

API evolution? google/tarpc#150

Closed

TyOverby closed this as completed May 11, 2017

zhangsoledad mentioned this issue Apr 1, 2019

transaction deserializing should be ok: Io(Custom { kind: UnexpectedEof, error: StringError("failed to fill whole buffer") }) nervosnetwork/ckb#380

Closed

leoluk mentioned this issue Jul 26, 2020

Use portable serialization format with compatibility guarantees solana-labs/solana#11214

Open

jstarry mentioned this issue Sep 26, 2020

v1.3: Bigtable inner transaction plumbing is incompatible with v1.2 solana-labs/solana#12494

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support #[serde(default)] for struct fields at the end of a struct #179

Support #[serde(default)] for struct fields at the end of a struct #179

andrewjstone commented Apr 30, 2017

tikue commented May 11, 2017

TyOverby commented May 11, 2017 •

edited

Loading

andrewjstone commented May 11, 2017

andrewjstone commented May 11, 2017

tikue commented May 11, 2017 •

edited

Loading

andrewjstone commented May 11, 2017 via email

TyOverby commented May 12, 2017

andrewjstone commented May 12, 2017 via email

dtolnay commented May 12, 2017

andrewjstone commented May 14, 2017

Support #[serde(default)] for struct fields at the end of a struct #179

Support #[serde(default)] for struct fields at the end of a struct #179

Comments

andrewjstone commented Apr 30, 2017

tikue commented May 11, 2017

TyOverby commented May 11, 2017 • edited Loading

andrewjstone commented May 11, 2017

andrewjstone commented May 11, 2017

tikue commented May 11, 2017 • edited Loading

andrewjstone commented May 11, 2017 via email

TyOverby commented May 12, 2017

andrewjstone commented May 12, 2017 via email

dtolnay commented May 12, 2017

andrewjstone commented May 14, 2017

TyOverby commented May 11, 2017 •

edited

Loading

tikue commented May 11, 2017 •

edited

Loading