Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sequence & Map Deserializers silently allow too-short cosumers #287

Closed
Lucretiel opened this issue Aug 16, 2021 · 2 comments · Fixed by #304
Closed

Sequence & Map Deserializers silently allow too-short cosumers #287

Lucretiel opened this issue Aug 16, 2021 · 2 comments · Fixed by #304

Comments

@Lucretiel
Copy link
Contributor

Lucretiel commented Aug 16, 2021

Summary

sequence deserialization doesn't check the length after deserialize_seq and deserialize_map, leading to a situation where data types may incorrectly succeed at deserializing because they read elements that were incorrectly left unconsumed by the sequence visitor

Details

MessagePack arrays & maps are a tagged length prefix, followed by simply by that many objects, serialized in order.

Consider this (malformed) input:

0x83 0xa1 0x61 0x93 0xa1 0x41 0xa1 0x43 0xa1 0x44 0xa1 0x62 0xa1 0x63

This represents the following structure, except that the first byte claims that what follows is a map containing 3 entries, when there are only 2 available:

{
  "a": ["a", "c", "d"],
  "b": "c"
}

However, suppose that we attempt to deserialize this into this struct:

#[derive(Serialize, Deserialize)]
struct Outer {
    a: [String; 1],
    b: String,
    c: String,
}

This will incorrectly successfully serialize. When the [String; 1] deserializes, it will begin deserializing a sequence, and consume the "a". However, it will then stop deserializing and report a success, even though the "c" and "d" haven't been consumed yet. However, because the Outer struct is expecting two more entries, it will parse the "c", "d" as c: "d", parse the trailing b: "c", then return with a success report.

This problem occurs with any type that deserializes a fixed number of elements (arrays, tuples, structs in sequence mode). It doesn't appear with structs in map mode, because those structs loop over all incoming keys with while let Some(key) = access.next_key_seed(seed)? {, which means that they will always parse N elements, even if N != num_fields.

The solution to this problem is to modify the SeqAccess and MapAccess structs to use a references to a usize:

struct SeqAccess<'a, R, C> {
    de: &'a mut Deserializer<R, C>,
    left: &'a mut usize,
}

Then, in the deserializer, pass a local mutable usize and check that it has counted down to 0 after the visit_seq or visit_map has completed.

@Lucretiel
Copy link
Contributor Author

I was able to reproduce this issue with non-malformed input:

use serde::{Deserialize, Serialize};
use std::io::Cursor;

#[derive(Serialize, Deserialize)]
struct Outer1 {
    a: [&'static str; 4],
    d: &'static str,
}

#[derive(Serialize, Deserialize, Debug, PartialEq, Eq)]
struct Outer2 {
    a: [String; 0],
    b: String,
}

fn main() {
    let data = Outer1 {
        a: ["b", "b", "c", "c"],
        d: "d",
    };

    let mut buffer = Vec::new();

    let mut serializer = rmp_serde::Serializer::new(&mut buffer)
        .with_binary()
        .with_struct_map();

    data.serialize(&mut serializer).unwrap();

    let mut deserializer = rmp_serde::Deserializer::new(Cursor::new(&buffer)).with_binary();

    let data = Outer2::deserialize(&mut deserializer).unwrap();

    assert_eq!(
        data,
        Outer2 {
            a: [],
            b: "b".to_owned(),
        }
    )
}

The deserialize succeeds incorrectly, and leave 4 string elements in the buffer that should have been consumed.

@kornelski
Copy link
Collaborator

Thanks for the report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants