Allow deserializing without consuming the entire stream #89

Closed
shaladdle opened this issue Jun 9, 2015 · 10 comments

@shaladdle

My use case for this is deserializing more than one JSON object from a TCP connection. Currently you can't open one connection and send objects back and forth, because serde::json::de::from_reader blocks until EOF.

@erickt
Member

erickt commented Jun 9, 2015

Hello @shaladdle! You should be able to directly use serde::json::de::Deserializer to do what you want.

@shaladdle
Author

Thanks for the response! Sorry for taking a while to reply. I was able to get some time to try again with your suggestion, but I still get the same behavior. The receiver doesn't finish deserialization unless the sender closes the connection. Here's an example:

#![feature(custom_derive, plugin)]
#![plugin(serde_macros)]

extern crate serde;

use std::net;
use std::io::Read;
use std::thread;
use serde::Deserialize;

#[derive(Serialize, Deserialize, Debug)]
struct Request {
    message: String,
}

#[derive(Serialize, Deserialize, Debug)]
struct Response {
    message: String,
}

fn main() {
    let l = net::TcpListener::bind("localhost:20000").unwrap();
    thread::spawn(||{
        let l = l;
        for stream in l.incoming() {
            let mut stream = stream.unwrap();
            let read_stream = stream.try_clone().unwrap();
            let mut de = serde::json::Deserializer::new(read_stream.bytes()).unwrap();
            println!("deserializing");
            let request = Request::deserialize(&mut de).unwrap();
            println!("deserialized");
            let response = Response{message: request.message};
            serde::json::to_writer(&mut stream, &response).unwrap();
        }
    });
    let mut stream = net::TcpStream::connect("localhost:20000").unwrap();
    let request = Request{message: "hi there".to_string()};
    serde::json::to_writer(&mut stream, &request).unwrap();
    println!("message sent");
    let mut de = serde::json::Deserializer::new(stream.bytes()).unwrap();
    let response = Response::deserialize(&mut de).unwrap();
    println!("response: {:?}", response);
}

I would expect to see something like the following for output:

message sent
deserializing
deserialized
response: Response { message: "hi there" }

but it hangs after "deserializing".

@3Hren
Contributor

3Hren commented Jul 1, 2015

rustc-serialize had a strange behavior where a token that had just been parsed was not yielded back, because immediately after a successful parse the decoder tried to read at least one more byte from the Read.

Try sending something like this over your TCP socket: {}1. If that succeeds, the same bug has carried over into this project.

@erickt
Member

erickt commented Jul 1, 2015

Ah right. Some of the JSON constructs require looking one character ahead, which could trigger this. Maybe we're doing an unnecessary read? This will require some investigation.
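
To illustrate which constructs need the extra character, here is a minimal standard-library sketch. It is not serde's parser: `next_byte`, `read_string`, `read_number`, and `invalid` are hypothetical helpers, strings are assumed to contain no escapes, and numbers are assumed to be unsigned integers. A string ends at its closing quote, so nothing past it has to be read; a number has no terminator, so the reader cannot know it is complete until it has seen the following byte or EOF.

```rust
use std::io::{self, Read};

// Hypothetical helper: builds an InvalidData error for this sketch.
fn invalid(msg: &'static str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

// Pulls exactly one byte from the reader; Ok(None) means EOF.
fn next_byte<R: Read>(r: &mut R) -> io::Result<Option<u8>> {
    r.by_ref().bytes().next().transpose()
}

// Reads a double-quoted string (no escape handling in this sketch).
// The closing quote marks the end, so no byte past it is consumed.
fn read_string<R: Read>(r: &mut R) -> io::Result<String> {
    if next_byte(r)? != Some(b'"') {
        return Err(invalid("expected opening quote"));
    }
    let mut out = Vec::new();
    loop {
        match next_byte(r)? {
            Some(b'"') => return String::from_utf8(out).map_err(|_| invalid("not UTF-8")),
            Some(b) => out.push(b),
            None => return Err(invalid("unterminated string")),
        }
    }
}

// Reads an unsigned integer. Digits carry no terminator, so the function
// has to consume one byte past the number (or hit EOF) to know it ended.
// That extra byte is returned so the caller does not lose it.
fn read_number<R: Read>(r: &mut R) -> io::Result<(u64, Option<u8>)> {
    let mut value: u64 = 0;
    let mut seen_digit = false;
    loop {
        match next_byte(r)? {
            Some(b) if b.is_ascii_digit() => {
                value = value
                    .checked_mul(10)
                    .and_then(|v| v.checked_add(u64::from(b - b'0')))
                    .ok_or_else(|| invalid("number too large"))?;
                seen_digit = true;
            }
            other if seen_digit => return Ok((value, other)),
            _ => return Err(invalid("expected digit")),
        }
    }
}
```

On a socket that stays open, `read_number` would block after the last digit until the peer sends one more byte, while `read_string` returns as soon as the closing quote arrives.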

@erickt
Member

erickt commented Jul 1, 2015

Ok, what's going on is that when we parse the end of an object, we call bump() to mark that character as consumed. That function then reads the next character to prepare for the next token. That character is stored in an Option<u8>, where None means that we've read everything from our stream.

It shouldn't be that hard to rewrite the parser to only fetch the next character when it actually needs it, but we'll have to be careful it doesn't impact performance.
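
A minimal sketch of the difference between an eager `bump()` and fetching the next character only on demand, using only the standard library. `Eager`, `Lazy`, `parse_empty_object`, and the two `leftover_*` helpers are hypothetical names for illustration, not serde's actual parser, and I/O errors are treated as end of input to keep the sketch short.

```rust
use std::io::{Bytes, Read};

// Eager lookahead: `bump` consumes the current byte and immediately reads
// the next one, so finishing a value always costs one extra read.
struct Eager<R: Read> {
    bytes: Bytes<R>,
    ch: Option<u8>,
}

impl<R: Read> Eager<R> {
    fn new(reader: R) -> Self {
        let mut p = Eager { bytes: reader.bytes(), ch: None };
        p.bump(); // prime the lookahead slot
        p
    }

    fn bump(&mut self) {
        // I/O errors are folded into None (end of input) in this sketch.
        self.ch = self.bytes.next().and_then(|b| b.ok());
    }

    fn parse_empty_object(&mut self) -> bool {
        if self.ch != Some(b'{') {
            return false;
        }
        self.bump();
        if self.ch != Some(b'}') {
            return false;
        }
        self.bump(); // on an open socket this read blocks after the '}'
        true
    }
}

// Lazy lookahead: a byte is fetched only when `peek` is called, and `eat`
// merely clears the slot without touching the reader.
struct Lazy<R: Read> {
    bytes: Bytes<R>,
    peeked: Option<u8>,
}

impl<R: Read> Lazy<R> {
    fn new(reader: R) -> Self {
        Lazy { bytes: reader.bytes(), peeked: None }
    }

    fn peek(&mut self) -> Option<u8> {
        if self.peeked.is_none() {
            self.peeked = self.bytes.next().and_then(|b| b.ok());
        }
        self.peeked
    }

    fn eat(&mut self) {
        self.peeked = None;
    }

    fn parse_empty_object(&mut self) -> bool {
        if self.peek() != Some(b'{') {
            return false;
        }
        self.eat();
        if self.peek() != Some(b'}') {
            return false;
        }
        self.eat(); // no read here: the value is complete
        true
    }
}

// Number of input bytes left unread after parsing `{}`; None if parsing failed.
fn leftover_eager(mut input: &[u8]) -> Option<usize> {
    let ok = Eager::new(&mut input).parse_empty_object();
    if ok { Some(input.len()) } else { None }
}

fn leftover_lazy(mut input: &[u8]) -> Option<usize> {
    let ok = Lazy::new(&mut input).parse_empty_object();
    if ok { Some(input.len()) } else { None }
}
```

For the input `{}1`, the eager version has already pulled the `1` by the time it returns, while the lazy version leaves it unread. The performance concern is that `peek` adds a branch on every character access; whether that matters would need measuring.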

erickt added a commit to erickt/serde that referenced this issue Aug 6, 2015
erickt added a commit to erickt/serde that referenced this issue Aug 6, 2015
rubdos pushed a commit to rubdos/serde that referenced this issue Jun 20, 2017
traits.rs: Implement Bounded for tuples of Bounded types
@jimis

jimis commented Mar 11, 2019

Sorry for bumping this old issue, but I'm getting the exact same behaviour described at the top: serde_json::from_reader() hangs until EOF when trying to deserialize a Vec from a BufReader. The EOF never happens, of course, since this is a persistent socket connection. If I use take(serialization length), it simulates the EOF and everything works perfectly.
Is this the expected behaviour?
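
For reference, the take() workaround can be sketched as length-prefixed framing with only the standard library. The 4-byte big-endian prefix and the `write_frame` / `read_frame` names are assumptions made for this example, not part of serde_json or of the setup described above.

```rust
use std::io::{self, Read, Write};

// Writes a 4-byte big-endian length prefix followed by the payload.
fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    w.write_all(&(payload.len() as u32).to_be_bytes())?;
    w.write_all(payload)
}

// Reads one frame. `take(len)` reports EOF after `len` bytes even though
// the underlying stream stays open, which is what ends the read.
fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut prefix = [0u8; 4];
    r.read_exact(&mut prefix)?;
    let len = u64::from(u32::from_be_bytes(prefix));

    let mut payload = Vec::new();
    let got = r.by_ref().take(len).read_to_end(&mut payload)?;
    if (got as u64) != len {
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "frame cut short"));
    }
    Ok(payload)
}
```

Instead of collecting the bytes with `read_to_end`, the `Take` adapter itself could be handed to a reader-based parser, which then sees EOF at the frame boundary.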

@dtolnay
Member

dtolnay commented Mar 11, 2019

I think this is the correct behavior for from_reader. For example when deserializing from a File using from_reader, we don't want to silently allow trailing garbage. To deserialize from a prefix of a stream you can manage your own Deserializer:

let mut de = serde_json::Deserializer::from_reader(stream);
let t = T::deserialize(&mut de)?;
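
The distinction between parsing a whole input and parsing a prefix can be sketched with a toy stand-in for a JSON parser, using only the standard library. `read_object_prefix` and `read_object_whole` are hypothetical names, and the "parser" only tracks brace depth and string quoting; it is an illustration of the trailing-data check, not serde_json's implementation.

```rust
use std::io::{self, Read};

// Reads exactly one top-level `{...}` object and stops right after its
// closing brace, leaving the rest of the stream untouched.
fn read_object_prefix<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    let (mut depth, mut in_string, mut escaped) = (0usize, false, false);
    for byte in r.by_ref().bytes() {
        let b = byte?;
        if out.is_empty() && b.is_ascii_whitespace() {
            continue; // skip whitespace before the value
        }
        out.push(b);
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        if depth == 0 && b != b'{' {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "expected '{'"));
        }
        match b {
            b'"' => in_string = true,
            b'{' => depth += 1,
            b'}' => {
                depth -= 1;
                if depth == 0 {
                    return Ok(out); // value complete: no further reads
                }
            }
            _ => {}
        }
    }
    Err(io::Error::new(io::ErrorKind::UnexpectedEof, "stream ended mid-object"))
}

// Reads one object and then insists that only whitespace follows until EOF.
// On a socket that stays open, the loop below blocks until the peer closes.
fn read_object_whole<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let obj = read_object_prefix(r)?;
    for byte in r.by_ref().bytes() {
        if !byte?.is_ascii_whitespace() {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "trailing characters"));
        }
    }
    Ok(obj)
}
```

Calling `read_object_prefix` repeatedly on one stream yields successive objects, whereas `read_object_whole` rejects a second object as trailing data. In serde_json itself, `Deserializer::from_reader(stream).into_iter::<T>()` offers a way to pull successive values from one stream.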

@jimis

jimis commented Mar 11, 2019

Thanks, very well explained! It would be nice if this were mentioned in the docs for from_reader, because an indefinite hang is hard to debug.

@dtolnay
Member

dtolnay commented Mar 11, 2019

Good call, I filed serde-rs/json#522 to follow up.

@jimis

jimis commented Mar 11, 2019

Thanks!
