Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gzip decoder returns too early while consuming TCP fragmented chuncked body #2381

Open
Andrey36652 opened this issue Aug 13, 2024 · 1 comment

Comments

@Andrey36652
Copy link

Hello. I believe a found a bug. It doesn't make HTTP response incorrect but it effectively disables connection reusing.
I drew a diagram for such case.
reqwest-bug-diagram
As you can see Client (with activated gzip feature) sends request to server. Server responds with chunked gzipped body, but body somehow is fragmented at TCP level (see diagram) and those fragments are divided by some time (milliseconds at least, so you can't read() them together with one call). But first fragment contains whole gzipped bytes.
Gzip decoder returns (and, as consequence, Response::text() call too) immediately after decoding last encoded byte, i.e. it doesn't wait for full response body arrival. And then Response is dropped (because Response::text() consumes Response). And it prevents connection from being put back in pool and being reused (because connection itself is still waiting for bytes from server).

I created minimal reproduction https://github.com/Andrey36652/reqwest-chunked-bug where client shoud make 3 consecutive requests to the server that responds with fragmented body (there is a 4 seconds delay between the fragments, so it should be ~12 seconds in total). But when I run it, all 3 requests are resolved immediately as you can see from timestamps (and each uses new connection).

2024-08-13T02:27:51.666326Z  INFO reqwest_chunked_bug::server: Listening on 127.0.0.1:9955        
2024-08-13T02:27:51.666796Z  INFO reqwest_chunked_bug::client: ---------- start request ----------
2024-08-13T02:27:51.667219Z TRACE hyper_util::client::legacy::pool: checkout waiting for idle connection: ("http", 127.0.0.1:9955)
2024-08-13T02:27:51.668027Z DEBUG reqwest::connect: starting new connection: http://127.0.0.1:9955/    
2024-08-13T02:27:51.668307Z TRACE hyper_util::client::legacy::connect::http: Http::connect; scheme=Some("http"), host=Some("127.0.0.1"), port=Some(Port(9955))
2024-08-13T02:27:51.668727Z DEBUG hyper_util::client::legacy::connect::http: connecting to 127.0.0.1:9955
2024-08-13T02:27:51.669974Z DEBUG hyper_util::client::legacy::connect::http: connected to 127.0.0.1:9955
2024-08-13T02:27:51.670344Z TRACE hyper_util::client::legacy::client: http1 handshake complete, spawning background dispatcher task
2024-08-13T02:27:51.671059Z TRACE hyper_util::client::legacy::pool: checkout dropped for ("http", 127.0.0.1:9955)
2024-08-13T02:27:51.671067Z  INFO reqwest_chunked_bug::server: Accepted connection from 127.0.0.1:51348
2024-08-13T02:27:51.671447Z TRACE reqwest::connect::verbose: 9a08e68e write (vectored): b"GET / HTTP/1.1\r\naccept: */*\r\naccept-encoding: gzip\r\nhost: 127.0.0.1:9955\r\n\r\n"    
2024-08-13T02:27:51.671533Z  INFO reqwest_chunked_bug::server: Read request from client: "GET / HTTP/1.1\r\naccept: */*\r\naccept-encoding: gzip\r\nhost: 127.0.0.1:9955\r\n\r\n"
2024-08-13T02:27:51.671811Z  INFO reqwest_chunked_bug::server: First part of response has been sent
2024-08-13T02:27:51.671815Z TRACE reqwest::connect::verbose: TODO: verbose poll_read    
2024-08-13T02:27:51.672606Z  INFO reqwest_chunked_bug::client: result: {"symbol":"BTCUSDT","openInterest":"70056.543","time":1723425441998}
2024-08-13T02:27:51.672884Z  INFO reqwest_chunked_bug::client: ---------- end request ----------
2024-08-13T02:27:51.673036Z  INFO reqwest_chunked_bug::client: ---------- start request ----------
2024-08-13T02:27:51.673269Z TRACE hyper_util::client::legacy::pool: checkout waiting for idle connection: ("http", 127.0.0.1:9955)
2024-08-13T02:27:51.673423Z DEBUG reqwest::connect: starting new connection: http://127.0.0.1:9955/    
2024-08-13T02:27:51.673608Z TRACE hyper_util::client::legacy::connect::http: Http::connect; scheme=Some("http"), host=Some("127.0.0.1"), port=Some(Port(9955))
2024-08-13T02:27:51.673722Z DEBUG hyper_util::client::legacy::connect::http: connecting to 127.0.0.1:9955
2024-08-13T02:27:51.674025Z DEBUG hyper_util::client::legacy::connect::http: connected to 127.0.0.1:9955
2024-08-13T02:27:51.674027Z  INFO reqwest_chunked_bug::server: Accepted connection from 127.0.0.1:51349
2024-08-13T02:27:51.674169Z TRACE hyper_util::client::legacy::client: http1 handshake complete, spawning background dispatcher task
2024-08-13T02:27:51.674343Z TRACE hyper_util::client::legacy::pool: checkout dropped for ("http", 127.0.0.1:9955)
2024-08-13T02:27:51.674486Z TRACE reqwest::connect::verbose: 866a12cc write (vectored): b"GET / HTTP/1.1\r\naccept: */*\r\naccept-encoding: gzip\r\nhost: 127.0.0.1:9955\r\n\r\n"    
2024-08-13T02:27:51.674500Z  INFO reqwest_chunked_bug::server: Read request from client: "GET / HTTP/1.1\r\naccept: */*\r\naccept-encoding: gzip\r\nhost: 127.0.0.1:9955\r\n\r\n"
2024-08-13T02:27:51.674680Z  INFO reqwest_chunked_bug::server: First part of response has been sent
2024-08-13T02:27:51.674744Z TRACE reqwest::connect::verbose: TODO: verbose poll_read    
2024-08-13T02:27:51.675520Z  INFO reqwest_chunked_bug::client: result: {"symbol":"BTCUSDT","openInterest":"70056.543","time":1723425441998}
2024-08-13T02:27:51.675780Z  INFO reqwest_chunked_bug::client: ---------- end request ----------
2024-08-13T02:27:51.676093Z  INFO reqwest_chunked_bug::client: ---------- start request ----------
2024-08-13T02:27:51.676354Z TRACE hyper_util::client::legacy::pool: checkout waiting for idle connection: ("http", 127.0.0.1:9955)
2024-08-13T02:27:51.676575Z DEBUG reqwest::connect: starting new connection: http://127.0.0.1:9955/
2024-08-13T02:27:51.676834Z TRACE hyper_util::client::legacy::connect::http: Http::connect; scheme=Some("http"), host=Some("127.0.0.1"), port=Some(Port(9955))
2024-08-13T02:27:51.677125Z DEBUG hyper_util::client::legacy::connect::http: connecting to 127.0.0.1:9955
2024-08-13T02:27:51.677489Z DEBUG hyper_util::client::legacy::connect::http: connected to 127.0.0.1:9955
2024-08-13T02:27:51.677773Z TRACE hyper_util::client::legacy::client: http1 handshake complete, spawning background dispatcher task
2024-08-13T02:27:51.678187Z  INFO reqwest_chunked_bug::server: Accepted connection from 127.0.0.1:51350
2024-08-13T02:27:51.678194Z TRACE hyper_util::client::legacy::pool: checkout dropped for ("http", 127.0.0.1:9955)
2024-08-13T02:27:51.678778Z TRACE reqwest::connect::verbose: 1110a507 write (vectored): b"GET / HTTP/1.1\r\naccept: */*\r\naccept-encoding: gzip\r\nhost: 127.0.0.1:9955\r\n\r\n"
2024-08-13T02:27:51.678844Z  INFO reqwest_chunked_bug::server: Read request from client: "GET / HTTP/1.1\r\naccept: */*\r\naccept-encoding: gzip\r\nhost: 127.0.0.1:9955\r\n\r\n"
2024-08-13T02:27:51.679286Z  INFO reqwest_chunked_bug::server: First part of response has been sent
2024-08-13T02:27:51.679322Z TRACE reqwest::connect::verbose: TODO: verbose poll_read
2024-08-13T02:27:51.679771Z  INFO reqwest_chunked_bug::client: result: {"symbol":"BTCUSDT","openInterest":"70056.543","time":1723425441998}
2024-08-13T02:27:51.679920Z  INFO reqwest_chunked_bug::client: ---------- end request ----------
2024-08-13T02:27:55.679025Z  INFO reqwest_chunked_bug::server: Second part of response has been sent
2024-08-13T02:27:55.679020Z  INFO reqwest_chunked_bug::server: Second part of response has been sent
thread 'thread 'tokio-runtime-worker' panicked at src\server.rstokio-runtime-worker:' panicked at 27src\server.rs::1427:
:read_http_request failed: Os { code: 10053, kind: ConnectionAborted, message: "Программа на вашем хост-компьютере разорвала установленное подключение." }14:
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
read_http_request failed: Os { code: 10053, kind: ConnectionAborted, message: "Программа на вашем хост-компьютере разорвала установленное подключение." }   
2024-08-13T02:27:55.682262Z  INFO reqwest_chunked_bug::server: Second part of response has been sent
thread 'tokio-runtime-worker' panicked at src\server.rs:27:14:
read_http_request failed: Os { code: 10053, kind: ConnectionAborted, message: "Программа на вашем хост-компьютере разорвала установленное подключение." }

If I disable gzip deconding in Client (uncomment line in src/client.rs) then behaviour is correct - each request waits until full body will be consumed and returns only after that moment.

2024-08-13T02:30:18.447052Z  INFO reqwest_chunked_bug::server: Listening on 127.0.0.1:9955
2024-08-13T02:30:18.447729Z  INFO reqwest_chunked_bug::client: ---------- start request ----------
2024-08-13T02:30:18.448322Z TRACE hyper_util::client::legacy::pool: checkout waiting for idle connection: ("http", 127.0.0.1:9955)
2024-08-13T02:30:18.448508Z DEBUG reqwest::connect: starting new connection: http://127.0.0.1:9955/    
2024-08-13T02:30:18.448700Z TRACE hyper_util::client::legacy::connect::http: Http::connect; scheme=Some("http"), host=Some("127.0.0.1"), port=Some(Port(9955))
2024-08-13T02:30:18.448808Z DEBUG hyper_util::client::legacy::connect::http: connecting to 127.0.0.1:9955
2024-08-13T02:30:18.449165Z DEBUG hyper_util::client::legacy::connect::http: connected to 127.0.0.1:9955
2024-08-13T02:30:18.449332Z TRACE hyper_util::client::legacy::client: http1 handshake complete, spawning background dispatcher task
2024-08-13T02:30:18.449507Z TRACE hyper_util::client::legacy::pool: checkout dropped for ("http", 127.0.0.1:9955)
2024-08-13T02:30:18.449509Z  INFO reqwest_chunked_bug::server: Accepted connection from 127.0.0.1:54654
2024-08-13T02:30:18.449771Z TRACE reqwest::connect::verbose: 4e2ae2db write (vectored): b"GET / HTTP/1.1\r\naccept: */*\r\nhost: 127.0.0.1:9955\r\n\r\n"    
2024-08-13T02:30:18.449804Z  INFO reqwest_chunked_bug::server: Read request from client: "GET / HTTP/1.1\r\naccept: */*\r\nhost: 127.0.0.1:9955\r\n\r\n"    
2024-08-13T02:30:18.449960Z  INFO reqwest_chunked_bug::server: First part of response has been sent
2024-08-13T02:30:18.449976Z TRACE reqwest::connect::verbose: TODO: verbose poll_read
2024-08-13T02:30:22.462205Z  INFO reqwest_chunked_bug::server: Second part of response has been sent
2024-08-13T02:30:22.462967Z TRACE reqwest::connect::verbose: TODO: verbose poll_read    
2024-08-13T02:30:22.463353Z TRACE hyper_util::client::legacy::pool: put; add idle connection for ("http", 127.0.0.1:9955)
2024-08-13T02:30:22.463426Z  INFO reqwest_chunked_bug::client: result: ▼♥�V*��M��Q�Rr
v♫      Q�Q�/H���+I-J-.☺J�↑��陚↑♥�J2sS��♀͍�M�LML♀---j☺�Gb;D
2024-08-13T02:30:22.463662Z DEBUG hyper_util::client::legacy::pool: pooling idle connection for ("http", 127.0.0.1:9955)
2024-08-13T02:30:22.464137Z  INFO reqwest_chunked_bug::client: ---------- end request ----------
2024-08-13T02:30:22.464510Z  INFO reqwest_chunked_bug::client: ---------- start request ----------
2024-08-13T02:30:22.464760Z TRACE hyper_util::client::legacy::pool: take? ("http", 127.0.0.1:9955): expiration = Some(300s)
2024-08-13T02:30:22.464959Z DEBUG hyper_util::client::legacy::pool: reuse idle connection for ("http", 127.0.0.1:9955)
2024-08-13T02:30:22.465213Z TRACE reqwest::connect::verbose: 4e2ae2db write (vectored): b"GET / HTTP/1.1\r\naccept: */*\r\nhost: 127.0.0.1:9955\r\n\r\n"    
2024-08-13T02:30:22.465264Z  INFO reqwest_chunked_bug::server: Read request from client: "GET / HTTP/1.1\r\naccept: */*\r\nhost: 127.0.0.1:9955\r\n\r\n"    
2024-08-13T02:30:22.465597Z  INFO reqwest_chunked_bug::server: First part of response has been sent
2024-08-13T02:30:22.465606Z TRACE reqwest::connect::verbose: TODO: verbose poll_read
2024-08-13T02:30:26.475653Z  INFO reqwest_chunked_bug::server: Second part of response has been sent
2024-08-13T02:30:26.476240Z TRACE reqwest::connect::verbose: TODO: verbose poll_read    
2024-08-13T02:30:26.476739Z TRACE hyper_util::client::legacy::pool: put; add idle connection for ("http", 127.0.0.1:9955)
2024-08-13T02:30:26.476800Z  INFO reqwest_chunked_bug::client: result: ▼♥�V*��M��Q�Rr
v♫      Q�Q�/H���+I-J-.☺J�↑��陚↑♥�J2sS��♀͍�M�LML♀---j☺�Gb;D
2024-08-13T02:30:26.477604Z  INFO reqwest_chunked_bug::client: ---------- end request ----------
2024-08-13T02:30:26.477081Z DEBUG hyper_util::client::legacy::pool: pooling idle connection for ("http", 127.0.0.1:9955)
2024-08-13T02:30:26.477836Z  INFO reqwest_chunked_bug::client: ---------- start request ----------
2024-08-13T02:30:26.478274Z TRACE hyper_util::client::legacy::pool: take? ("http", 127.0.0.1:9955): expiration = Some(300s)
2024-08-13T02:30:26.478447Z DEBUG hyper_util::client::legacy::pool: reuse idle connection for ("http", 127.0.0.1:9955)
2024-08-13T02:30:26.478758Z TRACE reqwest::connect::verbose: 4e2ae2db write (vectored): b"GET / HTTP/1.1\r\naccept: */*\r\nhost: 127.0.0.1:9955\r\n\r\n"    
2024-08-13T02:30:26.478780Z  INFO reqwest_chunked_bug::server: Read request from client: "GET / HTTP/1.1\r\naccept: */*\r\nhost: 127.0.0.1:9955\r\n\r\n"    
2024-08-13T02:30:26.479189Z  INFO reqwest_chunked_bug::server: First part of response has been sent
2024-08-13T02:30:26.479201Z TRACE reqwest::connect::verbose: TODO: verbose poll_read
2024-08-13T02:30:30.483203Z  INFO reqwest_chunked_bug::server: Second part of response has been sent
2024-08-13T02:30:30.484606Z TRACE reqwest::connect::verbose: TODO: verbose poll_read    
2024-08-13T02:30:30.485787Z TRACE hyper_util::client::legacy::pool: put; add idle connection for ("http", 127.0.0.1:9955)
2024-08-13T02:30:30.485925Z  INFO reqwest_chunked_bug::client: result: ▼♥�V*��M��Q�Rr
v♫      Q�Q�/H���+I-J-.☺J�↑��陚↑♥�J2sS��♀͍�M�LML♀---j☺�Gb;D
2024-08-13T02:30:30.488265Z  INFO reqwest_chunked_bug::client: ---------- end request ----------
2024-08-13T02:30:30.486928Z DEBUG hyper_util::client::legacy::pool: pooling idle connection for ("http", 127.0.0.1:9955)
2024-08-13T02:30:30.489721Z  WARN reqwest_chunked_bug::server: Connection closed by client

I don't know if an error is in reqwest or in async-compression crate, and I'm afraid my knowledge isn't enough to find a solution.

@Andrey36652
Copy link
Author

Seems like I've found out a reason. async-compression by default will report stream as "Done" as soon as GzipDecoder has finished it's work. https://github.com/Nullus157/async-compression/blob/9880aa99b264825e5c6902b2e25132e6bf2f74b9/src/tokio/bufread/generic/decoder.rs#L91
But one can change this behaviour by calling decoder.multiple_members(true) after constructing GzipDecoder.
So

#[cfg(feature = "gzip")]
DecoderType::Gzip => Poll::Ready(Ok(Inner::Gzip(Box::pin(FramedRead::new(
GzipDecoder::new(StreamReader::new(_body)),
BytesCodec::new(),
))))),
should be changed to

#[cfg(feature = "gzip")]
DecoderType::Gzip => Poll::Ready(Ok(Inner::Gzip(Box::pin(FramedRead::new(
    {
        let mut d = GzipDecoder::new(StreamReader::new(_body));
        d.multiple_members(true);
        d
    },
    BytesCodec::new(),
))))),

And now it works as expected!

Should all other decoders be created with this option enabled? I think yes.

Should this option be enabled only if Transfer-Encoding: chunked header is present or it can be used for non-chunked gzipped body as well? From my testing it works fine (for gzipped content) when:

  • Transfer-Encoding is chunked
  • Content-Length header is present and is corrent and no Transfer-Encoding

But response processing hangs up when neither Transfer-Encoding nor Content-Length are presented. So we could disable multiple_members in such case (but then connection will not be put back in pool and reused 🙁). I'm not familiar with HTTP specification, is it legal for response to not have any of these headers?

And I'm not sure how multiple-resource bodies should be processed. https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages#body_2
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant