
Attempt to return dropped Body to pool #94

Open
davidiw opened this issue Jun 25, 2020 · 8 comments
Labels
help wanted Extra attention is needed

Comments

@davidiw
Contributor

davidiw commented Jun 25, 2020

In one of my apps, I use ureq to read/write from a key/value server. When I perform writes, the status response is sufficient to know failure or success. However, if I do not read the response body, the connection doesn't return to the pool since the data remains on the stream. This seems odd and requires adding an explicit read on the response in my app just to allow the stream to return to the pool. I was curious why this is the case and whether or not we could just flush the buffer:

https://github.com/algesten/ureq/blob/master/src/pool.rs#L218

@jsha
Collaborator

jsha commented Jun 25, 2020

At the HTTP protocol level, something has to read the full response body before the TCP connection is ready to carry another request. In ureq, that's done with into_reader, into_string, etc. Reading diem/diem#4735 it looks like you do read the body, but only on 200s. That means 403s, 500s, etc. will not get returned to the pool and will get closed when the Response is dropped. If you're seeing something other than that, please let us know. It seems like what you most likely want is to unconditionally call into_string() and then check the status code.

We've talked recently about a possible breaking change to the API to use Result. I'm planning to write up a detailed proposal soon, but I think that returning an Err for connection-breaking errors would perhaps make it clearer that anything receiving an Ok needs to read the stream.

Also on diem/diem#4735 I see someone mention implementing drop. I assume that means "read all outstanding response bytes when Response is dropped." I think that's not the right approach, since that can wind up performing large amounts of work or even blocking indefinitely.

@algesten
Owner

algesten commented Jun 26, 2020

The connection is automatically put back in the pool on drop (if possible); we could add an explicit API for that.

On the Response object we could add something like this:

  fn is_poolable(&self) -> bool {
    // check if response is exhausted
  }

  fn return_to_pool(self) {
    // 1. exhaust response into void
    // 2. by virtue of taking `self`, the rest just happens
  }

The idea is that it would aid workflows where pooling is desirable.

@davidiw
Contributor Author

davidiw commented Jun 26, 2020

Yeah, in my case it is obvious that recycling should happen without any issues, but I can totally see, as @jsha alluded to, that this cannot be the default behavior -- if someone attempts to download a file just to see if it exists, for instance, a naive approach would end up downloading the entire file /just/ to reuse a connection.

@jsha
Collaborator

jsha commented Jun 26, 2020

I'm not keen on adding an explicit return_to_pool, when we already have various functions that accomplish the same goal, like into_string(). I would rather make the API more obvious / easier to use in a way that maximizes efficiency.

Here are a couple of other interventions that could increase connection pooling in cases where the user isn't necessarily calling into_string() or into_reader():

  • If there is no response body (that is, the response headers do not contain Content-Length or Transfer-Encoding), go ahead and return the connection to the pool immediately without waiting for into_string() / into_reader(). Store some internal state to remember that we gave up the Stream and return a zero-length reader if asked.
  • If the response body is known to be zero-length (Content-Length: 0), do the same thing.
  • Similarly, if the response body is known to be small (Content-Length: N for N < 1024), proactively read the body into an internal buffer, and return a Cursor over that buffer when asked for into_reader().

By the way, thinking about this made me realize another trickiness of the API. Since into_reader() / into_string() consume the Response, the status code and headers become unavailable to users once those methods are called. I think in many cases the user may want to do something like:

  let resp = ureq::get("http://example.com").call();
  let resp_body = resp.into_string()?;
  println!("response status {}, body {}", resp.status(), resp_body);

The above fails because resp was moved / consumed by into_string(). A user can work around it by storing a copy of the status code and headers before consuming the body, but it's a bit awkward.

I don't have a particular recommendation for how to change this API right now but will think about it some more.

@algesten
Owner

algesten commented Jun 27, 2020

> easier to use in a way that maximizes efficiency.

I like these ideas. Repooling early is not a thought that crossed my mind.

If we're considering breaking changes, one idea could be to use the informally standardized Rust http crate. In hreq I made that a design goal, and it has some advantages for the user.

  • It's standard.
  • It's agnostic wrt body.
  • Body is separate.

A common approach with http would be into_parts(). Here's some fantasy API:

  use ureq::Body;
  let resp: http::Response<Body> = ureq::get("http://example.com").call();
  let (parts, body) = resp.into_parts();

  println!("response status {}, body {}", parts.status, body.into_string()?);

And we could use extension traits to make it even more ergonomic.

  use ureq::ResponseExt;
  use ureq::Body;
  let resp: http::Response<Body> = ureq::get("http://example.com").call();

  println!("response status {}, body {}", resp.status(), resp.read_string()?);

@tv42

tv42 commented Jan 13, 2021

For what it's worth, in a similar scenario of wanting to reuse a socket with unread response leftovers, Go's net/http consumes & discards up to 2 KiB of data; if that doesn't reach the end of the response, the socket is thrown out. It also checks Content-Length first and doesn't bother reading at all if it looks like there'd be too much to consume.

https://github.com/golang/go/blob/go1.15.6/src/net/http/client.go#L693-L701

(Tangent: this is only true for HTTP/1; HTTP/2 and /3 are different beasts. Not that that would matter for ureq as it is today.)

@jsha
Collaborator

jsha commented Jan 13, 2021

That's a very helpful reference. Thank you!

@algesten algesten changed the title Stream (socket) pooling depends on reading streams to completion Read up to 2kb of data for a Dropped body to attempt to return to pool Nov 26, 2024
@algesten algesten changed the title Read up to 2kb of data for a Dropped body to attempt to return to pool Attempt to return dropped Body to pool Nov 26, 2024
@algesten algesten added the help wanted Extra attention is needed label Nov 26, 2024
@algesten
Owner

In ureq 3.x this still needs doing. Dropping the Body before it is finished currently results in that connection being gone.

  • A new config option to attempt to return connections of dropped bodies to the pool. Defaults to false.
  • impl Drop trait for the relevant types
  • Attempt to discard up to 2 KiB (or a configurable amount)
