
Latency spikes if Body ever returns Poll::Pending #3187

Closed
MorganR opened this issue Mar 28, 2023 · 4 comments
Labels
A-server Area: server. C-performance Category: performance. This is making existing behavior go faster.

Comments

MorganR commented Mar 28, 2023

Version

I'm not sure exactly which module is the culprit, but here are my dependencies:

  • hyper: 0.14.25
  • http: 0.2.9
  • h2: 0.3.16
  • axum: 0.6.12

Platform

Ubuntu 22.04.2 via WSL:

Linux MORGAN-SURFACE 5.15.90.1-microsoft-standard-WSL2 ... x86_64 GNU/Linux

Also tested on a Debian GCP VM:

Linux instance-1 5.10.0-21-cloud-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux

Description

I initially filed this as an issue in tower-http, since I noticed it when using tower's ServeDir middleware. However, on further investigation, it seems to be an issue with how `Body` implementations are polled.

If a Body::poll_data ever returns Poll::Pending, then latency seems to jump to about 50ms (it's normally about 100-200us). I have reproduced this as simply as I can in this repro (make sure you're looking at the "body" branch).

I've implemented two very simple Body structs:

  • SimpleBody: returns a number of lines of text, providing one line per call to Body::poll_data. This performs almost identically to the simple case that returns all lines as a single string.
  • DelayedBody: does the same thing as SimpleBody, except that it returns Poll::Pending on every other call to Body::poll_data. As soon as Poll::Pending is returned, latency jumps from a couple hundred microseconds to about 50 milliseconds.
@MorganR MorganR added the C-bug Category: bug. Something is wrong. This is bad! label Mar 28, 2023
@seanmonstar seanmonstar added A-server Area: server. C-performance Category: performance. This is making existing behavior go faster. and removed C-bug Category: bug. Something is wrong. This is bad! labels Mar 28, 2023
@seanmonstar (Member) commented

If there's something we can do to improve performance of streaming bodies, we should!

I'm curious about a potential simple thing first: could it be Nagle's? What if you enable tcp_nodelay?


MorganR commented Mar 28, 2023

Wow, that seems to remove the issue! Thank you Sean!

I'm unfortunately not very knowledgeable about the Nagle algorithm, or this layer of the stack generally. If Nagle introduces some kind of delayed send, then I'm naively wondering if it's possible for the server to realize that there is no more data to fetch (i.e. poll_data returns Poll::Ready(None)), and force an immediate send? Otherwise it appears to wait about 49ms longer than needed.

@hyperium hyperium deleted a comment from mahjelan Mar 28, 2023
@seanmonstar (Member) commented

It's a setting for the TCP implementation in the OS. It tries to prevent "small writes" by delaying them, hoping userland will write more data into the kernel buffer before it sends. This dates from when per-packet overhead was expensive, so the kernel tried to limit waste. It's usually a bad idea at this point, but there are still times when people want to reduce bandwidth consumption, so most libraries don't mess with the default: they let the operating system pick, and an application can override if it decides to.

There's no "flush" for Nagle's. It's either on, or off. hyper has pushed all the data into the TCP socket, but if Nagle's is active, we can't force it to send out faster.

@seanmonstar seanmonstar closed this as not planned Won't fix, can't repro, duplicate, stale Mar 28, 2023

MorganR commented Mar 29, 2023

Thank you for the thorough explanation, Sean! Makes sense.
