100% CPU usage on client disconnection when serving a streamed response #1764
Comments
I can see the same behaviour with the latest Hyper v0.12.24 (tokio v0.1.15, tokio-io v0.1.11). The application works correctly when I put it behind an nginx proxy.
Hm, sounds like it's looping here? Line 215 in 877606d
Does the log message after it trigger over and over?
Yes, I also think it's at this line. I'm not sure which log message you mean. If you mean the stack trace, then it's also correct. Every time I interrupt the process in GDB, I get this stack trace (or a stack trace corresponding to this one but reaching only one of the callers).
I meant the "flushed n bytes" debug message.
Sorry, I did not have much time to analyse it. I got back to it today. I've discovered that the flushed message appears regularly until I stop the Docker container with my client. Once the container is stopped, the message stops appearing entirely. This would probably suggest that the
I've managed to create a minimal working example (see the attachment below). It's a simple Hyper server that responds with an infinite streamed response to every request. The project also contains a directory called "test", which holds a simple Python script that polls data from the server and a Dockerfile for creating a Docker image with the Python script. You can re-create the problem using the following steps:

```
# unzip the file and enter the project directory and then:
cargo build
# the server will by default listen on 0.0.0.0:12345:
target/debug/hyper-loop
```

In a separate terminal:

```
# enter the project directory and then:
cd test
docker build -t hyper-loop-test .
# note: replace 172.17.42.1 with the address of your docker network interface
docker run --name=kill-me hyper-loop-test python3 -u test.py http://172.17.42.1:12345/
```

And finally, kill the "kill-me" Docker container from a separate terminal:

```
docker rm -f kill-me
```

The server will never print the "client has left" message as defined in main.rs on line 29, and it will get into an infinite loop. I've also noticed that it's sensitive to the amount of data the server sends at once: the bigger the chunks of data I send, the bigger the chance of getting stuck in an infinite loop.

Attachment: hyper-loop.zip
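For readers without the attachment, here is a rough sketch of what a reproducer like the attached project might look like (hyper 0.12 / futures 0.1). The `InfiniteBody` type, the chunk size, and the "client has left" message are illustrative assumptions, not the actual contents of the zip:

```rust
extern crate futures;
extern crate hyper;

use futures::{Async, Future, Poll, Stream};
use hyper::service::service_fn_ok;
use hyper::{Body, Response, Server};

// An endless body: every poll yields another chunk, so the response never ends
// on its own; it only stops when the client goes away.
struct InfiniteBody {
    chunk: Vec<u8>,
}

impl Stream for InfiniteBody {
    type Item = Vec<u8>;
    type Error = hyper::Error;

    fn poll(&mut self) -> Poll<Option<Self::Item>, Self::Error> {
        Ok(Async::Ready(Some(self.chunk.clone())))
    }
}

impl Drop for InfiniteBody {
    fn drop(&mut self) {
        // The message that never shows up once the container is killed.
        println!("client has left");
    }
}

fn main() {
    let addr = ([0, 0, 0, 0], 12345).into();

    let make_svc = || {
        service_fn_ok(|_req| {
            // Larger chunks seem to make the busy loop more likely to appear.
            let body = InfiniteBody { chunk: vec![b'x'; 64 * 1024] };
            Response::new(Body::wrap_stream(body))
        })
    };

    let server = Server::bind(&addr)
        .serve(make_svc)
        .map_err(|e| eprintln!("server error: {}", e));

    hyper::rt::run(server);
}
```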
Thanks for more details! By chance, do you happen to know if it was logging the debug line "flushed n bytes" in the loop?
As I mentioned, the line "flushed n bytes" stops appearing when I stop the Docker container with my client. Here is a trace log that I captured from the server example:
The last three lines repeat over and over again. Clearly, there's a HUP event processed by the event loop (the line
Ah yes, this is #1716. hyper by default allows for half-closed transports (calling `shutdown` on the write half of the socket while still waiting to read the response).
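For reference, a minimal sketch of how a server like the one above could opt out of half-closed connections, assuming the `http1_half_close(false)` option on hyper 0.12's server builder is the setting being discussed; the address and the trivial service are placeholders:

```rust
extern crate futures;
extern crate hyper;

use futures::Future;
use hyper::service::service_fn_ok;
use hyper::{Body, Response, Server};

fn main() {
    let addr = ([0, 0, 0, 0], 12345).into();

    let server = Server::bind(&addr)
        // Treat EOF on the read half (the client's FIN) as the end of the whole
        // connection instead of keeping the write half open for the response.
        .http1_half_close(false)
        .serve(|| service_fn_ok(|_req| Response::new(Body::from("hello"))))
        .map_err(|e| eprintln!("server error: {}", e));

    hyper::rt::run(server);
}
```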
Thank you for the info. Setting `http1_half_close(false)` on the server builder solved the problem. I'm just wondering whether this should not be the default behaviour. I realize it's a breaking change for some applications. However, if half-closed connections are enabled by default, it'll lead to this problem sooner or later. I guess we don't want production applications to end up with 100% CPU usage for no apparent reason; it might even be considered a DoS vulnerability.

Regarding EPIPE: in my opinion, writing into a (half-)closed socket does not necessarily have to end up with EPIPE, as the remote peer might be behind a firewall/NAT. In such a case, the remote end would simply drop all incoming packets instead of sending a TCP RST packet.
I ran into an issue when load testing my server application based on Hyper 0.12.23. The server provides very long (video) streams to clients. The problem is that sometimes, when a client disconnects, the server ends up in an infinite loop trying to flush the response.
I have a load testing script written in Python that simulates the behaviour of these clients. When I run the script directly from the command line and stop it using SIGINT, the server load decreases and everything works as expected. However, when I run the script in a Docker container and then stop the container, the server application suddenly starts consuming a lot of CPU time.
I managed to get the following stack trace:
It's the same for all busy worker threads.
In my opinion, running the load testing script in a Docker container and then stopping it leaves some connections half-open, and Hyper is probably not able to recognize this. I'm not sure whether this is related to issue #1716.