traefik hangs - stops handling requests #662
Comments
@r0bj for high performance, I recommend to set a higher value for
Can you try with this change (for example 1000)?
Changing the MaxIdleConnsPerHost value to 1000 didn't fix the issue. The daemon stops handling traffic.
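For reference, MaxIdleConnsPerHost is a field of Go's net/http Transport, and its default is http.DefaultMaxIdleConnsPerHost. A minimal sketch of raising it, assuming a standalone reverse proxy built on net/http/httputil rather than Traefik's actual wiring; newTransport and the backend address are hypothetical:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// newTransport is a hypothetical helper: it returns a transport that keeps
// up to maxIdle idle keep-alive connections per backend host.
func newTransport(maxIdle int) *http.Transport {
	return &http.Transport{
		Proxy:               http.ProxyFromEnvironment,
		MaxIdleConnsPerHost: maxIdle,
	}
}

func main() {
	// Assumed backend address, for illustration only.
	backend, err := url.Parse("http://127.0.0.1:8080")
	if err != nil {
		panic(err)
	}

	proxy := httputil.NewSingleHostReverseProxy(backend)
	proxy.Transport = newTransport(1000)

	fmt.Println("default idle conns per host:", http.DefaultMaxIdleConnsPerHost)
	fmt.Println("configured idle conns per host:",
		proxy.Transport.(*http.Transport).MaxIdleConnsPerHost)
}
```

A higher value lets the proxy reuse more backend connections under load; as reported above, it did not resolve the hang in this case.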
Just to be clear, traefik stays in freeze state even after
Hmm, this is really weird... There must be a regression here. I will investigate this.
I cannot reproduce this issue. I used your toml file and launched several times
I've disabled
@r0bj I still cannot reproduce this issue on my laptop, even using v1.0.2 😕 Traefik continues to respond after benchmarks. Which architecture are you using? amd64? Are you on bare metal or on a VM?
I hit this issue on a bare metal amd64 host. I then tried to reproduce it on a Vagrant/VirtualBox VM (vanilla ubuntu/trusty64), with success.
I am having the same issue. I ended up writing a script which checks
@abhishekamralkar Which version are you using? Is it something new? As I still cannot reproduce this behavior, I need as much information as possible to investigate :)
I can confirm this with the latest release of traefik using using e.g. I was able to reproduce this on 2 physical machines with Ubuntu 16.04 running Traefik in Docker 1.12.1. As soon as the error occurs, Traefik stops reacting, even after wrk is finished. In the logs of traefik the following appears:
http://httpd:80 is the backend that I have configured. Despite the error message, the backend is still there and reacts fine if I reach it manually:
When I shut down the traefik container I see:
When I restart traefik everything works fine again (I don't need to restart the httpd backend server). Do you need any additional information?
This is the version I am using. Not sure what's going on, but it just hung and nothing works.
I think I found the issue. This seems to be due to a race condition in https://github.com/thoas/stats. It occurs when accessing the Could you confirm that you are accessing A workaround is to avoid accessing the webui during tests and change your healthcheck to I'm investigating if the issue is still present in the master branch.
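The kind of race described here can be illustrated with a generic sketch. This is not the code from thoas/stats; statusCounts, record, snapshot and simulate are hypothetical names. It assumes request handlers update a shared map of status-code counters while a health endpoint reads it, and guards both sides with a sync.RWMutex:

```go
package main

import (
	"fmt"
	"sync"
)

// statusCounts is a hypothetical stand-in for per-status-code statistics
// that are written by request handlers and read by a health endpoint.
type statusCounts struct {
	mu     sync.RWMutex
	counts map[int]int
}

func newStatusCounts() *statusCounts {
	return &statusCounts{counts: make(map[int]int)}
}

// record is what a request handler would call; it takes the write lock.
func (s *statusCounts) record(code int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counts[code]++
}

// snapshot is what a health endpoint would call; it copies the map under the
// read lock so the caller never iterates a map that is being written to.
func (s *statusCounts) snapshot() map[int]int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make(map[int]int, len(s.counts))
	for k, v := range s.counts {
		out[k] = v
	}
	return out
}

// simulate runs `workers` goroutines that each record `perWorker` responses
// while one extra goroutine repeatedly takes snapshots.
func simulate(workers, perWorker int) map[int]int {
	s := newStatusCounts()
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				s.record(200)
			}
		}()
	}
	done := make(chan struct{})
	go func() {
		defer close(done)
		for i := 0; i < 100; i++ {
			_ = s.snapshot()
		}
	}()
	wg.Wait()
	<-done
	return s.snapshot()
}

func main() {
	fmt.Println(simulate(30, 1000)[200]) // prints 30000
}
```

Removing the locks and running the program with `go run -race` is the usual way to make this class of bug visible.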
@emilevauge Yes, indeed that has been the case for me. I had the health UI check opened during the tests all the time.
@emilevauge Yes, it seems that was the case. I was accessing /health during the wrk test; this was actually a way of determining whether traefik is still working. I also pointed the marathon healthcheck to traefik /health.
We are experiencing similar behavior. We see this despite our healthcheck using the
@ryanleary then could you give as much detail as possible?
@emilevauge I did some load tests and I can confirm that the health UI was indeed the problem, at least in my case. I can reproduce the problem many times when the health UI is opened, but when it's closed I cannot reproduce the problem anymore, no matter what I try.
@emilevauge Thanks a lot. Do you know when the next traefik release will be out, which has this fix included?
Already released: https://github.com/containous/traefik/releases/tag/v1.0.3 :)
Perfect, thanks a lot. :-)
Thanks. I can confirm that this issue is no longer valid with v1.0.3.
traefik version: 1.0.2
Set open files limit to 1000000:
ulimit -n 1000000
traefik config:
After start everything works:
Then let's test it with wrk:
wrk -t30 -c400 -d30s -H "Host: test-nginx.example.net" http://localhost:8000
After some time (one or a few attempts) traefik is unresponsive on port 8000:
So traffic is no longer processed.
Health API is also unresponsive:
Interestingly, the dashboard is responsive (but without data):
Traefik access logs just stop writing:
Logs (severity DEBUG) keep appearing in the log file, even for requests that get no reply:
strace during an attempt to send a request to traefik (curl -svo /dev/null -H "Host:test-nginx.example.net" http://localhost:8000): https://gist.github.com/r0bj/b618c74b1bc0db5c11f78db08c34fc15
So it seems that the request hits the backend but the response isn't sent to the original sender.
There are many connections in CLOSE_WAIT status: https://gist.github.com/r0bj/c647c76fe65a562ffd2e024e11a260cd
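A socket sits in CLOSE_WAIT when the peer has closed its side and the local process has not yet closed the socket, which is consistent with a proxy that has stopped servicing its connections. A rough sketch for counting such sockets on Linux, assuming the /proc/net/tcp text format in which the fourth column is the hex socket state and 08 means CLOSE_WAIT; countState is a hypothetical helper:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// countState counts the sockets in the given hex state (for example "08" for
// CLOSE_WAIT) in text formatted like /proc/net/tcp. The header line and any
// malformed lines are skipped because their fourth field does not match.
func countState(procNetTCP, state string) int {
	n := 0
	for _, line := range strings.Split(procNetTCP, "\n") {
		fields := strings.Fields(line)
		if len(fields) >= 4 && fields[3] == state {
			n++
		}
	}
	return n
}

func main() {
	// Linux only; IPv6 sockets are listed separately in /proc/net/tcp6.
	data, err := os.ReadFile("/proc/net/tcp")
	if err != nil {
		fmt.Println("cannot read /proc/net/tcp:", err)
		return
	}
	fmt.Println("CLOSE_WAIT sockets:", countState(string(data), "08"))
}
```

This counts sockets for the whole host, not only those owned by traefik, so a steadily growing number during a benchmark is a hint rather than proof.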
Restarting the traefik daemon fixes this issue.
It's easy to replicate this issue:
- ubuntu/trusty64 image with default settings
- wrk benchmark one or more times
One can also replicate this issue by sending wrk requests to a non-existent backend (resulting in 404):
wrk -t30 -c400 -d30 http://localhost:8000/