http server hangs after ~16,400 connections #66
I made a simple web server benchmark ( http://bit.ly/2pzAFC ). I have no problems with the Go version using 100,000 requests with the Apache web server benchmark tool (ab); however, the Python version returned "Timeout" around the 47,000th request.
interesting roblesjm. i retried the bench on my linux laptop (ubuntu hardy), and there were no problems. i also fired up go on an ec2 instance, to be able to profile with reproducible results - also no problems with regards to the server hanging.

for interest, running two instances of the go example hello world http server on an EC2 High-CPU Medium Instance, and running 2 instances of apache bench from a 2nd High-CPU Medium Instance, was able to put through around 10-11k requests per second. in comparison, running a hello world wsgi server behind spawning/eventlet with 2 processes, 0 threads (greenlets) was able to put through 3500 requests per second.

so the hanging may either just be an issue on darwin, .. or only an issue on my macbook :)
I've been playing with different values of the "n" arg for ab. I've had it work sometimes for 20,000, but not for values over that. When it fails, it shows a couple of different values for different runs:

    apr_poll: The timeout specified has expired (70007)
    Total of 16385 requests completed

    apr_poll: The timeout specified has expired (70007)
    Total of 16384 requests completed

    apr_poll: The timeout specified has expired (70007)
    Total of 16386 requests completed

Found this page: http://serverfault.com/questions/10852/what-limits-the-maximum-number-of-connections-on-a-linux-server

But I don't know how to tune either variable in osx:

    mike-kinneys-macbook-pro:runtime mikekinney$ sysctl -A 2>&1 | grep tcp_max_orphans
    mike-kinneys-macbook-pro:runtime mikekinney$ sysctl -A 2>&1 | grep tcp_tw_reuse

I've tried running ab with the -k option:

    ab -k -n 20000 -c 1 http://localhost:12345/hello

but it still shows:

    apr_poll: The timeout specified has expired (70007)
    Total of 16386 requests completed

It might be a matter of dealing with "keep-alive"... don't know. See: http://www.mail-archive.com/dev@couchdb.apache.org/msg05082.html

Hope that helps. Let me know if there's anything I can do to help. (dbg, log, etc)

    echo $GOOS $GOARCH
    darwin 386

    mike-kinneys-macbook-pro:mike mikekinney$ hg log -l 1
    changeset:   4037:cd0140653802
    tag:         tip
    user:        David Titarenco <[email protected]>
    date:        Fri Nov 13 18:06:47 2009 -0800
    summary:     Created new Conn.Flush() public method so the fd pipeline can be drained arbitrarily by the user.

osx 10.6.2

    mike-kinneys-macbook-pro:http mikekinney$ gcc --version
    i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5646) (dot 1)
Owner changed to [email protected].
This issue is caused by the OS running out of sockets. ab and Go are cycling through socket pairs for communication faster than the OS can reallocate them for reuse. While Go supports keeping the connection alive using HTTP/1.1 techniques, the ab binary uses a different technique. This results in the -k flag for ab having no effect. The ab man page recommends running the binary on a different machine than the server being tested. This would reduce the rate of socket consumption by 50% and allow the OS to return sockets to the server.
I'm not sure. Go does support keepalive using HTTP 1.1, but ab is sending HTTP 1.0 in its GET requests. I'm not familiar enough with HTTP to understand how the two differ. As a quick experiment, I hacked http/server.go to try and use the HTTP 1.1 keepalive code anyway, but that didn't work. If you want me to work on this, I'm happy to do so.
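To make the keep-alive point concrete, here is a minimal sketch of how connection reuse changes the number of sockets a benchmark consumes. It uses a current Go standard library (net/http, net/http/httptest) rather than the 2009 tree discussed in this thread, and it uses Go's own HTTP client instead of ab, so it illustrates the idea without reproducing ab's behaviour. The function name countConnections is made up for this example.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// countConnections (hypothetical helper) sends n sequential GET requests to a
// throwaway local test server and reports how many TCP connections the server
// accepted while serving them.
func countConnections(n int, keepAlive bool) (int64, error) {
	var opened int64

	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "hello, world!\n")
	}))
	// ConnState is called with StateNew once for every accepted connection.
	srv.Config.ConnState = func(c net.Conn, s http.ConnState) {
		if s == http.StateNew {
			atomic.AddInt64(&opened, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	// DisableKeepAlives forces a fresh connection (and socket) per request.
	transport := &http.Transport{DisableKeepAlives: !keepAlive}
	defer transport.CloseIdleConnections()
	client := &http.Client{Transport: transport}

	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return 0, err
		}
		// Drain and close the body so the connection can be reused.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return atomic.LoadInt64(&opened), nil
}

func main() {
	for _, keepAlive := range []bool{true, false} {
		n, err := countConnections(20, keepAlive)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("keep-alive=%v: %d connection(s) for 20 requests\n", keepAlive, n)
	}
}
```

Without keep-alive every request costs one connection, which is the pattern that exhausts sockets during a long benchmark run. With keep-alive the client normally reuses a single connection, although the exact count can vary slightly with timing.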
We're always happy to accept help. I wondered if it was something as simple as the HTTP server not closing its file descriptors, but it seems like you'd run out much earlier. It could be that we need to set the SO_REUSEPORT or some such option on some such socket, or set the linger time to 0, or something like that.
The only change I was able to make that had any effect was to compile the 'ab' binary such that it set SO_LINGER with a linger time of 0 seconds. At that point, I got these results.

    Server Software:
    Server Hostname:        127.0.0.1
    Server Port:            12345

    Document Path:          /hello
    Document Length:        16 bytes

    Concurrency Level:      1
    Time taken for tests:   27.913 seconds
    Complete requests:      100000
    Failed requests:        0
    Write errors:           0
    Total transferred:      7600000 bytes
    HTML transferred:       1600000 bytes
    Requests per second:    3582.62 [#/sec] (mean)
    Time per request:       0.279 [ms] (mean)
    Time per request:       0.279 [ms] (mean, across all concurrent requests)
    Transfer rate:          265.90 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    0   0.1      0      14
    Processing:     0    0   0.6      0      27
    Waiting:        0    0   0.6      0      20
    Total:          0    0   0.6      0      41

    Percentage of the requests served within a certain time (ms)
      50%      0
      66%      0
      75%      0
      80%      0
      90%      0
      95%      0
      98%      0
      99%      1
     100%     41 (longest request)

Even having a socket linger time of 1 second was sufficient to trigger the socket exhaustion problem. No changes were necessary on the Go side. Modifying the Go network code to make explicit socket shutdown() calls and to set SO_LINGER and SO_REUSEADDR had no effect on performance.
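The change above was made in ab's C source. For comparison, a Go client can request the same socket behaviour through (*net.TCPConn).SetLinger. The sketch below is only an illustration against a throwaway local echo listener, not the patch that was applied to ab; dialNoLinger and roundTrips are made-up names. It assumes the usual platform behaviour that closing a socket with a linger time of 0 resets the connection instead of going through the normal close sequence, which is what keeps the client port out of TIME_WAIT.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// dialNoLinger (hypothetical helper) connects to addr and sets SO_LINGER to 0,
// so that Close discards any unsent data instead of lingering.
func dialNoLinger(addr string) (*net.TCPConn, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return nil, err
	}
	tcp, ok := conn.(*net.TCPConn)
	if !ok {
		conn.Close()
		return nil, fmt.Errorf("not a TCP connection: %T", conn)
	}
	if err := tcp.SetLinger(0); err != nil {
		tcp.Close()
		return nil, err
	}
	return tcp, nil
}

// roundTrips (hypothetical helper) opens n short-lived connections to a local
// echo listener, one after another, and returns how many completed.
func roundTrips(n int) (int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()

	// Echo server: copy whatever arrives back to the sender.
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			go func(c net.Conn) {
				defer c.Close()
				io.Copy(c, c)
			}(c)
		}
	}()

	done := 0
	buf := make([]byte, 4)
	for i := 0; i < n; i++ {
		c, err := dialNoLinger(ln.Addr().String())
		if err != nil {
			return done, err
		}
		if _, err := c.Write([]byte("ping")); err != nil {
			c.Close()
			return done, err
		}
		if _, err := io.ReadFull(c, buf); err != nil {
			c.Close()
			return done, err
		}
		c.Close() // with linger 0 this does not wait for a graceful shutdown
		done++
	}
	return done, nil
}

func main() {
	n, err := roundTrips(1000)
	fmt.Println("completed:", n, "error:", err)
}
```

A linger time of 0 trades graceful shutdown for faster port reuse, so it suits a benchmark client better than production code where the last bytes of a response matter.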
Comment 16 by [email protected]:

I'm not sure if I'm seeing something related to this, but here's what I'm finding. Let me know if I should open a new issue. I'm using a simple web server app:

    package main

    import (
    	"encoding/json"
    	"net/http"
    )

    // structs
    type Reading struct {
    	Id   string `json:"id"`
    	Name string `json:"name"`
    }

    func main() {
    	http.HandleFunc("/machines/", func(w http.ResponseWriter, r *http.Request) {
    		// Setup readings
    		readings := prepareReadings()

    		// return readings
    		w.Write([]byte(readingsToString(readings)))
    	})

    	http.ListenAndServe(":3000", nil)
    }

    func readingsToString(readings []Reading) string {
    	data, err := json.Marshal(readings)
    	if err != nil {
    		panic(err)
    	}
    	return string(data)
    }

    func prepareReadings() []Reading {
    	var readings []Reading
    	for i := 1; i <= 1; i++ {
    		readings = append(readings, Reading{Name: "Thing"})
    	}
    	return readings
    }

As you can see, not much to it. I've set up multiple load generation servers that are separate from the web server itself. So in total I have 17 machines: 1 web server and 16 load generation servers. On the load generation servers I am using siege, not ab. Running this command on all servers:

    siege -v "http://192.168.122.31:3000/machines/ POST" -c 500 -r 100 -b

causes me to start getting connection timed out messages. My file descriptor limits for the web server are pretty high...
    [api #3312 -- limits]
    Limit                     Soft Limit    Hard Limit    Units
    Max cpu time              unlimited     unlimited     seconds
    Max file size             unlimited     unlimited     bytes
    Max data size             unlimited     unlimited     bytes
    Max stack size            8388608       unlimited     bytes
    Max core file size        0             unlimited     bytes
    Max resident set          unlimited     unlimited     bytes
    Max processes             59479         59479         processes
    Max open files            4999999       4999999       files
    Max locked memory         65536         65536         bytes
    Max address space         unlimited     unlimited     bytes
    Max file locks            unlimited     unlimited     locks
    Max pending signals       59479         59479         signals
    Max msgqueue size         819200        819200        bytes
    Max nice priority         0             0
    Max realtime priority     0             0
    Max realtime timeout      unlimited     unlimited     us

When I use the command 'lsof | wc -l', I don't get above 1000. It is generally in the ~800-850 range. When I use the command 'watch --interval=2 'netstat -tuna |grep "SYN_RECV"|wc -l'', I am generally in the ~130-250 range.

I'm not sure if this is related, or possibly a problem with siege at this point. Any advice?
This is an old thread, but the following are my findings, for whatever they are worth. From what I found out, this seems to be a Mac OS X problem. See the following thread for information: http://stackoverflow.com/questions/1216267/ab-program-freezes-after-lots-of-requests-why

This is from the above thread:

On Mac OS X the default ephemeral port range is 49152 to 65535, for a total of 16384 ports. You can check this with the sysctl command:

    $ sysctl net.inet.ip.portrange.first net.inet.ip.portrange.last
    net.inet.ip.portrange.first: 49152
    net.inet.ip.portrange.last: 65535

Changing this configuration to start from 32768 helped:

    $ sudo sysctl -w net.inet.ip.portrange.first=32768
    net.inet.ip.portrange.first: 49152 -> 32768

Another setting that I changed was the default timeout (net.inet.tcp.msl), which I set to 1000 ms like so:

    $ sudo sysctl -w net.inet.tcp.msl=1000
    net.inet.tcp.msl: 15000 -> 1000

After changing the above configuration, the following are the results:

    Server Software:
    Server Hostname:        localhost
    Server Port:            4000

    Document Path:          /
    Document Length:        12 bytes

    Concurrency Level:      1
    Time taken for tests:   17.005 seconds
    Complete requests:      100000
    Failed requests:        0
    Write errors:           0
    Total transferred:      14800000 bytes
    HTML transferred:       1200000 bytes
    Requests per second:    5880.58 [#/sec] (mean)
    Time per request:       0.170 [ms] (mean)
    Time per request:       0.170 [ms] (mean, across all concurrent requests)
    Transfer rate:          849.93 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    0   0.1      0      12
    Processing:     0    0   0.0      0       1
    Waiting:        0    0   0.0      0       1
    Total:          0    0   0.1      0      12

    Percentage of the requests served within a certain time (ms)
      50%      0
      66%      0
      75%      0
      80%      0
      90%      0
      95%      0
      98%      0
      99%      0
     100%     12 (longest request)
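The port-range figures quoted above explain why the hang shows up near 16,384 requests. The sketch below works through the arithmetic. It assumes that each closed connection keeps its port in TIME_WAIT for twice the MSL (the conventional TCP rule, not something confirmed in this thread for every OS X release); portCount and maxSustainedRate are names invented for this example.

```go
package main

import (
	"fmt"
	"time"
)

// portCount returns the number of ports in the inclusive range [first, last].
func portCount(first, last int) int {
	return last - first + 1
}

// maxSustainedRate estimates how many new connections per second one
// client/server address pair can sustain before the ephemeral range runs out,
// assuming every closed connection holds its port in TIME_WAIT for 2*MSL.
func maxSustainedRate(ports int, msl time.Duration) float64 {
	timeWait := 2 * msl
	return float64(ports) / timeWait.Seconds()
}

func main() {
	defaultPorts := portCount(49152, 65535) // 16384
	widerPorts := portCount(32768, 65535)   // 32768

	fmt.Println("default range:", defaultPorts, "ports")
	fmt.Println("widened range:", widerPorts, "ports")

	// Default settings: msl = 15000 ms, so roughly 546 connections/second.
	fmt.Printf("default:  %.0f conn/s\n", maxSustainedRate(defaultPorts, 15*time.Second))
	// After both sysctl changes: msl = 1000 ms, so 16384 connections/second.
	fmt.Printf("adjusted: %.0f conn/s\n", maxSustainedRate(widerPorts, 1*time.Second))
}
```

Under these assumptions, a benchmark that opens several thousand connections per second burns through the default range in a few seconds and then stalls until ports leave TIME_WAIT. With the adjusted settings the estimated ceiling is well above the 5880 requests per second reported above.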
Thank you for your investigation. Please do not continue to comment on this issue; it has been closed for nearly four years. I suggest starting a new thread on the golang-nuts mailing list, as more people read that than are subscribed to the issue comments.

Labels changed: added restrict-addissuecomment-commit.
minux added a commit to minux/goios that referenced this issue on Mar 2, 2015
…(SB), Rx This is the first step in fixing golang#66 and make it possible to build PIE without text relocations (Darwin/ARM64 forbids text relocations.) Internal linking is fully working (with verifyAsm=false), but external linking is not.
minux added a commit to minux/goios that referenced this issue on Mar 2, 2015
1. all.bash passed with internal linking 2. misc/cgo/test passed with external linking 3. GOOBJ=2 go install std passed. Fixes golang#66.
This issue was closed.
by andy.gayton: