Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slow_cooker slow-down after about 5-10 minutes #59

Open
olix0r opened this issue May 31, 2017 · 7 comments
Open

slow_cooker slow-down after about 5-10 minutes #59

olix0r opened this issue May 31, 2017 · 7 comments
Assignees

Comments

@olix0r
Copy link
Member

olix0r commented May 31, 2017

After about 5-10 minutes of running slow_cooker against a simple python http server, throughput drops by 25%.

:; python -m SimpleHTTPServer 8000
Serving HTTP on 0.0.0.0 port 8000 ...
:; slow_cooker -concurrency 4 -qps 30 http://localhost:8000
# sending 120 GET req/s with concurrency=4 to http://localhost:8000 ...
#                      good/b/f t   goal%   min [p50 p95 p99  p999]  max bhash change
2017-05-31T17:03:01Z   1196/0/0 1200  99% 10s   1 [  6  10  10   12 ]   12      0 +
2017-05-31T17:03:11Z   1202/0/0 1200 100% 10s   2 [  6  10  11   15 ]   15      0 
2017-05-31T17:03:21Z   1179/0/0 1200  98% 10s   1 [  6  11  25  220 ]  220      0 
2017-05-31T17:03:31Z   1200/0/0 1200 100% 10s   2 [  6  10  14   53 ]   53      0 
2017-05-31T17:03:41Z   1201/0/0 1200 100% 10s   1 [  6  11  34   51 ]   51      0 
2017-05-31T17:03:51Z   1199/0/0 1200  99% 10s   2 [  6  10  46   54 ]   54      0 
2017-05-31T17:04:01Z   1200/0/0 1200 100% 10s   1 [  6  12  41   54 ]   54      0 
2017-05-31T17:04:11Z   1202/0/0 1200 100% 10s   1 [  6  12  42   53 ]   53      0 
2017-05-31T17:04:21Z   1197/0/0 1200  99% 10s   2 [  6  10  38   55 ]   55      0 
2017-05-31T17:04:31Z   1202/0/0 1200 100% 10s   2 [  6  10  43   52 ]   52      0 
2017-05-31T17:04:41Z   1203/0/0 1200 100% 10s   2 [  6  12  45   60 ]   60      0 
2017-05-31T17:04:51Z   1196/0/0 1200  99% 10s   1 [  6  11  41   54 ]   54      0 
2017-05-31T17:05:01Z   1203/0/0 1200 100% 10s   1 [  6  12  46   54 ]   54      0 
2017-05-31T17:05:11Z   1196/0/0 1200  99% 10s   1 [  6  13  40   84 ]   84      0 
2017-05-31T17:05:21Z   1200/0/0 1200 100% 10s   2 [  6  12  33   50 ]   50      0 
2017-05-31T17:05:31Z   1202/0/0 1200 100% 10s   2 [  6  12  44   57 ]   57      0 
2017-05-31T17:05:41Z   1199/0/0 1200  99% 10s   2 [  6  12  40   53 ]   53      0 
2017-05-31T17:05:51Z   1200/0/0 1200 100% 10s   2 [  6  10  40   51 ]   51      0 
2017-05-31T17:06:01Z   1200/0/0 1200 100% 10s   2 [  6  12  40   56 ]   56      0 
2017-05-31T17:06:11Z   1201/0/0 1200 100% 10s   1 [  6  11  39   53 ]   53      0 
2017-05-31T17:06:21Z   1200/0/0 1200 100% 10s   2 [  7  10  37   53 ]   53      0 
2017-05-31T17:06:31Z   1192/0/0 1200  99% 10s   2 [  7  13  44  131 ]  131      0 
2017-05-31T17:06:41Z   1196/0/0 1200  99% 10s   2 [  6  11  45   64 ]   64      0 
2017-05-31T17:06:51Z   1203/0/0 1200 100% 10s   1 [  6  11  40   53 ]   53      0 
2017-05-31T17:07:01Z   1200/0/0 1200 100% 10s   1 [  6  10  45   54 ]   54      0 
2017-05-31T17:07:11Z   1200/0/0 1200 100% 10s   1 [  6  12  40   52 ]   52      0 
2017-05-31T17:07:22Z   1200/0/0 1200 100% 10s   1 [  6  11  35   46 ]   46      0 
2017-05-31T17:07:32Z   1084/0/0 1200  90% 10s   1 [  6  11  46 1013 ] 1013      0 
2017-05-31T17:07:42Z   1200/0/0 1200 100% 10s   2 [  6  12  41   53 ]   53      0 
2017-05-31T17:07:52Z   1200/0/0 1200 100% 10s   2 [  7  12  36   60 ]   60      0 
2017-05-31T17:08:02Z   1112/0/0 1200  92% 10s   2 [  6  11  35   47 ]   47      0 
2017-05-31T17:08:12Z    862/0/0 1200  71% 10s   1 [  5  20  79  262 ]  262      0 
2017-05-31T17:08:22Z    901/0/0 1200  75% 10s   1 [  5   8  34   43 ]   43      0 
2017-05-31T17:08:32Z    901/0/0 1200  75% 10s   2 [  5   9  37   41 ]   41      0 
2017-05-31T17:08:42Z    900/0/0 1200  75% 10s   1 [  4   9  20   52 ]   52      0 
2017-05-31T17:08:52Z    900/0/0 1200  75% 10s   1 [  4   8  18   34 ]   34      0 
2017-05-31T17:09:02Z    900/0/0 1200  75% 10s   1 [  4   8  17   31 ]   31      0 
2017-05-31T17:09:12Z    900/0/0 1200  75% 10s   1 [  5   8  18   29 ]   29      0 
2017-05-31T17:09:22Z    900/0/0 1200  75% 10s   2 [  5   8  19   30 ]   30      0 
2017-05-31T17:09:32Z    900/0/0 1200  75% 10s   2 [  5   8  22   32 ]   32      0 
2017-05-31T17:09:42Z    900/0/0 1200  75% 10s   2 [  5   8  20   28 ]   28      0 
2017-05-31T17:09:52Z    900/0/0 1200  75% 10s   2 [  5   8  25   34 ]   34      0 
2017-05-31T17:10:02Z    900/0/0 1200  75% 10s   2 [  5   8  24   31 ]   31      0 
2017-05-31T17:10:12Z    900/0/0 1200  75% 10s   1 [  5   8  20   30 ]   30      0 
2017-05-31T17:10:22Z    900/0/0 1200  75% 10s   1 [  5   8  23   33 ]   33      0 
2017-05-31T17:10:32Z    900/0/0 1200  75% 10s   2 [  5   8  27   33 ]   33      0 
2017-05-31T17:10:42Z    900/0/0 1200  75% 10s   1 [  5   8  21   31 ]   31      0 
2017-05-31T17:10:52Z    900/0/0 1200  75% 10s   1 [  5   8  26   31 ]   31      0 
2017-05-31T17:11:02Z    900/0/0 1200  75% 10s   2 [  5   8  21   30 ]   30      0 
2017-05-31T17:11:12Z    901/0/0 1200  75% 10s   1 [  5   8  21   31 ]   31      0 
2017-05-31T17:11:22Z    901/0/0 1200  75% 10s   1 [  5   8  21   32 ]   32      0 
2017-05-31T17:11:32Z    901/0/0 1200  75% 10s   1 [  5   8  19   30 ]   30      0 
2017-05-31T17:11:42Z    898/0/0 1200  74% 10s   2 [  5   8  27   34 ]   34      0 
2017-05-31T17:11:52Z    901/0/0 1200  75% 10s   1 [  5   8  24   32 ]   32      0 
2017-05-31T17:12:02Z    901/0/0 1200  75% 10s   1 [  5   8  20   30 ]   30      0 
2017-05-31T17:12:12Z    900/0/0 1200  75% 10s   1 [  5   8  20   33 ]   33      0 
2017-05-31T17:12:22Z    900/0/0 1200  75% 10s   1 [  5   8  20   29 ]   29      0 
2017-05-31T17:12:32Z    900/0/0 1200  75% 10s   1 [  5   8  25   33 ]   33      0 
2017-05-31T17:12:42Z    900/0/0 1200  75% 10s   1 [  5   9  22   38 ]   38      0 
2017-05-31T17:12:52Z    900/0/0 1200  75% 10s   1 [  5   8  25   33 ]   33      0 
2017-05-31T17:13:02Z    900/0/0 1200  75% 10s   2 [  5   8  20   33 ]   33      0 
2017-05-31T17:13:12Z    900/0/0 1200  75% 10s   1 [  5   8  27   33 ]   33      0 
2017-05-31T17:13:22Z    900/0/0 1200  75% 10s   1 [  5   8  23   32 ]   32      0 
2017-05-31T17:13:32Z    900/0/0 1200  75% 10s   2 [  5   8  21   34 ]   34      0 
2017-05-31T17:13:42Z    900/0/0 1200  75% 10s   2 [  5   8  26   32 ]   32      0 
2017-05-31T17:13:52Z    900/0/0 1200  75% 10s   2 [  5   8  26   33 ]   33      0 
2017-05-31T17:14:02Z    900/0/0 1200  75% 10s   2 [  5   8  22   29 ]   29      0 
2017-05-31T17:14:12Z    900/0/0 1200  75% 10s   1 [  5   8  20   34 ]   34      0 
2017-05-31T17:14:22Z    900/0/0 1200  75% 10s   2 [  5   8  22   33 ]   33      0 
2017-05-31T17:14:32Z    900/0/0 1200  75% 10s   1 [  5   8  19   28 ]   28      0 
2017-05-31T17:14:42Z    901/0/0 1200  75% 10s   2 [  4   8  29   35 ]   35      0 
2017-05-31T17:14:52Z    901/0/0 1200  75% 10s   1 [  5   9  18   30 ]   30      0 
2017-05-31T17:15:02Z    898/0/0 1200  74% 10s   2 [  5   8  27   32 ]   32      0 
2017-05-31T17:15:12Z    790/0/0 1200  65% 10s   1 [  5   9  22   37 ]   37      0 
2017-05-31T17:15:22Z    899/0/0 1200  74% 10s   1 [  5   8  24   34 ]   34      0 
2017-05-31T17:15:32Z    897/0/0 1200  74% 10s   2 [  5   9  28   97 ]   97      0 
2017-05-31T17:15:42Z    901/0/0 1200  75% 10s   1 [  5   9  21   29 ]   29      0 
2017-05-31T17:15:52Z    899/0/0 1200  74% 10s   2 [  5   8  25   36 ]   36      0 
2017-05-31T17:16:02Z    901/0/0 1200  75% 10s   1 [  5   9  22   34 ]   34      0 
2017-05-31T17:16:12Z    900/0/0 1200  75% 10s   2 [  5   8  17   31 ]   31      0 
2017-05-31T17:16:22Z    900/0/0 1200  75% 10s   2 [  5   8  20   28 ]   28      0 
2017-05-31T17:16:32Z    900/0/0 1200  75% 10s   2 [  5   8  20   27 ]   27      0 
2017-05-31T17:16:42Z    900/0/0 1200  75% 10s   2 [  5   8  26   32 ]   32      0 
2017-05-31T17:16:52Z    900/0/0 1200  75% 10s   2 [  5   9  19   26 ]   26      0 
2017-05-31T17:17:02Z    900/0/0 1200  75% 10s   2 [  5   9  29   36 ]   36      0 
2017-05-31T17:17:12Z    900/0/0 1200  75% 10s   1 [  5   9  21   35 ]   35      0 
2017-05-31T17:17:22Z    900/0/0 1200  75% 10s   2 [  5   9  22   31 ]   31      0 
2017-05-31T17:17:32Z    900/0/0 1200  75% 10s   2 [  5   9  19   31 ]   31      0 
2017-05-31T17:17:42Z    900/0/0 1200  75% 10s   1 [  5   8  20   32 ]   32      0 
2017-05-31T17:17:52Z    900/0/0 1200  75% 10s   1 [  5   8  18   32 ]   32      0 
2017-05-31T17:18:02Z    897/0/0 1200  74% 10s   1 [  5   9  19   42 ]   42      0 
2017-05-31T17:18:12Z    902/0/0 1200  75% 10s   1 [  5  10  17   32 ]   32      0 
2017-05-31T17:18:22Z    899/0/0 1200  74% 10s   2 [  5   8  25   33 ]   33      0 
2017-05-31T17:18:32Z    900/0/0 1200  75% 10s   2 [  5   8  26   30 ]   30      0 
2017-05-31T17:18:42Z    900/0/0 1200  75% 10s   2 [  5   8  30   35 ]   35      0 
2017-05-31T17:18:52Z    901/0/0 1200  75% 10s   1 [  5   8  26   33 ]   33      0 
2017-05-31T17:19:02Z    900/0/0 1200  75% 10s   2 [  5   8  16   25 ]   25      0 
2017-05-31T17:19:12Z    901/0/0 1200  75% 10s   1 [  5   8  24   33 ]   33      0 
2017-05-31T17:19:22Z    899/0/0 1200  74% 10s   1 [  5  10  25   34 ]   34      0 
2017-05-31T17:19:32Z    900/0/0 1200  75% 10s   2 [  5   9  21   27 ]   27      0 
2017-05-31T17:19:42Z    901/0/0 1200  75% 10s   1 [  5  10  21   32 ]   32      0 
2017-05-31T17:19:52Z    900/0/0 1200  75% 10s   2 [  5   9  15   21 ]   21      0 
2017-05-31T17:20:02Z    900/0/0 1200  75% 10s   2 [  5   9  23   46 ]   46      0 
2017-05-31T17:20:12Z    900/0/0 1200  75% 10s   1 [  5  10  25   49 ]   49      0 
2017-05-31T17:20:22Z    900/0/0 1200  75% 10s   1 [  5   9  18   29 ]   29      0 
2017-05-31T17:20:32Z    900/0/0 1200  75% 10s   1 [  5   9  29   37 ]   37      0 
2017-05-31T17:20:42Z    900/0/0 1200  75% 10s   1 [  5   9  15   34 ]   34      0 
2017-05-31T17:20:52Z    900/0/0 1200  75% 10s   2 [  5   8  17   35 ]   35      0 
2017-05-31T17:21:02Z    900/0/0 1200  75% 10s   2 [  5   8  21   34 ]   34      0 
2017-05-31T17:21:12Z    900/0/0 1200  75% 10s   2 [  5   8  27   38 ]   38      0 
2017-05-31T17:21:22Z    902/0/0 1200  75% 10s   2 [  5   9  30   37 ]   37      0 
2017-05-31T17:21:32Z    898/0/0 1200  74% 10s   1 [  5   8  27   35 ]   35      0 
2017-05-31T17:21:42Z    900/0/0 1200  75% 10s   2 [  5   9  26   40 ]   40      0 
2017-05-31T17:21:52Z    900/0/0 1200  75% 10s   2 [  5   8  22   33 ]   33      0 
2017-05-31T17:22:02Z    901/0/0 1200  75% 10s   2 [  5   9  19   35 ]   35      0 
2017-05-31T17:22:12Z    900/0/0 1200  75% 10s   2 [  5  10  23   28 ]   28      0 
2017-05-31T17:22:22Z    900/0/0 1200  75% 10s   2 [  5  10  28   38 ]   38      0 
2017-05-31T17:22:32Z    902/0/0 1200  75% 10s   1 [  4   9  25   51 ]   51      0 
2017-05-31T17:22:42Z    900/0/0 1200  75% 10s   2 [  5   8  24   32 ]   32      0 
2017-05-31T17:22:52Z    899/0/0 1200  74% 10s   2 [  5   8  17   34 ]   34      0 
2017-05-31T17:23:02Z    901/0/0 1200  75% 10s   1 [  5   8  18   31 ]   31      0 
2017-05-31T17:23:12Z    900/0/0 1200  75% 10s   1 [  4   9  16   22 ]   22      0 
2017-05-31T17:23:22Z    900/0/0 1200  75% 10s   2 [  5   8  20   30 ]   30      0 
2017-05-31T17:23:32Z    899/0/0 1200  74% 10s   2 [  5   8  24   30 ]   30      0 
2017-05-31T17:23:42Z    901/0/0 1200  75% 10s   2 [  5   9  20   34 ]   34      0 
2017-05-31T17:23:52Z    900/0/0 1200  75% 10s   2 [  5   8  24   31 ]   31      0 
2017-05-31T17:24:02Z    900/0/0 1200  75% 10s   2 [  6   9  28   35 ]   35      0 
2017-05-31T17:24:12Z    900/0/0 1200  75% 10s   1 [  5   9  29   40 ]   40      0 
2017-05-31T17:24:22Z    900/0/0 1200  75% 10s   2 [  5   8  22   30 ]   30      0 
2017-05-31T17:24:32Z    900/0/0 1200  75% 10s   1 [  5   9  21   28 ]   28      0 
2017-05-31T17:24:42Z    897/0/0 1200  74% 10s   2 [  5  13  32   68 ]   68      0 
2017-05-31T17:24:52Z    900/0/0 1200  75% 10s   1 [  4   9  19   31 ]   31      0 
2017-05-31T17:25:02Z    900/0/0 1200  75% 10s   1 [  5   8  26   34 ]   34      0 
2017-05-31T17:25:12Z    900/0/0 1200  75% 10s   1 [  5   8  20   30 ]   30      0 
2017-05-31T17:25:22Z    900/0/0 1200  75% 10s   1 [  4   8  18   31 ]   31      0 
2017-05-31T17:25:32Z    900/0/0 1200  75% 10s   1 [  5   8  27   32 ]   32      0 
2017-05-31T17:25:42Z    900/0/0 1200  75% 10s   2 [  5   9  19   29 ]   29      0 
2017-05-31T17:25:52Z    896/0/0 1200  74% 10s   2 [  5  10  24  108 ]  108      0 

Restarting the python server without slow_cooker yields no improvement. Restarting slow_cooker, however, resumes at 120rps.

The same behavior is exhibited with and without -noreuse, presumably because the python server only speaks HTTP/1.0:

:; curl -v localhost:8000
* Rebuilt URL to: localhost:8000/
*   Trying ::1...
* TCP_NODELAY set
* Connection failed
* connect to ::1 port 8000 failed: Connection refused
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET / HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.51.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: SimpleHTTP/0.6 Python/2.7.10
< Date: Wed, 31 May 2017 17:55:10 GMT
< Content-type: text/html; charset=utf-8
< Content-Length: 1596
:; uname -a
Darwin doe 16.6.0 Darwin Kernel Version 16.6.0: Fri Apr 14 16:21:16 PDT 2017; root:xnu-3789.60.24~6/RELEASE_X86_64 x86_64
:; python --version
Python 2.7.10
:; (cd $GOPATH/src/github.com/buoyantio/slow_cooker && git describe --tags)
1.1.0-10-g095a524

slow_cooker doesn't appear to be hitting any file descriptor limits:

:; netstat -an | awk '$5 ~ /.8000$/ {print $6}' |sort | uniq -c |sort -rn
2451 TIME_WAIT
lsof is pretty uninteresting
:; lsof -p 24884 # python server
COMMAND   PID USER   FD      TYPE             DEVICE   SIZE/OFF    NODE NAME
Python  24884  ver  cwd       DIR                1,4       1326 1459453 /Users/ver/b/linkerd
Python  24884  ver  txt       REG                1,4      51712 3137143 /System/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
Python  24884  ver  txt       REG                1,4    2616016 3137138 /System/Library/Frameworks/Python.framework/Versions/2.7/Python
Python  24884  ver  txt       REG                1,4      52736 3138936 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_locale.so
Python  24884  ver  txt       REG                1,4      65904 3138980 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/time.so
Python  24884  ver  txt       REG                1,4     144704 3138942 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_socket.so
Python  24884  ver  txt       REG                1,4      48224 3138930 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_functools.so
Python  24884  ver  txt       REG                1,4     167712 3138944 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_ssl.so
Python  24884  ver  txt       REG                1,4      57008 3138955 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/cStringIO.so
Python  24884  ver  txt       REG                1,4     257872 3138934 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_io.so
Python  24884  ver  txt       REG                1,4      78848 3138968 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/math.so
Python  24884  ver  txt       REG                1,4      60160 3138951 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/binascii.so
Python  24884  ver  txt       REG                1,4      57296 3138931 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_hashlib.so
Python  24884  ver  txt       REG                1,4      84208 3138945 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_struct.so
Python  24884  ver  txt       REG                1,4      52624 3138940 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_random.so
Python  24884  ver  txt       REG                1,4      52000 3138961 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/fcntl.so
Python  24884  ver  txt       REG                1,4      65872 3138976 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/select.so
Python  24884  ver  txt       REG                1,4      71328 3138923 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_collections.so
Python  24884  ver  txt       REG                1,4      85104 3138971 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/operator.so
Python  24884  ver  txt       REG                1,4     108160 3138967 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/itertools.so
Python  24884  ver  txt       REG                1,4      59952 3138932 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_heapq.so
Python  24884  ver  txt       REG                1,4      73408 3138977 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/strop.so
Python  24884  ver  txt       REG                1,4      44720 3138941 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_scproxy.so
Python  24884  ver  txt       REG                1,4      43488 3138964 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/grp.so
Python  24884  ver  txt       REG                1,4     697712 3169325 /usr/lib/dyld
Python  24884  ver  txt       REG                1,4  662257664 3569894 /private/var/db/dyld/dyld_shared_cache_x86_64h
Python  24884  ver    0u      CHR               16,7 0t60163019     815 /dev/ttys007
Python  24884  ver    1u      CHR               16,7 0t60163019     815 /dev/ttys007
Python  24884  ver    2u      CHR               16,7 0t60163019     815 /dev/ttys007
Python  24884  ver    3u     IPv4 0xd5fcac6e7757be41        0t0     TCP *:irdmi (LISTEN)
Python  24884  ver    4   NPOLICY                                       
Python  24884  ver    5u     unix 0xd5fcac6e63ac6901        0t0         ->0xd5fcac6e63ac6839
Python  24884  ver    6u    systm                           0t0         
Python  24884  ver    7r      CHR               14,1     0t2500     579 /dev/urandom
#; ~/b/rs/linkerd-tcp  (ver/router ✘)
:; lsof -p $(pgrep -f slow_cooker)
COMMAND     PID USER   FD     TYPE             DEVICE  SIZE/OFF    NODE NAME
slow_cook 25043  ver  cwd      DIR                1,4      1326 1459453 /Users/ver/b/linkerd
slow_cook 25043  ver  txt      REG                1,4   8794292 3379776 /Users/ver/s/go/bin/slow_cooker
slow_cook 25043  ver  txt      REG                1,4    697712 3169325 /usr/lib/dyld
slow_cook 25043  ver  txt      REG                1,4 662257664 3569894 /private/var/db/dyld/dyld_shared_cache_x86_64h
slow_cook 25043  ver    0u     CHR               16,9 0t1928237     819 /dev/ttys009
slow_cook 25043  ver    1u     CHR               16,9 0t1928237     819 /dev/ttys009
slow_cook 25043  ver    2u     CHR               16,9 0t1928237     819 /dev/ttys009
slow_cook 25043  ver    3u   systm                          0t0         
slow_cook 25043  ver    4u    unix 0xd5fcac6e6fd8fdb1       0t0         ->0xd5fcac6e6fd8e001
slow_cook 25043  ver    8u    IPv6 0xd5fcac6e58520831       0t0     TCP localhost:60725->localhost:irdmi (CLOSED)
slow_cook 25043  ver    9u  KQUEUE                                      count=0, state=0xa
@stevej
Copy link
Contributor

stevej commented Jun 2, 2017

Happy to dig into this: this is slow_cooker built from master?

@stevej stevej self-assigned this Jun 2, 2017
@stevej
Copy link
Contributor

stevej commented Jun 2, 2017

Unable to reproduce from master on a gcloud host to another gcloud host running nginx, lemme keep digging.

stevej@netty-test-8:~$ ./slow_cooker_linux_amd64 -concurrency 4 -qps 30 -interval 1m http://perf-target-4:8080
# sending 120 GET req/s with concurrency=4 to http://perf-target-4:8080 ...
#                      good/b/f t   goal%    min [p50 p95 p99  p999]  max bhash change
2017-06-02T03:29:10Z   7196/0/0 7200  99% 1m0s   0 [  0   0   1    4 ]    4      0 +
2017-06-02T03:30:10Z   7200/0/0 7200 100% 1m0s   0 [  0   0   1   18 ]   18      0 +
2017-06-02T03:31:10Z   7200/0/0 7200 100% 1m0s   0 [  0   0   2    4 ]    4      0 +
2017-06-02T03:32:10Z   7201/0/0 7200 100% 1m0s   0 [  0   0   2    3 ]    3      0 +
2017-06-02T03:33:10Z   7203/0/0 7200 100% 1m0s   0 [  0   0   2    3 ]    3      0
2017-06-02T03:34:10Z   7200/0/0 7200 100% 1m0s   0 [  0   0   1    3 ]    3      0
2017-06-02T03:35:10Z   7200/0/0 7200 100% 1m0s   0 [  0   0   2    2 ]    2      0
2017-06-02T03:36:10Z   7200/0/0 7200 100% 1m0s   0 [  0   0   2    7 ]    7      0
2017-06-02T03:37:10Z   7200/0/0 7200 100% 1m0s   0 [  0   0   2    3 ]    3      0

@stevej
Copy link
Contributor

stevej commented Jun 2, 2017

An overnight run on gcloud was ok, no slowdown noted. I'll try with -noreuse since you mentioned python's built-in webserver doesn't support keep-alives.

Updated: That's a lot of sockets in TIME_WAIT, I wonder if we're eventually running into an issue opening new sockets. Let me figure out how to set some socket options in go's http client and we can tweak SO_LINGER

Updated 2: No slowdown with -noreuse, I'm still digging into how to either explicitly close the connection from golang's somewhat anemic http client or how to set socket options.

@stevej
Copy link
Contributor

stevej commented Jun 2, 2017

Ok, I definitely see this on the mac with -noreuse enabled on a local nginx.

2017-06-02T10:17:59-07:00   1200/0/0 1200 100% 10s   1 [  2   3  11   17 ]   17      0
2017-06-02T10:18:09-07:00   1200/0/0 1200 100% 10s   1 [  2   7  13   17 ]   17      0
2017-06-02T10:18:19-07:00   1200/0/0 1200 100% 10s   1 [  2   3  11   15 ]   15      0
2017-06-02T10:18:29-07:00   1200/0/0 1200 100% 10s   1 [  2   6  17   47 ]   47      0
2017-06-02T10:18:39-07:00   1200/0/0 1200 100% 10s   1 [  2   3   7   30 ]   30      0
2017-06-02T10:18:49-07:00   1200/0/0 1200 100% 10s   1 [  2   5  10   17 ]   17      0
2017-06-02T10:18:59-07:00   1196/0/0 1200  99% 10s   1 [  2   6  17   75 ]   75      0
2017-06-02T10:19:09-07:00    929/0/0 1200  77% 10s   1 [  2   6  15   39 ]   39      0
2017-06-02T10:19:19-07:00    902/0/0 1200  75% 10s   1 [  2   5  12   24 ]   24      0
2017-06-02T10:19:29-07:00    898/0/0 1200  74% 10s   1 [  2   8  16   28 ]   28      0
2017-06-02T10:19:39-07:00    900/0/0 1200  75% 10s   1 [  2   9  15   24 ]   24      0
2017-06-02T10:19:49-07:00    901/0/0 1200  75% 10s   1 [  2   4  10   20 ]   20      0

@stevej
Copy link
Contributor

stevej commented Jun 2, 2017

It continues to drop in throughput, after 20 minutes it's down to 50%.

After another 5 minutes, we're down to 25% throughput. this is troubling.

@stevej
Copy link
Contributor

stevej commented Jun 2, 2017

Keeping total requests the same but reducing concurrency to 1 prevents the slow down from occurring.

# sending 120 GET req/s with concurrency=1 to http://localhost:8080 ...
#                           good/b/f t   goal%    min [p50 p95 p99  p999]  max bhash change
2017-06-02T11:14:05-07:00   7130/0/0 7200  99% 1m0s   0 [  2   5   7  112 ]  112      0 +
2017-06-02T11:15:05-07:00   7080/0/0 7200  98% 1m0s   0 [  2   5  13  119 ]  119      0 +
2017-06-02T11:16:05-07:00   7130/0/0 7200  99% 1m0s   0 [  2   5  10   95 ]   95      0
2017-06-02T11:17:05-07:00   7161/0/0 7200  99% 1m0s   0 [  2   5   8   36 ]   36      0
2017-06-02T11:18:05-07:00   7125/0/0 7200  98% 1m0s   0 [  2   5   9  115 ]  115      0
2017-06-02T11:19:05-07:00   7110/0/0 7200  98% 1m0s   0 [  2   5  14   61 ]   61      0
2017-06-02T11:20:05-07:00   7121/0/0 7200  98% 1m0s   0 [  2   4  11   75 ]   75      0
2017-06-02T11:21:05-07:00   7116/0/0 7200  98% 1m0s   0 [  1   5   8  410 ]  410      0
2017-06-02T11:22:05-07:00   7164/0/0 7200  99% 1m0s   0 [  1   5   8   30 ]   30      0
2017-06-02T11:23:05-07:00   7157/0/0 7200  99% 1m0s   0 [  1   5   9   73 ]   73      0
2017-06-02T11:24:05-07:00   7141/0/0 7200  99% 1m0s   0 [  1   5  11   45 ]   45      0
2017-06-02T11:25:05-07:00   7166/0/0 7200  99% 1m0s   0 [  1   4   8   38 ]   38      0
2017-06-02T11:26:05-07:00   7168/0/0 7200  99% 1m0s   0 [  1   4   7   34 ]   34      0
2017-06-02T11:27:05-07:00   7101/0/0 7200  98% 1m0s   0 [  1   4  13  134 ]  134      0
2017-06-02T11:28:05-07:00   7135/0/0 7200  99% 1m0s   1 [  1   3   9  127 ]  127      0
2017-06-02T11:29:05-07:00   7150/0/0 7200  99% 1m0s   0 [  1   4  10   57 ]   57      0
2017-06-02T11:30:05-07:00   7169/0/0 7200  99% 1m0s   0 [  1   3   8   36 ]   36      0```

@stevej
Copy link
Contributor

stevej commented Jun 26, 2017

Let me know if #60 helps with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants