Wrong metrics on timed out HTTP requests #925

Closed
na-- opened this issue Feb 13, 2019 · 5 comments · Fixed by #1047
Comments

@na--
Member

na-- commented Feb 13, 2019

When we time out an HTTP request like this:

import http from "k6/http";

export default function () {
    http.get("https://httpbin.org/delay/10", { timeout: 3000 });
}

The metrics k6 emits don't make sense:

WARN[0003] Request Failed                                error="Get https://httpbin.org/delay/10: net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
    done [==========================================================] 1 / 1

    data_received..............: 3.1 kB 1.0 kB/s
    data_sent..................: 455 B  151 B/s
    http_req_blocked...........: avg=501.1ms  min=501.1ms  med=501.1ms  max=501.1ms  p(90)=501.1ms  p(95)=501.1ms 
    http_req_connecting........: avg=128.62ms min=128.62ms med=128.62ms max=128.62ms p(90)=128.62ms p(95)=128.62ms
    http_req_duration..........: avg=100.02µs min=100.02µs med=100.02µs max=100.02µs p(90)=100.02µs p(95)=100.02µs
    http_req_receiving.........: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s      
    http_req_sending...........: avg=100.02µs min=100.02µs med=100.02µs max=100.02µs p(90)=100.02µs p(95)=100.02µs
    http_req_tls_handshaking...: avg=312.07ms min=312.07ms med=312.07ms max=312.07ms p(90)=312.07ms p(95)=312.07ms
    http_req_waiting...........: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s      
    http_reqs..................: 1      0.333313/s
    iteration_duration.........: avg=3s       min=3s       med=3s       max=3s       p(90)=3s       p(95)=3s      
    iterations.................: 1      0.333313/s
    vus........................: 1      min=1 max=1
    vus_max....................: 1      min=1 max=1

The http_req_* metrics should add up roughly to 3 seconds, but they're very far from it...
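
As far as I understand our own metrics, for an iteration that makes a single request they should roughly satisfy:

    http_req_duration  ≈ http_req_sending + http_req_waiting + http_req_receiving
    iteration_duration ≈ http_req_blocked + http_req_duration (+ script overhead)

so with a request that's killed by a 3s timeout, blocked plus duration should land somewhere near those 3 seconds, not at ~500ms plus a few hundred microseconds.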

@na-- na-- added this to the v1.0.0 milestone Feb 13, 2019
@mstoykov
Contributor

I would guess that we don't calculate http_req_receiving and http_req_waiting at all. I would say that the waiting should get all the difference?

@na--
Member Author

na-- commented Feb 13, 2019

Yes, seems to me that waiting should soak up the difference, unless the server delay was implemented in such a way that it didn't even start reading the request headers and body. That wasn't the case here apparently, but it could happen. Also, http_req_blocked seems quite high in this case, which is also strange...

@na--
Member Author

na-- commented Feb 14, 2019

The receiving times here also seem suspiciously low: https://serverfault.com/questions/952025/not-understanding-metrics-of-k6/
I'm not sure how to unit-test these things without introducing any more flaky tests... maybe we can spin up an HTTP server and intercept the network traffic to it, injecting some sleep times into either the network connection or the HTTP response, so we can simulate different aspects of a shitty network. Then, if we check that, say, http_req_connecting or http_req_receiving or whatever is more than the sleep we injected, this test should be pretty resilient and shouldn't fail randomly in CI runs...
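
Something along the lines of this rough sketch is what I have in mind (test name, handler, and delay value are all made up, not actual code from our test suite):

package metrics_test

import (
    "net/http"
    "net/http/httptest"
    "testing"
    "time"
)

// Rough sketch only: delay the response body by a known amount, then assert
// that the emitted http_req_receiving sample is at least that big.
func TestHTTPReqReceivingDelay(t *testing.T) {
    const injectedDelay = 300 * time.Millisecond

    srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(http.StatusOK)
        w.(http.Flusher).Flush()  // headers are sent, so "waiting" should be over here
        time.Sleep(injectedDelay) // the body is late => should show up in http_req_receiving
        _, _ = w.Write([]byte("hello"))
    }))
    defer srv.Close()

    // ... run a script that does http.get(srv.URL) and check that the
    // collected http_req_receiving value is >= injectedDelay ...
}

The point is that we'd only assert a lower bound, so scheduling noise in CI can only make the measured value bigger, never smaller.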

@mstoykov
Contributor

This test looks like it will probably need some pretty heavy setup, which might require something like docker-compose to run it more easily. Although my experience with making the network shitty is that VMs and docker have some random problems... Maybe we should first try to make a simple setup that we can run and test stuff with, and then figure out if we can make it into a test?

@na--
Member Author

na-- commented Feb 14, 2019

That's how I started the issue 😜 and I'm all for that approach, both for finding the bugs and for the final verification that we've fixed them. But for the actual fixing, I'd really like it if we had a repeatable automated test that should work according to our expectations but currently fails, at least until we fix the bug. Ideally, we should have at least one such test for each http_req_* metric, just so we're reasonably sure that we won't break something else by fixing the bugs we know about...

And locally, we should be able to simulate most of the delays without any docker containers, just by spinning up a Go HTTP server in the test and either messing with the connection or with the server response. For sure, we won't be able to simulate every possible shitty network, and we're bound to have a ton of issues with HTTP/2, but we should be able to have one simple test per metric...
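
For the connection-level delays, something like a wrapped net.Listener should be enough (again just a sketch with made-up names, assuming we build on httptest):

package metrics_test

import (
    "net"
    "time"
)

// slowConn injects a fixed latency into every Read on the server side of
// the connection, to simulate a slow network without docker or VMs.
type slowConn struct {
    net.Conn
    perReadDelay time.Duration
}

func (c slowConn) Read(b []byte) (int, error) {
    time.Sleep(c.perReadDelay)
    return c.Conn.Read(b)
}

// slowListener wraps every accepted connection in a slowConn.
type slowListener struct {
    net.Listener
    perReadDelay time.Duration
}

func (l slowListener) Accept() (net.Conn, error) {
    conn, err := l.Listener.Accept()
    if err != nil {
        return nil, err
    }
    return slowConn{Conn: conn, perReadDelay: l.perReadDelay}, nil
}

// An httptest server can then be started on top of it, e.g.:
//
//    srv := httptest.NewUnstartedServer(handler)
//    srv.Listener = slowListener{Listener: srv.Listener, perReadDelay: 100 * time.Millisecond}
//    srv.Start()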

na-- added a commit that referenced this issue Jun 10, 2019
This should fix #1041, #1044, #925, and a potential minor data race! Hopefully, without introducing new bugs or performance regressions...
na-- added a commit that referenced this issue Jun 13, 2019
Fix a bunch of HTTP measurement and handling issues

This should fix #1041, #1044, #925, and a potential minor data race! Hopefully, without introducing new bugs or performance regressions...
@na-- na-- removed this from the v1.0.0 milestone Jul 13, 2020