Proposal for adding `http_req_errors` metric (#1311)
There's a lot to unpack here, but as I mentioned on Slack, I'm very much for addressing the very real UX issues you mentioned, but against some of the specific approaches you proposed. I'll start by sharing all of the potential issues with the proposal I can think of, then hopefully we can arrive at a better alternative that achieves the same goals.
As far as I can see, all of the things you propose could be achieved with custom metrics, even the ones I disagree with 😉. And while I agree that we could definitely improve the UX and offer more flexibility for detecting failures, I disagree that it's basic, since there are a lot of complicated details to consider.
I'm not against adding a new metric, but I'm firmly against adding a separate `http_req_duration_errors` metric. In any case, similar to what we plan for #877, we might also add a default HTTP error threshold. We should just make sure users have a way to overwrite or disable whatever default option we pick.
In k6, problems are rarely solved by adding new global config options... 😉 #883 is a partial compilation of some of the reasons why that is the case... 😅 In this particular case, some parts of your load test might treat 403/404/etc. errors as expected, while others might consider them actual errors, so global configuration options seem particularly unsuited for this. Instead, I think configuring this should probably be handled with the new JS HTTP API (issue... soon... 🤦♂️), probably at the HTTP client level.
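To make that concrete, per-client configuration could hypothetically look something like the sketch below. Everything here (the `Client` class and its `expectedStatuses` option) is invented for illustration; no such API exists in k6 today:

```javascript
// Purely hypothetical sketch: k6 has no HTTP Client class yet.
// The point is that error semantics live on the client, not in global options.
import { Client } from "k6/http"; // hypothetical import

// One client for the part of the test where 404s are expected...
const lookupClient = new Client({ expectedStatuses: [200, 404] });

// ...and another for the part where anything above 399 is a real error.
const apiClient = new Client({ expectedStatuses: { min: 200, max: 399 } });

export default function () {
  lookupClient.get("https://test.example.com/maybe-missing"); // 404 => not an error
  apiClient.get("https://test.example.com/api/health");       // 404 => error
}
```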
I agree, and a new metric would help here. Seeing 0 errors is a good thing, but on its own it doesn't tell the whole story.
As I mentioned, I think these shouldn't be global config options; they should probably be a part of the new JS HTTP API. And while I agree with the default of considering everything above 399 an error, we probably need some better way of specifying the rule. Maybe some sort of JS callback that evaluates HTTP responses before k6 emits metrics, maybe a more powerful rule engine (along the lines of #1313)...

We also shouldn't forget that a non-2xx HTTP status isn't the only way an HTTP request can fail; network/DNS/protocol/etc. problems are also possible error causes. So, whatever way we pick to filter these out should probably be able to handle those as well... Also, as I mentioned in #1298, maybe we're looking at this from the wrong angle. We have to consider both issues together, since it might be enough to just let users tag requests that they expect to fail with some new system tag - then, if k6 sees such a tag, it won't add a `true` value to the new `http_req_errors` metric.
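For illustration, a response-classifying callback might look something like the sketch below; the `setResponseCallback` name and its function-based signature are assumptions made up for this example, not an existing k6 API:

```javascript
// Hypothetical sketch of a user-supplied callback that k6 would run on each
// response before emitting metrics; the API name and signature are assumed.
import http from "k6/http";

http.setResponseCallback(function (resp) {
  // Treat 2xx/3xx as success, and additionally tolerate 404s in this test.
  // Returning false would flag the request in the proposed http_req_errors metric.
  return (resp.status >= 200 && resp.status < 400) || resp.status === 404;
});

export default function () {
  http.get("https://httpbin.test.loadimpact.com/status/404"); // expected, not an error
}
```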
I agree that most of this functionality can be achieved today, with the current version of k6. It can be done, but I strongly disagree that it should be. For the sake of the argument, here's the simplest implementation I could come up with for a script with one batch and one regular request (without resorting to helper utils):

```javascript
import http from "k6/http";
import { sleep } from "k6";
import { Rate, Trend } from "k6/metrics";
let req_success_dur = new Trend("req_success_dur");
let my_httpx_errors = new Rate("my_httpx_errors"); // rate of failed requests
export let options = {
thresholds: {
'my_httpx_errors': [
'rate<0.1'
],
'req_success_dur': [
'p(95)<1000'
]
},
stages: [
{ duration: "2m", target: 100 }, // below normal load
{ duration: "5m", target: 100 },
{ duration: "2m", target: 200 }, // normal load
{ duration: "5m", target: 200 },
{ duration: "2m", target: 300 }, // around the breaking point
{ duration: "5m", target: 300 },
{ duration: "2m", target: 400 }, // beyond the breaking point
{ duration: "5m", target: 400 },
{ duration: "10m", target: 0 }, // scale down. Recovery stage.
]
}
export default function() {
let responses = http.batch(
[
["GET", "https://httpbin.test.loadimpact.com/status/503"],
["GET", "https://httpbin.test.loadimpact.com/status/403"],
["GET", "https://httpbin.test.loadimpact.com/status/404"],
["GET", "https://httpbin.test.loadimpact.com/status/201"],
]
)
Object.values(responses).forEach(resp => {
if (resp.status >= 400 && resp.status != 404 && resp.status != 403) {
my_httpx_errors.add(true);
}
else {
req_success_dur.add(Math.round(resp.timings.duration));
my_httpx_errors.add(false);
}
});
let resp = http.get('https://httpbin.test.loadimpact.com/status/200')
if (resp.status >= 400 && resp.status != 404 && resp.status != 403) {
my_httpx_errors.add(true);
}
else {
req_success_dur.add(Math.round(resp.timings.duration));
my_httpx_errors.add(false);
}
sleep(1);
}
```

This script works and records the custom metrics as intended. Here's the terminal output of the above script; the new, correct metrics are highlighted in green. `http_req_blocked` and friends still account for both successful and failed requests. That's not great. Again, I think that k6 should by default record these metrics only for successful requests, and the manual workaround implemented above is not a viable solution.
When doing stress testing, errors are expected to appear, and you want to continue the test for an extended period of time to see when the system recovers. I believe it's still useful to know what the `http_req_duration` is for successful requests when 90% of other requests are failing; it helps to determine the stress failure mode. If errors are included in `http_req_duration`, the metric gets skewed and becomes much less useful.

Another reason for implementing something like this:

```javascript
let options = {
  http_errors: ["status>399"], // default setting
  http_errors_exclude: [404, 403, 1099]
};
```

is for the Cloud UI to be able to understand which requests should be considered successful/failed. Today the cloud UI assumes that a status code >= 400 is a failure and marks it red in the UI. This is clearly incorrect in many cases. The cloud should be able to understand these settings and make use of them for the UI and performance alerts. The error metric should be displayed in the cloud by default, with the same prominence as `http_req_duration`.

A complementary functionality would be to introduce a change to the HTTP API. I purposefully didn't mention it in the original issue description because that's a rabbit hole 😄
Even though this is a more flexible solution, I suggest we don't do that as a part of this work. I think that the options solution solves the problem in 95% of the cases.
Whatever format we decide on needs to be readable by the cloud. It would be great if it didn't depend on JS.
That's a good point. Perhaps we should rename the proposed metric, then.
#1298 would benefit from the HTTP API change I'm suggesting above, but I think the two issues can still be handled separately.
Errors are fundamental to load/stress testing. We should track and display errors by default in the k6 UI and the cloud UI. This should not be left for users to define and implement on their end as a special case. I believe we should also have default thresholds based on the rate of errors.

I'm sorry about the length of this comment. I need to learn how to be more concise 😅
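For illustration, a default error-rate threshold could be expressed with today's threshold syntax; note that the `http_req_errors` metric below doesn't exist yet and the 1% rate is just a placeholder:

```javascript
// Hypothetical: a threshold k6 could apply by default on the proposed metric,
// written the same way user-defined thresholds are written today.
export let options = {
  thresholds: {
    // Fail the test if more than 1% of HTTP requests are errors.
    http_req_errors: ["rate<0.01"],
  },
};
```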
Completely agree.
I'd say that something like this is a perfect place for helper utils 😉 Or, more precisely, a wrapper around the `k6/http` functions:

```javascript
import http from "k6/http";
import { sleep } from "k6";
import { Rate, Trend } from "k6/metrics";
let req_success_dur = new Trend("req_success_dur");
let my_httpx_errors = new Rate("my_httpx_errors"); // rate of failed requests
function checkResponse(resp) {
if (resp.status >= 400 && resp.status != 404 && resp.status != 403) {
my_httpx_errors.add(true);
return false;
}
req_success_dur.add(Math.round(resp.timings.duration));
my_httpx_errors.add(false);
return true;
}
function myGet(url, params) {
let resp = http.get(url, params);
checkResponse(resp);
return resp;
}
function myBatch(reqs) {
let resps = http.batch(reqs);
Object.values(resps).forEach(checkResponse);
return resps;
}
export let options = {
thresholds: {
'my_httpx_errors': [
'rate<0.1'
],
'req_success_dur': [
'p(95)<1000'
]
},
stages: [
{ duration: "2m", target: 100 }, // below normal load
{ duration: "5m", target: 100 },
{ duration: "2m", target: 200 }, // normal load
{ duration: "5m", target: 200 },
{ duration: "2m", target: 300 }, // around the breaking point
{ duration: "5m", target: 300 },
{ duration: "2m", target: 400 }, // beyond the breaking point
{ duration: "5m", target: 400 },
{ duration: "10m", target: 0 }, // scale down. Recovery stage.
]
}
export default function () {
myBatch([
["GET", "https://httpbin.test.loadimpact.com/status/503"],
["GET", "https://httpbin.test.loadimpact.com/status/403"],
["GET", "https://httpbin.test.loadimpact.com/status/404"],
["GET", "https://httpbin.test.loadimpact.com/status/201"],
]);
myGet('https://httpbin.test.loadimpact.com/status/200');
sleep(1);
}
```
While, as I said, I'd like to improve these parts of k6 to handle the most common use cases in a better way (while also allowing the handling of more complex ones with some work), I don't think something like the above wrapper is too onerous of a workaround until then.
👍 🤝
Agree, the current output is messy. I'll respond to your other issue (#1319) later, but in general I agree that we shouldn't show most of the HTTP metrics we currently show, unless users explicitly request them or have a threshold on them. But we can't remove them until we have a way that allows users to add them back... So, I guess this is tracked in #570, for lack of a better issue.
Maybe, I can see how this makes sense, though I think this should be configurable... For example, you're thinking in terms of application errors (i.e. HTTP status code >= 400), but there are also network/DNS/HTTPS/etc. issues, for which the duration measurement might be more important... And we shouldn't forget intentional/unavoidable errors, which we should also measure (#1298), and that decision has to be taken locally, not be a global option. In any case, I can see use cases for both measuring "errors" and for not measuring them, so, in the interest of not locking out any users, I think this should be configurable... We can have a flag (and other options) in the new JS HTTP API that enables/disables this behavior. In that API, not measuring them can also be the default value, if we think that's the generally correct approach, but in the current API we should preserve the current behavior, I think...
Hmm you're right, this is a very valid use case. 👍
As I mentioned earlier, I strongly think that such options aren't the best UX for this behavior. And while I wouldn't mind something like this in the new HTTP API, I'm especially against having them in the global `options`. So, I think if we just tag the metrics from the expectedly erroneous requests (however those are defined) with a new system tag, that should be enough for both k6 and the cloud to handle them appropriately.
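Something close to this already works with user-defined tags; in the sketch below, the custom `expected_error` tag is a stand-in for whatever system tag gets chosen:

```javascript
import http from "k6/http";

export let options = {
  thresholds: {
    // Threshold only over requests NOT tagged as expected errors,
    // using k6's existing sub-metric tag filtering.
    "http_req_duration{expected_error:false}": ["p(95)<1000"],
  },
};

export default function () {
  // This 404 is expected, so tag it accordingly; it won't affect the threshold.
  http.get("https://httpbin.test.loadimpact.com/status/404", {
    tags: { expected_error: "true" },
  });

  // A regular request, explicitly tagged as not expected to fail.
  http.get("https://httpbin.test.loadimpact.com/status/200", {
    tags: { expected_error: "false" },
  });
}
```

The difference with a real system tag would be that k6 itself (and the cloud) could attach and interpret it without this manual tagging.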
I don't understand this point, sorry.
This proposal I actually like 😅 Again, it probably should fall under the new HTTP API, but in my head, this (or something like this) is a nice way to handle it. Basically, the k6 HTTP client can emit the error metrics itself.
I'm trying to argue that it will be both easier and better (i.e. 100% of cases) not to have global options 😉
Again, we have per-request metrics, it would be much simpler for the cloud to process something it already understands, than to parse (probably with differences from k6...) some new global option.
Not sure... My gut feeling is that we shouldn't mix these things, so I'd keep them separate.
I disagree. As far as I understand, they want to consider only some 404 errors as expected. After their resource is created, I don't see why they'd want to ignore any other errors. Also, #1298 (comment) ... 😉
I agree 👍
Me too 😅
Wow!! I will be brave and participate in this conversation. I won't discuss the details; I just want to share the cases where I've experienced that a default HTTP error metric could be useful:

**Simple threshold on HTTP errors**

Easily define thresholds based on HTTP errors without having to define custom metrics.
**Visualize HTTP errors in Datadog**

Provide a general visualization of the HTTP error rate or error count in Datadog. This is not possible today because Datadog does not allow querying metrics by tags using the OR operator.
When reviewing other Datadog integrations, I found that several of them report similar error metrics.
@sniku, @robingustafsson and @na--, can we bump the priority of this issue?
We still don't really have an agreement on what this feature should actually look like. And IIRC, that eventual design was tentatively intended to be included in the new k6 HTTP API, not the current one. Not only because we probably want #1321 beforehand and we don't want to make #883 worse with global options, but also because global options will likely be too restrictive for this feature. You might want to treat HTTP errors one way in one scenario and another way in the other scenarios. So, these distinctions need to be made at something like an HTTP-client level, not globally, which is what global options would force.
**Prioritization**
@sniku I think we should spend some time finding an agreement on this API.
I don't know the status of the new k6 HTTP API, but it looks experimental and further in the future. I wonder what the impediments are to including this in the current API.

**API design**

Skipping cosmetic preferences, IMO, the global option together with the per-request override would cover the common cases.

**Breaking changes**

I think we all agree that error response times should be excluded "somehow" from success responses. I raise the concern that if error response times are excluded from `http_req_duration`, it will change the behavior of existing tests and thresholds.
@ppcano, we have discussed this feature during the last k6 sync call, and we have decided to work on the specification. The feature is not as simple as I initially thought, so it takes some time to work out all the edge cases and address all the valid concerns raised above. We plan to include this feature in the v0.31 release. I'll update this issue or create a new one when the specification has been finalized, hopefully today.
Closing this in favor of the new issue: #1828
This is a proposal for adding a new, standard metric to k6 called `http_req_errors` that records the number of failed HTTP requests. Some of this functionality can be achieved today with a custom metric, but it's so basic that I suggest it should be included by default.
**Feature Description**
Successful requests often have a longer `http_req_duration` than failed requests. This metric would allow k6 to display a separate `http_req_duration` for successful requests and a separate `http_req_duration_errors` for failed requests.

It's not possible to hardcode which HTTP status codes should be considered errors, because these codes are context-specific. For example, HTTP 404 is not an error if someone is testing for 404s. This problem can be solved by adding additional configuration options, as described below.
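For reference, the configuration options floated earlier in this thread would look roughly like this (the `http_errors` and `http_errors_exclude` names are part of the proposal, not an existing k6 API):

```javascript
// Proposed, not yet existing, global options from this thread:
export let options = {
  // Treat any response with a status above 399 as an error by default...
  http_errors: ["status>399"],
  // ...except these context-specific codes, which are expected in this test.
  http_errors_exclude: [404, 403, 1099],
};
```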
The biggest advantage of this functionality is the UX improvement. Currently, k6 doesn't distinguish between failed and successful requests, so the terminal output of a test where every request failed is indistinguishable from that of a fully successful test. With this change, one could see at first glance whether the test was successful, without adding checks and thresholds.
**Suggested Solution (optional)**
k6 terminal output could look similar to this:
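The screenshot originally attached here isn't preserved. As a purely illustrative mock-up (all numbers invented), the end-of-test summary might gain error-aware lines alongside the existing ones:

```
http_req_duration..........: avg=153ms med=140ms p(95)=420ms
http_req_duration_errors...: avg=14ms  med=11ms  p(95)=35ms
http_req_errors............: 9.20%  552 out of 6000 requests failed
```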