-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup of the custom metrics #1435
Comments
Found a previous issue of yours on the same topic 😅 #908 |
This is also causing issues when streaming metrics to the cloud (and probably other outputs), we get |
OpenMetrics spec was released a few weeks ago. Probably it's interesting to give it a look while working on this ticket! (Pawel said that k6 metrics implementation was more-or-less based on Prometheus format). If our metrics adhere to the spec and we follow the best practices, there is less context switch for users working with k6 that come from OTel or Prometheus. |
#1832 (or something like it) is a prerequisite for the current issue. |
With #1876 now all metrics are checked for It also will either throw an exception if throw is enabled or log a warning. This IMO closes most of this issue and specifically the more non-nuanced issues which were bricking the whole metric. The leftovers IMO should be added to separate new issues so they can be discussed separately. |
The most important bugs were fixed, so I'm closing the issue. Not all features were implemented, but those features can be added as separate issues in the future. |
Custom metrics (Counter, Gauge, Trend, Rate) behave unexpectedly when non-numeric values are added. Some non-numeric values are casted to numeric values (true => 1) and some break the internal state of the metric.
Currently, any value passed to
.add()
is accepted and treated in some way towards the internal state.Regardless of what is passed to
.add()
, k6 doesn't raise exceptions. This is the correct behavior of a load testing tool because the tool needs to be resilient against failing SUT.SUT failures are expected and should not cause the test execution to be unreliable.
Non-numeric values should not be counted towards the internal state, but they should not raise exceptions either.
Rate
Rate considers
undefined
,NaN
and"strings"
as positive values and counts them towards the internal state.It considers
false
andnull
as negative values and marshals both to0
. I think this is probably correct.I have to say that it's rather funny that value
-1
is considered positive and is marshaled to+1
. Perhaps this is by design, but I believe that negative values should be counted towards the "negative" side of the metric.All non-numeric values fail JSON marshaling when the JSON output is used
k6 run -o json
Proposed refactoring
Positive values:
true
,1,2,3...
,0.1,...
,Negative values:
null
,false
,0
,-1, -2,...
,-0.1,...
Error values:
Nan
,undefined
,"string"
, any other non-numeric valueAll positive values should be marshaled to 1, all negative values should be marshaled to 0.
When an
error value
is added to the metric, k6 should produce a warning, increment an invalid_value counter for the metric, and continue with the iteration.k6 should not raise Exception and abort the iteration.
Example:
The result of this should look similar to:
I think this behavior makes k6 more resilient when executing stress tests. Ideally, the user should not need to worry about breaking iterations when the SUT is crashing.
Counter
The current default behavior of the Counter metric is not to count but to sum. I believe we should consider it a bug.
Example:
The original metrics implementation in k6 was partially based on Prometheus (#429 (comment)) which has a Counter type that is cumulative (https://prometheus.io/docs/concepts/metric_types/#counter).
As the WIP Openmetrics project is going to be some extension of the Prometheus format, it should be considered in the context of k6 as well (#858, although we don’t have to follow Prometheus of course if we don’t think it makes any sense).
The proposed change adds functionality to the
Counter
metric and fixes bugs, but doesn't break the loose compatibility with Prometheus definition of the metric.proposed refactoring.
The
Counter
metric should have 2 defaultaggregator methods
:count
andsum
.count
should be incremented on every call to.add()
.sum
should be modified on every numeric value.add(numeric_value)
.Gauge
proposed refactoring.
Trend
The trend metric is currently the most resilient metric type we have. It doesn't completely die when invalid values are added. Nevertheless, it exhibits incorrect behavior in several cases.
Current Behavior
Prevent users from redefining built-in metrics
Example script.
Produces Trend metric with values from both, built-in Trend metric and the custom Counter metric.
The
k6 cloud
backend is also confused, showing the values of the built-in Trend metric on the main chart and Counter values in the Analysis Panel.Suggested error:
ERRO: you can't redefine the 'http_req_duration' built-in metric.
Additional options
(To be discussed)
Allow users to specify the initial value for metrics
The text was updated successfully, but these errors were encountered: