[Lens] Discuss types of rates #62375
Pinging @elastic/kibana-app-arch (Team:AppArch) |
This aggregation may be added to Elasticsearch as a convenience: elastic/elasticsearch#60674 |
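For context, here is a minimal sketch of what a request using that Elasticsearch rate aggregation might look like (index and field names such as `@timestamp` and `bytes` are placeholders, not anything from this issue); the rate agg has to be nested inside a date histogram, and `unit` controls what interval the per-bucket value is normalized to:

```ts
// Hypothetical search body illustrating the Elasticsearch `rate` aggregation.
// Field names are placeholders. The rate agg must sit inside a date_histogram;
// `unit` is the interval the per-bucket value is normalized to.
const rateRequestBody = {
  size: 0,
  aggs: {
    over_time: {
      date_histogram: { field: '@timestamp', calendar_interval: '1d' },
      aggs: {
        // Without `field` this would be a pure document-count ("event") rate;
        // with `field` it sums the field's values before normalizing.
        bytes_per_second: { rate: { field: 'bytes', unit: 'second' } },
      },
    },
  },
};
```

Each daily bucket would then report the field's total scaled down to a per-second value.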
@cchaos @AlonaNadler here are two UI concepts that I have come up with for this aggregation, to show how we could support this: As a rate option on the Count metric: As a separate function: |
I think I prefer to have it as a dedicated function for discoverability. It also isn't always clear what the relationship with count is. |
@AlonaNadler the relationship with count is that "event rate" is something like "count per second". We definitely want a separate function for positive rate. You're saying you would prefer to have both "Positive rate" and "Event rate" as separate functions? |
The new rate aggregation was just merged into Elasticsearch, and supports two options:
I can see both of these being valuable, but in my mind they have separate names. My personal preference is to expose this in two ways:
We can't begin work on this until it's supported in esaggs, so it's currently blocked. |
What if we had a general rate function? The idea behind it is that most users don't know, and wouldn't be able to tell, the difference between rates the way we do in Elastic, and they shouldn't need to.
@AlonaNadler The idea you're proposing is possible from a technical perspective. Based on the use cases I've analyzed for rates, I am not sure that there is a "general" rate like you are asking about. There are specific types of rates based on the data:
So it's definitely possible to combine all of these types of rates into a single function, but we will need to make the user choose one of the 4 options. I think that we would benefit here from real data and examples. I'm planning on writing up example data for each of these 4 options, unless you'd prefer to take this on @AlonaNadler. I also think we are missing clarity from @cchaos on the different options and how we want to present them to the user. |
Sounds good Wylie, please focus on the first 3. Growth might be considered a rate, but it shouldn't be. The rate function mostly addresses our observability and metrics users. Exploring online, I see several ways it is being calculated, though none of them directly correspond to the top 3 you have above. |
Okay, the next step is to get a mockup from @cchaos |
Based on my research, which includes talking with multiple observability folks and researching the ways various solutions calculate rates, I suggest the following. There are two types of fields users want to calculate rates on. Looking at our Beats, most are gauges; looking at other vendors, it seems they more commonly calculate rates assuming gauge metrics. Goals:
What do we expose to users?
- Rate
- % change:
|
Based on your comment and offline discussions, I think we mostly agree, with the exception of gauges. I think this might be a confusion about the terminology, and will attempt to clarify this using examples. These examples are based on the work I've been doing to create a comprehensive list of time series functions for us to work backwards from.
1: Count per hour, count per second. For example, I can show the number of hourly transactions in the ecommerce sample data, even when I query the data per day.
2: For fields that would usually be displayed as a Sum, such as quantity, we can convert these into a rate by taking the Sum over the time interval. For example, the ecommerce sample data has
3: Counters: monotonically increasing numbers, such as network traffic. The function to convert a counter into a rate does not work for other types of numbers. Counters usually have a separate ID field, and if the user doesn't provide an ID we'll produce incorrect data. Here is a correct dataset, showing the average Megabytes per second (a sketch of this counter calculation follows this list).
4: Gauges: represent point-in-time data like CPU or memory. Gauges aren't usually shown as a rate, because they are usually shown as an average. Despite this, some timeseries tools offer this functionality, but I think we should discourage users from attempting this on gauges. It would make sense to apply smoothing functions to gauges, such as moving averages. |
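As an aside, here is a minimal sketch of the counter case described above, expressed as a date histogram with a max metric and a derivative pipeline. The `host.name` split and the `system.network.in.bytes` field are placeholder names, and clamping negative values produced by counter resets is left to the consumer:

```ts
// Hypothetical aggregation for "rate of an accumulating counter":
// max(counter) per time bucket, then a derivative between buckets,
// normalized to a 1-second unit. All field names are placeholders.
const counterRateBody = {
  size: 0,
  aggs: {
    per_series: {
      // Counters have to be split by the series (ID field) they belong to,
      // otherwise values from different hosts get mixed together.
      terms: { field: 'host.name' },
      aggs: {
        over_time: {
          date_histogram: { field: '@timestamp', fixed_interval: '30s' },
          aggs: {
            max_counter: { max: { field: 'system.network.in.bytes' } },
            per_second: {
              // Difference between consecutive buckets, normalized per second.
              // Negative values (counter resets) would still need to be
              // dropped or clamped to zero when displaying the result.
              derivative: { buckets_path: 'max_counter', unit: '1s' },
            },
          },
        },
      },
    },
  },
};
```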
After talking with observability folks @sorantis @crowens @ruflin @exekias, this is what I suggest:
@cchaos wireframe for rate: |
@AlonaNadler I don't think that you've addressed the points I made in my previous comment. Can you please respond to my points more specifically? Here are the main differences I've identified:
|
Check out my last comment: "Rate by default assumes the field is an accumulating counter - calculated positive(derivative(max(field)))"
Yes, I don't think we should support it in this form. I suggest we support a simple rate function that doesn't require adding another calculation and is based on the field being an accumulating counter.
Your 3rd point seems to contradict your 1st. Based on my discussion with the observability team, it seems like counters will continue to be used, and these are the fields their users need rates for most often. Any other examples I showed that were different variations in the Beats dashboards were a calculation mistake. To support other users who might be using gauges (related or unrelated to Beats), I suggested the approach in the wireframe that supports both gauges and counters, assuming counters by default.
Thanks for moving this forward, folks! Some clarifications on what we do in Metricbeat and others: Our definition of ... I'd love to see that rate is normalized to seconds by default, with users being able to change this! |
I did some more thinking about this: "normalizing values to the bucket times" is actually calculating a rate 🤦, so it all depends on the definition we want to have. If we take:
things add up. Answering myself on examples: a gauge could be the ..., a counter could be the ... I think that, for the sake of this conversation, the value of a summable field matches the definition of a gauge. Users would just need to ask for the ... |
This issue has turned into a discussion about rates in general, so I've split out my original questions about just the "event rate" function into a separate issue: #77811 |
@exekias @AlonaNadler I think we are on the same page about counter-type numbers, as well as about "event rates". To summarize:
The remaining issue is gauge-type numbers, which I listed as case 4 above:
I think that I've already shown that CPU and memory don't make sense when converted into a "rate per second", but I've been looking for examples where it might make sense. I finally found one: Metricbeat has an ... As you can see, the value goes up and down over time. Here's what it looks like if I take the derivative of this value and scale it to 1 minute: Here's what it looks like with the same calculation, but split by the index that we're calculating from: All of these calculations will be possible in Lens by default, even if we don't offer a rate function which is able to do them. So do we need this at all? I am proposing that the only dedicated "rate" functions in Lens would be:
|
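To make the "scale it to 1 minute" step above concrete, here is a toy calculation (the numbers are made up, not taken from the screenshots):

```ts
// Scaling a per-bucket difference to a per-minute rate.
// With 30-second buckets, a change of +500 in one bucket is 1000 per minute.
const bucketSeconds = 30;
const deltaPerBucket = 500; // difference between two consecutive buckets
const perMinute = (deltaPerBucket / bucketSeconds) * 60; // => 1000
console.log(`${perMinute} per minute`);
```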
The kinds of rates discussed here are supported in Lens today. |
Edit: This discussion is not focused specifically on the "average event rate"; it's turned into a more general discussion of rates. Keeping the old discussion for posterity. The discussion about a generic rate function begins at this comment: #62375 (comment)
Average event rate lets users represent very small time intervals in their visualizations without building slow queries, and is common in timeseries use cases. For example, if the user wants to visualize the average sales per second, they can either build a date histogram with an interval of 1 second, or build a date histogram with a larger interval and then use an average event rate aggregation.
The definition of average event rate is: the average number of documents added, divided by the bucket time interval, then multiplied by a target interval expressed in milliseconds, seconds, hours, etc.
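As a toy illustration of that definition (the numbers are made up):

```ts
// Average event rate: documents in a bucket, divided by the bucket length,
// scaled to a target interval. With 7200 docs in a 1-hour bucket, the
// average event rate is 2 documents per second.
const docsInBucket = 7200;
const bucketMillis = 60 * 60 * 1000; // 1 hour
const targetMillis = 1000;           // per second
const eventsPerSecond = (docsInBucket / bucketMillis) * targetMillis; // => 2
console.log(`${eventsPerSecond} events per second`);
```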
This aggregation is already possible to do by using clever logic in TSVB, but we can simplify this and make it more widely available to handle time series use cases in other tools in Kibana.
For users, this metric will appear in two ways:
One of the important parts about this calculation is to handle timezones and leap seconds.