[Lens] Discuss types of rates #62375
Pinging @elastic/kibana-app-arch (Team:AppArch) |
This aggregation may be added to Elasticsearch as a convenience: elastic/elasticsearch#60674 |
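For context, here is a minimal sketch of what a request using that Elasticsearch rate aggregation might look like (index and field names such as `@timestamp` and `bytes` are placeholders, not anything from this issue); the rate agg has to be nested inside a date histogram, and `unit` controls what interval the per-bucket value is normalized to:

```ts
// Hypothetical search body illustrating the Elasticsearch `rate` aggregation.
// Field names are placeholders. The rate agg must sit inside a date_histogram;
// `unit` is the interval the per-bucket value is normalized to.
const rateRequestBody = {
  size: 0,
  aggs: {
    over_time: {
      date_histogram: { field: '@timestamp', calendar_interval: '1d' },
      aggs: {
        // Without `field` this would be a pure document-count ("event") rate;
        // with `field` it sums the field's values before normalizing.
        bytes_per_second: { rate: { field: 'bytes', unit: 'second' } },
      },
    },
  },
};
```

Each daily bucket would then report the field's total scaled down to a per-second value.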
@cchaos @AlonaNadler here are two UI concepts that I have come up with for this aggregation, to show how we could support this: As a rate option on the Count metric: As a separate function: |
I think I prefer to have it as a dedicated function for discoverability. It also isn't always clear what the relationship with count is. |
@AlonaNadler the relationship with count is that "event rate" is something like "count per second". We definitely want a separate function for positive rate. You're saying you would prefer to have both "Positive rate" and "Event rate" as separate functions? |
The new rate aggregation was just merged into Elasticsearch, and supports two options:
I can see both of these being valuable, but in my mind they have separate names. My personal preference is to expose this in two ways:
We can't begin work on this until it's supported in esaggs, so it's currently blocked. |
What if we had a general rate function? The idea behind it is that most users don't know, and wouldn't be able to tell, the difference between rates the way we do in Elastic, and they shouldn't need to.
@AlonaNadler The idea you're proposing is possible from a technical perspective. Based on the use cases I've analyzed for rates, I am not sure that there is a "general" rate like you are asking about. There are specific types of rates based on the data:
So it's definitely possible to combine all of these types of rates into a single function, but we will need to make the user choose one of the 4 options. I think that we would benefit here from real data and examples. I'm planning on writing up example data for each of these 4 options, unless you'd prefer to take this on @AlonaNadler. I also think we are missing clarity from @cchaos on the different options and how we want to present them to the user. |
Sounds good Wylie, please focus on the first 3. Growth might be considered a rate, but it shouldn't be. The rate function mostly addresses our observability and metrics users. Exploring online, I see several ways it is being calculated, though none of them directly correspond to the top 3 you have above. |
Okay, the next step is to get a mockup from @cchaos |
Based on my research, which includes talking with multiple observability folks and researching the ways various solutions calculate rates, I suggest the following. There are two types of fields users want to calculate rates on. Looking at our Beats, most are gauges; looking at other vendors, it seems they more commonly calculate rates assuming gauge metrics. Goals:
What do we expose to users?
- Rate
- % change:
|
Based on your comment and offline discussions, I think we mostly agree, with the exception of gauges. I think this might be a confusion about the terminology, and will attempt to clarify this using examples. These examples are based on the work I've been doing to create a comprehensive list of time series functions for us to work backwards from.
1: Count per hour, count per second. For example, I can show the number of hourly transactions in the ecommerce sample data, even when I query the data per day.
2: For fields that would usually be displayed as a Sum, such as quantity, we can convert these into a rate by taking the Sum over the time interval. For example, the ecommerce sample data has
3: Counters: monotonically increasing numbers, such as network traffic. The function to convert a counter into a rate does not work for other types of numbers. Counters usually have a separate ID field, and if the user doesn't provide an ID we'll produce incorrect data. Here is a correct dataset, showing the average Megabytes per second (a sketch of this counter calculation follows this list).
4: Gauges: represent point-in-time data like CPU or memory. Gauges aren't usually shown as a rate, because they are usually shown as an average. Despite this, some timeseries tools offer this functionality, but I think we should discourage users from attempting this on gauges. It would make sense to apply smoothing functions to gauges, such as moving averages. |
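As an aside, here is a minimal sketch of the counter case described above, expressed as a date histogram with a max metric and a derivative pipeline. The `host.name` split and the `system.network.in.bytes` field are placeholder names, and clamping negative values produced by counter resets is left to the consumer:

```ts
// Hypothetical aggregation for "rate of an accumulating counter":
// max(counter) per time bucket, then a derivative between buckets,
// normalized to a 1-second unit. All field names are placeholders.
const counterRateBody = {
  size: 0,
  aggs: {
    per_series: {
      // Counters have to be split by the series (ID field) they belong to,
      // otherwise values from different hosts get mixed together.
      terms: { field: 'host.name' },
      aggs: {
        over_time: {
          date_histogram: { field: '@timestamp', fixed_interval: '30s' },
          aggs: {
            max_counter: { max: { field: 'system.network.in.bytes' } },
            per_second: {
              // Difference between consecutive buckets, normalized per second.
              // Negative values (counter resets) would still need to be
              // dropped or clamped to zero when displaying the result.
              derivative: { buckets_path: 'max_counter', unit: '1s' },
            },
          },
        },
      },
    },
  },
};
```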
After talking with observability folks @sorantis @crowens @ruflin @exekias, this is what I suggest:
@cchaos wireframe for rate: |
@AlonaNadler I don't think that you've addressed the points I made in my previous comment. Can you please respond to my points more specifically? Here are the main differences I've identified:
|
Check out my last comment: "Rate by default assumes the field is an accumulating counter - calculated positive(derivative(max(field)))"
Yes, I don't think we should support it in this form. I suggest we support a simple rate function that doesn't require adding another calculation and is based on the field being an accumulating counter.
Your 3rd point seems to contradict your 1st. Based on my discussion with the observability team, it seems like counters will continue to be used, and these are the fields their users need rates for most often. Any other examples I showed that were different variations in the Beats dashboards were a calculation mistake. To support other users who might be using gauges (related or unrelated to Beats), I suggested the approach in the wireframe that supports both gauges and counters, assuming counters by default.
Thanks for moving this forward, folks! Some clarifications on what we do in Metricbeat and others: Our definition of ... I'd love to see that rate is normalized to seconds by default, with users being able to change this! |
I did some more thinking about this: "normalizing values to the bucket times" is actually calculating a rate 🤦, so it all depends on the definition we want to have. If we take:
things add up. Answering myself on examples: a gauge could be the ..., a counter could be the ... I think that, for the sake of this conversation, the value of a summable field matches the definition of a gauge. Users would just need to ask for the ... |
This issue has turned into a discussion about rates in general, so I've split out my original questions about just the "event rate" function into a separate issue: #77811 |
@exekias @AlonaNadler I think we are on the same page about counter-type numbers, as well as about "event rates". To summarize:
The remaining issue is gauge-type numbers, which I listed as case 4 above:
I think that I've already shown that CPU and memory don't make sense when converted into a "rate per second", but I've been looking for examples where it might make sense. I finally found one: Metricbeat has an ... As you can see, the value goes up and down over time. Here's what it looks like if I take the derivative of this value and scale it to 1 minute: Here's what it looks like with the same calculation, but split by the index that we're calculating from: All of these calculations will be possible in Lens by default, even if we don't offer a rate function which is able to do them. So do we need this at all? I am proposing that the only dedicated "rate" functions in Lens would be:
|
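To make the "scale it to 1 minute" step above concrete, here is a toy calculation (the numbers are made up, not taken from the screenshots):

```ts
// Scaling a per-bucket difference to a per-minute rate.
// With 30-second buckets, a change of +500 in one bucket is 1000 per minute.
const bucketSeconds = 30;
const deltaPerBucket = 500; // difference between two consecutive buckets
const perMinute = (deltaPerBucket / bucketSeconds) * 60; // => 1000
console.log(`${perMinute} per minute`);
```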
The kinds of rates discussed here are supported in Lens today. |
Edit: This discussion is not focused specifically on the "average event rate"; it's turned into a more general discussion of rates. Keeping the old discussion for posterity. The discussion about a generic rate function begins at this comment: #62375 (comment)
Average event rate lets users represent very small time intervals in their visualizations without building slow queries, and is common in timeseries use cases. For example, if the user wants to visualize the average sales per second, they can either build a date histogram with an interval of 1 second, or build a date histogram with a larger interval and then use an average event rate aggregation.
The definition of average event rate is: the average number of documents added, divided by the bucket time interval, then multiplied by a target interval expressed in milliseconds, seconds, hours, etc.
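As a toy illustration of that definition (the numbers are made up):

```ts
// Average event rate: documents in a bucket, divided by the bucket length,
// scaled to a target interval. With 7200 docs in a 1-hour bucket, the
// average event rate is 2 documents per second.
const docsInBucket = 7200;
const bucketMillis = 60 * 60 * 1000; // 1 hour
const targetMillis = 1000;           // per second
const eventsPerSecond = (docsInBucket / bucketMillis) * targetMillis; // => 2
console.log(`${eventsPerSecond} events per second`);
```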
This aggregation is already possible to do by using clever logic in TSVB, but we can simplify this and make it more widely available to handle time series use cases in other tools in Kibana.
For users, this metric will appear in two ways:
One of the important parts about this calculation is to handle timezones and leap seconds.