diff --git a/plugins/processors/topk/README.md b/plugins/processors/topk/README.md
index 308d4f9f85f05..cfcb0b2176d38 100644
--- a/plugins/processors/topk/README.md
+++ b/plugins/processors/topk/README.md
@@ -1,16 +1,18 @@
 # TopK Processor Plugin
 
-The TopK processor plugin is a filter designed to get the top series over a period of time. It can be tweaked to do its top k computation over a period of time, so spikes can be smoothed out.
+The TopK processor plugin is a filter designed to get the top series over a period of time. It can be tweaked to calculate the top metrics via different aggregation functions.
 
 This processor goes through these steps when processing a batch of metrics:
 
- 1. Groups metrics in buckets using their tags and name as key
- 2. Aggregates each of the selected fields for each bucket by the selected aggregation function (sum, mean, etc)
- 3. Orders the buckets by one of the generated aggregations, returns all metrics in the top `K` buckets, then reorders the buckets by the next of the generated aggregations, returns all metrics in the top `K` buckets, etc, etc, etc, until it runs out of fields.
+ 1. Groups measurements in buckets based on their tags and name
+ 2. Every `period` seconds, for each bucket and each selected field: aggregates that field over all the measurements in the bucket using the selected aggregation function (sum, mean, min, max)
+ 3. For each computed aggregation: orders the buckets by the aggregation, then returns all measurements in the top `K` buckets
 
-The plugin makes sure not to duplicate metrics
-
-Note that depending on the amount of metrics on each computed bucket, more than `K` metrics may be returned
+Notes:
+ * The plugin deduplicates metrics
+ * The name of the measurement is always used when grouping it
+ * Depending on the number of metrics in each bucket, more than `K` series may be returned
+ * If a measurement does not have one of the selected fields, it is dropped from the aggregation
 
 ### Configuration:
 
@@ -19,46 +21,40 @@ Note that depending on the amount of metrics on each computed bucket, more than
  ## How many seconds between aggregations
  # period = 10
 
- ## How many top metrics to return
+ ## How many top buckets to return
  # k = 10
 
- ## Over which tags should the aggregation be done. Globs can be specified, in
- ## which case any tag matching the glob will aggregated over. If set to an
- ## empty list is no aggregation over tags is done
+ ## Which tags the buckets are computed from. Globs can be specified.
+ ## If set to an empty list, tags are not considered when creating the buckets
  # group_by = ['*']
 
- ## Over which fields are the top k are calculated
+ ## Over which fields the aggregation is done
  # fields = ["value"]
 
- ## What aggregation to use. Options: sum, mean, min, max
+ ## What aggregation function to use. Options: sum, mean, min, max
  # aggregation = "mean"
 
- ## Instead of the top k largest metrics, return the bottom k lowest metrics
+ ## Instead of the top k buckets, return the bottom k buckets
  # bottomk = false
 
- ## The plugin assigns each metric a GroupBy tag generated from its name and
- ## tags. If this setting is different than "" the plugin will add a
- ## tag (which name will be the value of this setting) to each metric with
- ## the value of the calculated GroupBy tag. Useful for debugging
+ ## This setting provides a way to know which metrics were grouped together.
+ ## Adds a tag (whose name is the value of this setting) to each metric.
+ ## The value will be the tags used to pick its bucket.
  # add_groupby_tag = ""
 
- ## These settings provide a way to know the position of each metric in
- ## the top k. The 'add_rank_field' setting allows to specify for which
- ## fields the position is required. If the list is non empty, then a field
- ## will be added to each and every metric for each string present in this
- ## setting. This field will contain the ranking of the group that
- ## the metric belonged to when aggregated over that field.
+ ## This setting provides a way to know the position of each metric's bucket in the top k.
+ ## If the list is non-empty, a field will be added to each and every metric
+ ## for each string present in this setting. This field will contain the ranking
+ ## of the bucket that the metric belonged to when aggregated over that field.
  ## The name of the field will be set to the name of the aggregation field,
  ## suffixed with the string '_topk_rank'
  # add_rank_fields = []
 
  ## These settings provide a way to know what values the plugin is generating
- ## when aggregating metrics. The 'add_aggregate_field' setting allows to
- ## specify for which fields the final aggregation value is required. If the
- ## list is non empty, then a field will be added to each every metric for
- ## each field present in this setting. This field will contain
- ## the computed aggregation for the group that the metric belonged to when
- ## aggregated over that field.
+ ## when aggregating metrics. If the list is non-empty, then a field will be
+ ## added to each and every metric for each field present in this setting.
+ ## This field will contain the computed aggregation for the bucket that the
+ ## metric belonged to when aggregated over that field.
  ## The name of the field will be set to the name of the aggregation field,
  ## suffixed with the string '_topk_aggregate'
  # add_aggregate_fields = []
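
For reviewers, the three processing steps described in the README text can be sketched roughly as follows. This is an illustrative Python model only, not the plugin's Go implementation: the `top_k` function, the `AGGREGATORS` table, and the dict-based metric shape are all hypothetical, the sketch always groups by every tag (as with `group_by = ['*']`), and it ignores `period` by treating its input as a single batch. Returning every member of a winning bucket, including metrics that lack the ranked field, is also an assumption.

```python
from collections import defaultdict

# Hypothetical stand-ins for the aggregation options named in the config.
AGGREGATORS = {
    "sum": sum,
    "mean": lambda xs: sum(xs) / len(xs),
    "min": min,
    "max": max,
}


def top_k(metrics, k=10, fields=("value",), aggregation="mean", bottomk=False):
    """Return the metrics that belong to a top-k bucket for any selected field.

    Each metric is a dict: {"name": str, "tags": dict, "fields": dict}.
    """
    aggregate = AGGREGATORS[aggregation]

    # Step 1: bucket metrics by measurement name plus all of their tags.
    buckets = defaultdict(list)
    for m in metrics:
        key = (m["name"], tuple(sorted(m["tags"].items())))
        buckets[key].append(m)

    selected = []  # output, in the order buckets were picked
    seen = set()   # ids of metrics already returned (deduplication)
    for field in fields:
        # Step 2: aggregate this field per bucket. Metrics that lack the
        # field are left out of the aggregation.
        scores = {}
        for key, members in buckets.items():
            values = [m["fields"][field] for m in members if field in m["fields"]]
            if values:
                scores[key] = aggregate(values)

        # Step 3: rank the buckets and keep every metric of the best k.
        ranked = sorted(scores, key=scores.get, reverse=not bottomk)
        for key in ranked[:k]:
            for m in buckets[key]:
                if id(m) not in seen:
                    seen.add(id(m))
                    selected.append(m)
    return selected
```

With three hosts where host `a` reports two metrics, asking for `k=2` returns three metrics, which matches the note that more than `K` series may come back when a bucket holds several metrics.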