Fix spelling and clarify docs (influxdata#8164)
mirath authored Dec 23, 2020
1 parent ea4feb1 commit 841e971
Showing 1 changed file with 26 additions and 30 deletions.
56 changes: 26 additions & 30 deletions plugins/processors/topk/README.md
@@ -1,16 +1,18 @@
# TopK Processor Plugin

-The TopK processor plugin is a filter designed to get the top series over a period of time. It can be tweaked to do its top k computation over a period of time, so spikes can be smoothed out.
+The TopK processor plugin is a filter designed to get the top series over a period of time. It can be tweaked to calculate the top metrics via different aggregation functions.

This processor goes through these steps when processing a batch of metrics:

-1. Groups metrics in buckets using their tags and name as key
-2. Aggregates each of the selected fields for each bucket by the selected aggregation function (sum, mean, etc)
-3. Orders the buckets by one of the generated aggregations, returns all metrics in the top `K` buckets, then reorders the buckets by the next of the generated aggregations, returns all metrics in the top `K` buckets, etc, etc, etc, until it runs out of fields.
+1. Groups measurements in buckets based on their tags and name
+2. Every N seconds, for each bucket and each selected field: aggregates all the measurements using the given aggregation function (min, sum, mean, etc.) and the field
+3. For each computed aggregation: orders the buckets by the aggregation, then returns all measurements in the top `K` buckets

-The plugin makes sure not to duplicate metrics
-
-Note that depending on the amount of metrics on each computed bucket, more than `K` metrics may be returned
+Notes:
+* The plugin deduplicates metrics
+* The name of the measurement is always used when grouping it
+* Depending on the number of metrics in each bucket, more than `K` series may be returned
+* If a measurement does not have one of the selected fields, it is dropped from the aggregation
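
The three steps listed above can be sketched in a few lines of Python. This is an illustrative approximation, not the plugin's actual implementation: the `(name, tags, fields)` tuple shape, the `top_k` function, and its parameters are all hypothetical, only a single field is handled, and the periodic "every N seconds" scheduling is left out.

```python
from collections import defaultdict
from statistics import mean

# Aggregation functions named in the README; the dict itself is illustrative.
AGGREGATIONS = {"sum": sum, "mean": mean, "min": min, "max": max}


def top_k(metrics, field, k, aggregation="mean", bottomk=False):
    """Return every metric belonging to one of the top k buckets for `field`.

    `metrics` is a list of hypothetical (name, tags, fields) tuples.
    """
    buckets = defaultdict(list)
    for name, tags, fields in metrics:
        # Assumption: a metric without the selected field is dropped.
        if field not in fields:
            continue
        # Step 1: the bucket key is the measurement name plus its tags.
        key = (name, tuple(sorted(tags.items())))
        buckets[key].append((name, tags, fields))

    # Step 2: aggregate the selected field within each bucket.
    aggregate = AGGREGATIONS[aggregation]
    scores = {
        key: aggregate([m[2][field] for m in members])
        for key, members in buckets.items()
    }

    # Step 3: order the buckets by their aggregate and keep the first k.
    ranked = sorted(scores, key=scores.get, reverse=not bottomk)
    return [m for key in ranked[:k] for m in buckets[key]]


sample = [
    ("cpu", {"host": "a"}, {"value": 10}),
    ("cpu", {"host": "a"}, {"value": 20}),
    ("cpu", {"host": "b"}, {"value": 5}),
    ("cpu", {"host": "c"}, {"value": 30}),
]
result = top_k(sample, "value", k=2)
```

With `k=2` the sample yields three metrics rather than two, because the second-ranked bucket (`host=a`) holds two of them. This is the "more than `K`" case mentioned in the notes.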

### Configuration:

@@ -19,46 +21,40 @@ Note that depending on the amount of metrics on each computed bucket, more than
## How many seconds between aggregations
# period = 10

-## How many top metrics to return
+## How many top buckets to return
# k = 10

-## Over which tags should the aggregation be done. Globs can be specified, in
-## which case any tag matching the glob will aggregated over. If set to an
-## empty list is no aggregation over tags is done
+## Based on which tags should the buckets be computed. Globs can be specified.
+## If set to an empty list, tags are not considered when creating the buckets
# group_by = ['*']

-## Over which fields are the top k are calculated
+## Over which fields is the aggregation done
# fields = ["value"]

-## What aggregation to use. Options: sum, mean, min, max
+## What aggregation function to use. Options: sum, mean, min, max
# aggregation = "mean"

-## Instead of the top k largest metrics, return the bottom k lowest metrics
+## Instead of the top k buckets, return the bottom k buckets
# bottomk = false

-## The plugin assigns each metric a GroupBy tag generated from its name and
-## tags. If this setting is different than "" the plugin will add a
-## tag (which name will be the value of this setting) to each metric with
-## the value of the calculated GroupBy tag. Useful for debugging
+## This setting provides a way to know which metrics were grouped together.
+## Add a tag (whose name will be the value of this setting) to each metric.
+## The value will be the tags used to pick its bucket.
# add_groupby_tag = ""

-## These settings provide a way to know the position of each metric in
-## the top k. The 'add_rank_field' setting allows to specify for which
-## fields the position is required. If the list is non empty, then a field
-## will be added to each and every metric for each string present in this
-## setting. This field will contain the ranking of the group that
-## the metric belonged to when aggregated over that field.
+## This setting provides a way to know the position of each metric's bucket in the top k.
+## If the list is non-empty, a field will be added to each and every metric
+## for each string present in this setting. This field will contain the ranking
+## of the bucket that the metric belonged to when aggregated over that field.
## The name of the field will be set to the name of the aggregation field,
## suffixed with the string '_topk_rank'
# add_rank_fields = []

## These settings provide a way to know what values the plugin is generating
-## when aggregating metrics. The 'add_aggregate_field' setting allows to
-## specify for which fields the final aggregation value is required. If the
-## list is non empty, then a field will be added to each every metric for
-## each field present in this setting. This field will contain
-## the computed aggregation for the group that the metric belonged to when
-## aggregated over that field.
+## when aggregating metrics. If the list is non-empty, then a field will be
+## added to each and every metric for each field present in this setting.
+## This field will contain the computed aggregation for the bucket that the
+## metric belonged to when aggregated over that field.
## The name of the field will be set to the name of the aggregation field,
## suffixed with the string '_topk_aggregate'
# add_aggregate_fields = []
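
As a rough illustration of the `group_by`, `add_rank_fields`, and `add_aggregate_fields` settings described in the comments above, the following Python sketch shows one way the glob filtering and the `_topk_rank` / `_topk_aggregate` suffixes could behave. The helper names `bucket_tags` and `annotate` are hypothetical. Matching globs against tag keys with Python's `fnmatch` and starting ranks at 1 are both assumptions; the plugin's own glob semantics and rank numbering may differ.

```python
from fnmatch import fnmatchcase


def bucket_tags(tags, group_by=("*",)):
    """Keep only the tags whose key matches one of the group_by globs.

    Assumption: globs are matched against tag keys. An empty group_by
    keeps no tags, so only the measurement name distinguishes buckets.
    """
    return {
        key: value
        for key, value in tags.items()
        if any(fnmatchcase(key, glob) for glob in group_by)
    }


def annotate(fields, ranks, aggregates,
             add_rank_fields=(), add_aggregate_fields=()):
    """Return a copy of `fields` with the optional rank/aggregate fields added.

    `ranks` and `aggregates` map an aggregation field name to the rank and
    aggregate value of the bucket the metric belonged to (hypothetical inputs).
    """
    out = dict(fields)
    for name in add_rank_fields:
        if name in ranks:
            out[name + "_topk_rank"] = ranks[name]
    for name in add_aggregate_fields:
        if name in aggregates:
            out[name + "_topk_aggregate"] = aggregates[name]
    return out


kept = bucket_tags({"host": "a", "region": "eu"}, group_by=["ho*"])
annotated = annotate(
    {"value": 10},
    ranks={"value": 1},            # assumed 1-based rank
    aggregates={"value": 15.0},
    add_rank_fields=["value"],
    add_aggregate_fields=["value"],
)
```

Here `kept` contains only the `host` tag, and `annotated` carries the original `value` field plus `value_topk_rank` and `value_topk_aggregate`.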