Normalization pipeline aggregations #51005
Labels
:Analytics/Aggregations
Aggregations
>feature
Team:Analytics
Meta label for analytical engine team (ESQL/Aggs/Geo)
Top Ask
Comments
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)
rjernst added the Team:Analytics label on May 4, 2020
talevy added a commit to talevy/elasticsearch that referenced this issue on May 8, 2020
This aggregation will perform normalizations of metrics for a given series of data in the form of bucket values.

The aggregation supports the following normalizations:

- rescale 0-1
- rescale 0-100
- percentage of sum
- mean normalization
- z-score normalization
- softmax normalization

To select a normalization, set the normalize agg's `normalizer` field. For example:

```
{
  "normalize": {
    "buckets_path": <>,
    "normalizer": "percent"
  }
}
```

Closes elastic#51005.
talevy added a commit that referenced this issue on May 14, 2020
talevy added a commit to talevy/elasticsearch that referenced this issue on May 14, 2020
This proposal is for one (or several) pipeline aggs that can perform normalization of the metrics. For example, given the series of data:
A user might want to normalize those in different ways:
[-1, -1, -0.77, 1, -0.77, -0.33]
[0, 0, 11.11, 100, 11.11, 33.33]
[5%, 5%, 10%, 50%, 10%, 20%]
[4.63, 4.63, 9.63, 49.63, 9.63, 9.63, 19.63]
[-0.68, -0.68, -0.39, 1.94, -0.39, 0.19]
[2.862E-20, 2.862E-20, 4.248E-18, 0.999, 9.357E-14, 4.248E-18]
etc etc
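A minimal Python sketch of a few of these normalizations may help make the arithmetic concrete. The input series is not shown above, so the sketch assumes `[5, 5, 10, 50, 10, 20]`, which is consistent with the percentages listed. The function names are illustrative only, and the z-score here uses the sample standard deviation, which is an assumption about how those values were computed.

```python
import math
from statistics import mean, stdev

# Assumed input series: not given in the text, inferred from the
# percent-of-sum values [5%, 5%, 10%, 50%, 10%, 20%].
values = [5.0, 5.0, 10.0, 50.0, 10.0, 20.0]


def rescale(xs, lo=0.0, hi=1.0):
    """Linearly map xs so that min(xs) -> lo and max(xs) -> hi."""
    x_min, x_max = min(xs), max(xs)
    span = x_max - x_min
    if span == 0:
        raise ValueError("cannot rescale a constant series")
    return [lo + (x - x_min) * (hi - lo) / span for x in xs]


def percent_of_sum(xs):
    """Express each value as a percentage of the series total."""
    total = sum(xs)
    return [100.0 * x / total for x in xs]


def z_score(xs):
    """Center on the mean and divide by the sample standard deviation."""
    mu, sigma = mean(xs), stdev(xs)
    return [(x - mu) / sigma for x in xs]


def softmax(xs):
    """Exponentiate and normalize; subtracting the max avoids overflow."""
    shift = max(xs)
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print("rescale 0-1:   ", rescale(values))
    print("rescale 0-100: ", rescale(values, 0.0, 100.0))
    print("percent of sum:", percent_of_sum(values))
    print("z-score:       ", z_score(values))
    print("softmax:       ", softmax(values))
```

Because softmax exponentiates, one large bucket dominates the output almost entirely, which is part of why its usefulness on bucket values is open to question.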
The two obvious use-cases are rescaling values to a `[0, 1]` range to make it easier to compare relative magnitudes, and normalizing to percentage of the sum for percentage charts. More advanced functions like z-score are useful for their statistical properties, softmax can handle negative numbers nicely, etc. But I'm not sure how useful they would be in practice, since this operates over bucket values and not raw values (which is where normalization/centering/standardizing typically has value).
In any case, a pipeline agg could accept the values from a multi-bucket agg (like a date_histo) and perform the normalization to produce a new set of metrics. I'm unsure how the syntax would look. If it was a single-purpose agg (`percentage_of_sum`) it's easy. But if we want to build a multi-function agg that can perform multiple functions, we either need a selectable function or something like MovingFunction where the user specifies a script (with helper methods).
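The "selectable function" option could look roughly like the Python sketch below: a registry maps a name to a normalization function, and the agg looks it up by the name the user supplies. This is only an illustration of the design choice, not how Elasticsearch implements it. `NORMALIZERS`, `normalize`, and the `rescale_0_1` key are hypothetical names; `percent` is borrowed from the commit message example.

```python
from typing import Callable, Dict, List

# A normalizer takes the bucket values of one series and returns a new series.
Normalizer = Callable[[List[float]], List[float]]


def percent_of_sum(xs: List[float]) -> List[float]:
    """Express each bucket value as a percentage of the series total."""
    total = sum(xs)
    return [100.0 * x / total for x in xs]


def rescale_0_1(xs: List[float]) -> List[float]:
    """Linearly map the series onto the [0, 1] range."""
    x_min, x_max = min(xs), max(xs)
    return [(x - x_min) / (x_max - x_min) for x in xs]


# Hypothetical registry: the keys play the role of the selectable function name.
NORMALIZERS: Dict[str, Normalizer] = {
    "percent": percent_of_sum,
    "rescale_0_1": rescale_0_1,
}


def normalize(bucket_values: List[float], normalizer: str) -> List[float]:
    """Apply the named normalizer to the bucket values of a parent agg."""
    try:
        fn = NORMALIZERS[normalizer]
    except KeyError:
        known = ", ".join(sorted(NORMALIZERS))
        raise ValueError(
            f"unknown normalizer [{normalizer}], expected one of: {known}"
        ) from None
    return fn(bucket_values)
```

A fixed registry keeps the request syntax small and easy to validate, while the script-based alternative would trade that for flexibility.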