Normalization pipeline aggregations #51005

polyfractal · 2020-01-14T21:05:11Z

This proposal is for one (or several) pipeline aggs that can perform normalization of the metrics. For example, given the series of data:

[5, 5, 10, 50, 10, 20]

A user might want to normalize those in different ways:

Rescale [-1, 1]
- [-1, -1, -0.78, 1, -0.78, -0.33]
Rescale [0, 100]
- [0, 0, 11.11, 100, 11.11, 33.33]
Percentage of sum [0, 100%]
- [5%, 5%, 10%, 50%, 10%, 20%]
Mean normalization
- [-0.26, -0.26, -0.15, 0.74, -0.15, 0.07]
Z-score normalization (mean of zero, stdev of 1)
- [-0.68, -0.68, -0.39, 1.94, -0.39, 0.19]
Softmax (0-1 range, sum to 1, larger values have more weight)
- [2.862E-20, 2.862E-20, 4.248E-18, 0.999, 4.248E-18, 9.357E-14]

etc etc
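As a rough illustration of the normalizations listed above, here is a minimal Python sketch applied to the same example series. The function names are made up for this sketch and are not part of any Elasticsearch API. It assumes mean normalization is `(x - mean) / (max - min)` and that z-score uses the sample standard deviation (n - 1), which is what reproduces the z-score figures above.

```python
import math
import statistics


def rescale(values, lo=0.0, hi=1.0):
    """Min-max rescale into [lo, hi]."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        # Assumption: a constant series maps to the lower bound.
        return [lo for _ in values]
    return [lo + (v - vmin) / span * (hi - lo) for v in values]


def percent_of_sum(values):
    """Each value as a percentage of the series total."""
    total = sum(values)
    return [100.0 * v / total for v in values]


def mean_normalize(values):
    """(x - mean) / (max - min)."""
    mean = statistics.mean(values)
    span = max(values) - min(values)
    return [(v - mean) / span for v in values]


def z_score(values):
    """(x - mean) / stdev, using the sample standard deviation."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [(v - mean) / stdev for v in values]


def softmax(values):
    """exp(x) / sum(exp(x)); the max is subtracted first for numerical stability."""
    vmax = max(values)
    exps = [math.exp(v - vmax) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


data = [5, 5, 10, 50, 10, 20]
for fn in (percent_of_sum, mean_normalize, z_score, softmax):
    print(fn.__name__, fn(data))
print("rescale [-1, 1]", rescale(data, -1, 1))
print("rescale [0, 100]", rescale(data, 0, 100))
```

Subtracting the maximum inside `softmax` does not change the result, it only avoids overflow for large bucket values.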

The two obvious use-cases are rescaling values to a [0, 1] range to make it easier to compare relative magnitudes, and normalizing to percentage of the sum for percentage charts.

More advanced functions like z-score are useful for their statistical properties, softmax can handle negative numbers nicely, etc. But I'm not sure how useful they would be in practice, since this is operating over bucket values and not raw values (which is where normalization/centering/standardizing typically has value).

In any case, a pipeline agg could accept the values from a multi-bucket agg (like a date_histo) and perform the normalization to produce a new set of metrics. Unsure how the syntax would look. If it was a single-purpose agg (percentage_of_sum) it's easy. But if we want to build a multi-function agg, we either need a selectable function or something like MovingFunction where the user specifies a script (with helper methods).
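To make the "selectable function" option concrete, here is a small Python sketch of the idea, not of how Elasticsearch implements it. The bucket layout, the `NORMALIZERS` registry, and `normalize_buckets` are all hypothetical names chosen for illustration: the caller picks a normalizer by name, and the function reads one metric from every bucket and attaches the normalized value as a new metric.

```python
from typing import Callable, Dict, List

Normalizer = Callable[[List[float]], List[float]]


def percent_of_sum(values: List[float]) -> List[float]:
    total = sum(values)
    return [100.0 * v / total for v in values]


def rescale_0_1(values: List[float]) -> List[float]:
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    # Assumption: a constant series maps to 0.0.
    return [0.0 if span == 0 else (v - vmin) / span for v in values]


# Hypothetical registry: normalizer name -> function over the whole series.
NORMALIZERS: Dict[str, Normalizer] = {
    "percent_of_sum": percent_of_sum,
    "rescale_0_1": rescale_0_1,
}


def normalize_buckets(buckets, metric, normalizer, out_key="normalized"):
    """Return copies of the buckets with a normalized metric added."""
    if normalizer not in NORMALIZERS:
        raise ValueError(f"unknown normalizer: {normalizer!r}")
    values = [bucket[metric] for bucket in buckets]
    normalized = NORMALIZERS[normalizer](values)
    return [{**bucket, out_key: n} for bucket, n in zip(buckets, normalized)]


# Buckets shaped loosely like date histogram output (keys are made up).
buckets = [
    {"key": "2020-01", "sales": 5},
    {"key": "2020-02", "sales": 5},
    {"key": "2020-03", "sales": 10},
    {"key": "2020-04", "sales": 50},
    {"key": "2020-05", "sales": 10},
    {"key": "2020-06", "sales": 20},
]
for bucket in normalize_buckets(buckets, "sales", "percent_of_sum"):
    print(bucket["key"], bucket["normalized"])
```

A named registry keeps the request syntax simple, while the script-based alternative would trade that simplicity for arbitrary user-defined functions.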

elasticmachine · 2020-01-14T21:05:14Z

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

This aggregation normalizes metrics for a given series of data in the form of bucket values. It supports the following normalizations:

- rescale 0-1
- rescale 0-100
- percentage of sum
- mean normalization
- z-score normalization
- softmax normalization

The normalization to use is specified in the normalize agg's `normalizer` field. For example:

```
{
  "normalize": {
    "buckets_path": <>,
    "normalizer": "percent"
  }
}
```

Closes elastic#51005.
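For context, here is a hedged Python sketch of how the `normalize` snippet from this description might sit inside a full search request body. It keeps the `normalizer` field and `"percent"` value exactly as written in the description; the parameter names in a released version may differ. The helper name, the agg names `per_month`, `total`, and `normalized`, and the assumption that `normalize` is nested under the histogram with `buckets_path` pointing at a sibling metric are all illustrative, not taken from this thread.

```python
import json


def build_normalize_request(date_field, metric_field, normalizer):
    """Build a request body dict; nothing is sent to a cluster here."""
    return {
        "size": 0,  # only aggregation results are wanted, no hits
        "aggs": {
            "per_month": {
                "date_histogram": {
                    "field": date_field,
                    "calendar_interval": "month",
                },
                "aggs": {
                    # Metric computed per bucket.
                    "total": {"sum": {"field": metric_field}},
                    # Assumed placement: the pipeline agg reads "total"
                    # from every bucket and emits a normalized value.
                    "normalized": {
                        "normalize": {
                            "buckets_path": "total",
                            "normalizer": normalizer,
                        }
                    },
                },
            }
        },
    }


# "date" and "price" are hypothetical field names.
body = build_normalize_request("date", "price", "percent")
print(json.dumps(body, indent=2))
```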

polyfractal added >feature :Analytics/Aggregations Aggregations labels Jan 14, 2020

polyfractal added the team-discuss label Jan 15, 2020

wylieconlon mentioned this issue Feb 24, 2020

[Lens] Stacked to 100% option elastic/kibana#57389

Closed

nik9000 removed the team-discuss label Mar 2, 2020

polyfractal added the Top Ask label Apr 9, 2020

talevy self-assigned this Apr 30, 2020

rjernst added the Team:Analytics label May 4, 2020

talevy mentioned this issue May 8, 2020

Add Normalize Pipeline Aggregation #56399

Merged

talevy closed this as completed in #56399 May 14, 2020

russcam mentioned this issue Jul 23, 2020

7.9.0 Meta ticket elastic/elasticsearch-net#4872

Closed

29 tasks

ChrisHegarty unassigned talevy Oct 29, 2024
