[Transform] add support for percentile aggregation #51663

Closed
hendrikmuhs opened this issue Jan 30, 2020 · 2 comments · Fixed by #51808

@hendrikmuhs

Add support for the percentiles aggregation to Transform.

Related: support for percentile ranks, stats, and extended_stats is tracked in separate issues.

Because a percentiles aggregation outputs multiple values, the transform will write the result as a nested object. The root name is configurable, but the inner names are derived from the configuration. For percentiles with decimal places (e.g. 99.9), the field name replaces the . with an _ so that it does not collide with nested objects. Renaming can be done using an ingest pipeline, e.g. to rename my_percentile.50 to my_median.
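A minimal sketch of the rename step mentioned above, written as a Python dict that serializes to an ingest pipeline body. It relies on the standard rename processor; the helper name rename_pipeline and the description text are made up for illustration.

```python
import json


def rename_pipeline(source_field: str, target_field: str) -> dict:
    """Build an ingest pipeline body with a single rename processor.

    The helper itself is hypothetical; only the processor layout
    ("rename" with "field" and "target_field") follows Elasticsearch.
    """
    return {
        "description": "rename a percentile field written by a transform",
        "processors": [
            {
                "rename": {
                    "field": source_field,
                    "target_field": target_field,
                }
            }
        ],
    }


pipeline = rename_pipeline("my_percentile.50", "my_median")
print(json.dumps(pipeline, indent=2))
```

The resulting body could be stored as an ingest pipeline and referenced from the transform destination so that documents are renamed as they are written.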

example configuration:

"my_percentile": {
  "percentiles": {
    "field": "bytes",
    "percents": [
      10,
      50,
      99.9
    ]
  }
}
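For context, here is a sketch of where this fragment could sit inside a complete transform body, built as a Python dict. The index names (my-source-index, my-dest-index) and the group_by field (clientip) are assumptions chosen for illustration; only the my_percentile aggregation comes from this issue.

```python
import json

# Aggregation fragment from the issue.
percentile_agg = {
    "my_percentile": {
        "percentiles": {
            "field": "bytes",
            "percents": [10, 50, 99.9],
        }
    }
}

# Hypothetical surrounding transform body: index names and the
# group_by field are placeholders, not taken from the issue.
transform_body = {
    "source": {"index": "my-source-index"},
    "dest": {"index": "my-dest-index"},
    "pivot": {
        "group_by": {
            "client": {"terms": {"field": "clientip"}},
        },
        "aggregations": percentile_agg,
    },
}

print(json.dumps(transform_body, indent=2))
```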

example output:

      "my_percentile" : {
        "99_9" : 9875.0,
        "50" : 5673.5,
        "10" : 604.6000000000001
      }
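The naming rule behind this output can be sketched as follows. This is an illustration under assumptions, not the actual Transform code: the helper names are hypothetical, and dropping the fractional part for whole-number percents is inferred from the example output ("50" rather than "50.0").

```python
def percentile_field_name(percent: float) -> str:
    """Derive the inner field name for one requested percent.

    Whole-number percents become plain integers ("50"); others
    have the decimal point replaced by an underscore ("99_9").
    """
    value = float(percent)
    if value.is_integer():
        return str(int(value))
    return repr(value).replace(".", "_")


def to_nested(root: str, results: dict) -> dict:
    """Wrap {percent: value} results in a nested object under `root`."""
    return {root: {percentile_field_name(p): v for p, v in results.items()}}


doc = to_nested("my_percentile", {10: 604.6, 50: 5673.5, 99.9: 9875.0})
print(doc)
```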

Values are mapped to double by default.

Alternative: Histogram

Elasticsearch 7.6 added a histogram data type. Storing histograms and calculating percentiles on top of them has various advantages.

Transform should support the histogram aggregation and write the result into a field of the histogram data type.

This alternative should be implemented in addition to percentile support. Especially for large use cases, storing histograms allows updating without full re-processing.
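A rough sketch of why stored histograms can be updated incrementally: two histograms can be merged by adding up counts, and percentiles can then be read from the merged result without going back to the raw documents. The values/counts layout mirrors the histogram data type, but the function names and the nearest-rank percentile rule are simplifying assumptions, not how Elasticsearch computes percentiles.

```python
from collections import Counter


def merge_histograms(a: dict, b: dict) -> dict:
    """Merge two {"values": [...], "counts": [...]} histograms."""
    merged = Counter()
    for hist in (a, b):
        for value, count in zip(hist["values"], hist["counts"]):
            merged[value] += count
    values = sorted(merged)
    return {"values": values, "counts": [merged[v] for v in values]}


def percentile(hist: dict, percent: float) -> float:
    """Nearest-rank percentile; expects values sorted ascending."""
    total = sum(hist["counts"])
    if total == 0:
        raise ValueError("empty histogram")
    threshold = percent / 100.0 * total
    cumulative = 0
    for value, count in zip(hist["values"], hist["counts"]):
        cumulative += count
        if cumulative >= threshold:
            return value
    return hist["values"][-1]


# Previously stored histogram plus a histogram of newly arrived data.
stored = {"values": [100.0, 200.0, 300.0], "counts": [5, 3, 2]}
update = {"values": [200.0, 400.0], "counts": [2, 8]}

combined = merge_histograms(stored, update)
print(combined, percentile(combined, 50))
```

Only the stored histogram and the new data are needed for the update, which is the property that makes this approach attractive for large use cases.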

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml/Transform)

@hendrikmuhs
Author

/CC @pkobziak - you might want to subscribe to this issue

@hendrikmuhs hendrikmuhs self-assigned this Jan 31, 2020
hendrikmuhs pushed a commit that referenced this issue Feb 4, 2020
make transform ready for multi value aggregations and add support for percentile

fixes #51663