[Rollup] Support for data-structure based metrics (Cardinality, Percentiles, etc) #33214
Comments
Pinging @elastic/es-search-aggs
Small update.
Relates: #24468
Hi @polyfractal, do you know when this is slated to go into production?
Hi @painslie, I'm afraid I do not have an update. We'll update this issue when there's more information, or link to it from a PR.
@polyfractal I'm curious how well the Prometheus histogram would line up with what you're thinking?
HDRHistogram is essentially just a clever layout of different-sized intervals: a set of exponentially-sized intervals, with a fixed number of linear intervals inside each exponential "level". But at its heart, it's still just a histogram of counts like Prometheus histos (and unlike algos like TDigest, which use weighted centroids, etc). So it should be possible to translate a Prometheus histogram into an HDRHisto. Prometheus histos have user-definable intervals, which means the accuracy of the translation will depend on how nicely the Prometheus intervals line up with the HDRHisto intervals. I think any Prometheus histo should be convertible, and the accuracy of that conversion depends on the exact layout.
Prometheus Summaries are an implementation of Targeted Quantiles and will be much harder to use. The output of a summary is just a percentile estimation at that point in time, which is mostly useless to us. It might be possible to convert the underlying Targeted Quantiles sketch into a TDigest since the algos share some similarities, but I suspect it won't give great accuracy. I've been told summaries aren't as common as histos either, so probably not a priority.
With all that said, it's still not entirely clear how a user would convert a Prometheus (or any other system's) histogram output into our data structure. I'm kinda thinking an ingest processor would make the most sense, slurping up a Prometheus histo and emitting a compatible HDRHisto field. But I haven't spent a lot of time thinking about the ergonomics of that yet. :)
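A rough sketch of that translation, assuming the org.HdrHistogram Java library and Prometheus-style cumulative bucket counts; the bucket boundaries, microsecond scaling, and precision setting below are illustrative, not a committed design:

```java
import org.HdrHistogram.Histogram;

public class PrometheusToHdr {
    /**
     * Translates Prometheus-style cumulative histogram buckets into an HdrHistogram.
     * upperBounds are the bucket "le" boundaries (seconds, scaled to microseconds here);
     * cumulativeCounts are the matching cumulative counts. Both are assumed sorted and
     * to exclude the +Inf bucket.
     */
    static Histogram fromPrometheusBuckets(double[] upperBounds, long[] cumulativeCounts) {
        Histogram hdr = new Histogram(3); // 3 significant digits; precision is configurable
        long previousCumulative = 0;
        for (int i = 0; i < upperBounds.length; i++) {
            long countInBucket = cumulativeCounts[i] - previousCumulative;
            previousCumulative = cumulativeCounts[i];
            if (countInBucket <= 0) {
                continue;
            }
            // Record the bucket's upper bound (as whole microseconds) for every observation
            // in the bucket. Accuracy depends on how well the Prometheus boundaries line up
            // with HdrHistogram's internal intervals.
            long representativeValue = Math.max(1L, (long) (upperBounds[i] * 1_000_000));
            hdr.recordValueWithCount(representativeValue, countInBucket);
        }
        return hdr;
    }

    public static void main(String[] args) {
        double[] le = {0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0};
        long[] cumulative = {24, 33, 100, 250, 400, 480, 490, 500};
        Histogram hdr = fromPrometheusBuckets(le, cumulative);
        System.out.println("p95 (µs): " + hdr.getValueAtPercentile(95.0));
    }
}
```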
Hi @polyfractal, is there any ticket for adding weighted average support in pack rollups?
@polyfractal A quick update here. @kbourgoin and I have implemented a custom field type for serialized HLL rollups in the ES index, along with a corresponding aggregation query that works much like the existing `cardinality` aggregation.
Excellent. We have had some discussions on our end as well on what the API and implementation could look like for a histogram field for percentile aggregations and an HLL++ field for cardinality aggregations. I suspect both impls will end up looking similar. :) cc @iverase
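As a rough illustration of how such a field and aggregation could fit together (not how the commenters above actually implemented it), here is a sketch using the Apache DataSketches HLL library; the sketch parameters and serialization choices are assumptions:

```java
import java.util.List;
import org.apache.datasketches.hll.HllSketch;
import org.apache.datasketches.hll.TgtHllType;
import org.apache.datasketches.hll.Union;

public class RollupCardinalitySketch {
    // Rollup time: build one sketch per rollup bucket from the raw values.
    static byte[] buildSketch(List<String> rawValues) {
        HllSketch sketch = new HllSketch(14, TgtHllType.HLL_4); // lgK = 14
        for (String value : rawValues) {
            sketch.update(value);
        }
        // The serialized bytes are what a "cardinality sketch" field would store per rollup doc.
        return sketch.toCompactByteArray();
    }

    // Query time: merge the serialized sketches from the matching rollup documents
    // and report the estimated distinct count, much like the cardinality aggregation does.
    static double estimateCardinality(List<byte[]> serializedSketches) {
        Union union = new Union(14);
        for (byte[] bytes : serializedSketches) {
            union.update(HllSketch.heapify(bytes));
        }
        return union.getEstimate();
    }
}
```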
We plan to build this support in downsampling. Support for histograms in downsampling is pending: the design is in place and ready to be prioritized as soon as we have availability.
With the 8.7 release of Elasticsearch, we have made a new downsampling capability, associated with the new time series data streams functionality, generally available (GA). This capability had been in tech preview in ILM since 8.5. Downsampling provides a method to reduce the footprint of your time series data by storing it at reduced granularity. The downsampling process rolls up documents within a fixed time interval into a single summary document. Each summary document includes statistical representations of the original data: the min, max, sum, value_count, and average for each metric. Data stream time series dimensions are stored unchanged. Downsampling is superior to rollup in several respects.
Because of the introduction of this new capability, we are deprecating the rollup functionality, which never left Tech Preview/Experimental status, in favor of downsampling, and thus we are closing this issue. We encourage you to migrate your solution to downsampling and take advantage of the new TSDB functionality.
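For intuition only, a minimal sketch of what a downsampled summary computes per metric and time bucket; this is not the actual downsampling implementation, and the class and field names are made up:

```java
import java.util.List;

public class DownsampleSummary {
    // One summary per metric per fixed time interval, mirroring the statistical
    // representations listed above: min, max, sum, value_count, and average.
    final double min;
    final double max;
    final double sum;
    final long valueCount;

    DownsampleSummary(List<Double> rawValuesInInterval) {
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY, sum = 0;
        for (double v : rawValuesInInterval) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
        }
        this.min = min;
        this.max = max;
        this.sum = sum;
        this.valueCount = rawValuesInInterval.size();
    }

    double average() {
        return valueCount == 0 ? Double.NaN : sum / valueCount;
    }
}
```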
@wchaparro the new downsampling feature looks great, but it still doesn't support percentiles. Downsampling a fixed set of percentiles, such as the median, 75th, 90th, 95th, and 99th, is a very common use case for reporting latencies, so I bet a lot of Elasticsearch users could benefit from having percentiles in the downsample feature.
We would like to support more complex metrics in Rollup such as cardinality, percentiles and percentile ranks. These are trickier since they are calculated from data sketches rather than simple numerics.
They also introduce issues with backwards compatibility. If the algorithm powering the sketch changes in the future (improvements, bug-fixes, etc) we will likely have to continue supporting the old versions of the algorithm. It's unlikely that these sketches will be "upgradable" to the new version since they are lossy by nature.
I see two approaches to implementing these types of metrics:
New data types
In the first approach, we implement new data types in the Rollup plugin. Similar to the hash, geo or completion data types, these would expect input data to adhere to some kind of complex format. Internally it would be stored as a compressed representation that could be used to build the sketch (e.g. a `long[]` which could be used to build an HLL sketch).
The pros are strong validation and making it easier for aggregations to work with the data. Another large positive is that it allows external clients to provide pre-built sketches as long as they follow the correct format. For example, edge nodes may be collecting and aggregating data locally and just want to send the sketch.
The cons are considerably more work implementing the data types. It may also not be ideal to expose these data structures outside Rollup, since they carry the aforementioned BWC baggage.
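As a rough illustration of the "external clients provide pre-built sketches" point, a sketch of an edge node pre-aggregating locally with Apache DataSketches and shipping only the serialized sketch; the document field names and Base64 encoding are assumptions, not a proposed wire format:

```java
import java.util.Base64;
import java.util.Map;
import org.apache.datasketches.hll.HllSketch;

public class EdgeNodeRollupClient {
    // An edge node aggregates locally and sends only the pre-built sketch,
    // rather than the raw values, to the rollup index.
    public static void main(String[] args) {
        HllSketch userIds = new HllSketch(12);
        for (String userId : new String[] {"u1", "u2", "u3", "u2"}) {
            userIds.update(userId);
        }

        // Hypothetical document for a rollup index with a dedicated sketch data type.
        Map<String, Object> rollupDoc = Map.of(
            "timestamp.date_histogram.timestamp", "2018-08-28T00:00:00Z",
            "user_id.cardinality._sketch", Base64.getEncoder().encodeToString(userIds.toCompactByteArray())
        );
        System.out.println(rollupDoc);
    }
}
```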
Convention-only types
Alternatively, we could implement these entirely by convention (like the rest of Rollup). E.g. a `binary` field can be used to hold the appropriate data sketch, and we just use field naming to convey the meaning. Versioning can be done with a secondary field.
The advantage is much less upfront work... we can just serialize into fields and we're off. It also limits the impact of these data types, since only Rollup will be equipped to deal with the convention (less likely for a user to accidentally use one and then run into trouble later).
The big downside is that external clients will have a more difficult time providing pre-built sketches, since the format is just a convention and won't be validated until search time. It also feels a bit more fragile since it is another convention to maintain.
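A sketch of what the convention could look like for a percentiles metric, with the serialized sketch in a `binary`-style field and the version in a sibling field; all field names and the version scheme here are hypothetical:

```java
import java.util.Base64;
import java.util.Map;

public class ConventionOnlyRollupDoc {
    // Convention-only layout: the field *names* carry the meaning, a sibling field
    // carries the sketch version, and the binary field holds the serialized sketch.
    static Map<String, Object> percentilesRollupFields(String metricField, byte[] serializedSketch) {
        return Map.of(
            metricField + ".percentiles._sketch", Base64.getEncoder().encodeToString(serializedSketch),
            metricField + ".percentiles._sketch_version", 1
        );
    }
}
```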
BWC
In both cases, Rollup will probably have to maintain a catalog of "old" algorithms so that historical rollup indices can continue to function. Not ideal, but given that these algos don't change super often it's probably an ok burden to bear.
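One way such a catalog of "old" algorithms could be organized, sketched as a hypothetical version-to-decoder registry; the interface, versions, and decoder names are assumptions:

```java
import java.util.Map;
import java.util.function.Function;

public class SketchDecoderRegistry {
    /** Hypothetical decoded form that query-time aggregations can consume. */
    interface CardinalitySketch {
        double estimate();
    }

    // Each historical on-disk format keeps its decoder around so that old rollup
    // indices continue to work even after the "current" algorithm moves on.
    private static final Map<Integer, Function<byte[], CardinalitySketch>> DECODERS = Map.of(
        1, SketchDecoderRegistry::decodeV1Hll,          // original HLL layout
        2, SketchDecoderRegistry::decodeV2HllPlusPlus   // later HLL++ layout
    );

    static CardinalitySketch decode(int version, byte[] bytes) {
        Function<byte[], CardinalitySketch> decoder = DECODERS.get(version);
        if (decoder == null) {
            throw new IllegalArgumentException("Unknown sketch version: " + version);
        }
        return decoder.apply(bytes);
    }

    private static CardinalitySketch decodeV1Hll(byte[] bytes) {
        throw new UnsupportedOperationException("placeholder for the v1 decoder");
    }

    private static CardinalitySketch decodeV2HllPlusPlus(byte[] bytes) {
        throw new UnsupportedOperationException("placeholder for the v2 decoder");
    }
}
```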