Speedup time_series agg by caching current tsid ordinal, parent bucket ordinal and buck ordinal #91784

martijnvg · 2022-11-22T08:59:58Z

This avoids needlessly adding the same parent bucket ordinal or TSIDs to BytesKeyedBucketOrds.

Relates to #74660

…t ordinal and buck ordinal. This avoids needlessly adding the same parent bucket ordinal or TSIDs to `BytesKeyedBucketOrds`. Relates to elastic#74660

elasticsearchmachine · 2022-11-22T09:46:13Z

Pinging @elastic/es-analytics-geo (Team:Analytics)

jpountz · 2022-11-22T15:45:52Z

...ons/src/main/java/org/elasticsearch/aggregations/bucket/timeseries/TimeSeriesAggregator.java

            @Override
            public void collect(int doc, long bucket) throws IOException {
+                if (currentBucket == bucket && currentTsidOrd == aggCtx.getTsidOrd()) {


It looks like this aggregation is always called with bucket == 0 so maybe remove this currentBucket == bucket condition here and replace it with an assertion that bucket is zero?

I think that is only the case if time_series aggregation is a top level aggregation.
I don't think this is always the case and for example a date_histogram aggregation is a likely parent aggregation in the case of time_series aggregation.

Ah, that makes sense. Then maybe the currentBucket == bucket condition is a bit fragile as it would prevent the optimization from kicking in if this collector is called with different bucket ordinals even though it the TSID doesn't change. What about changing currentTsidOrd and currentBucketOrdinal to arrays that are keyed by bucket, ie. tracking separately the current TSID and bucket ordinal per bucket?

I think the case we know there isn't a parent aggregation then we can return a leaf bucket collector that only does the currentTsidOrd == aggCtx.getTsidOrd() check.

What about changing currentTsidOrd and currentBucketOrdinal to arrays that are keyed by bucket, ie. tracking separately the current TSID and bucket ordinal per bucket?

We'd need a BigArray thing because we can have many buckets above us and we'd want to track the memory. It could be useful though. If you ever "come back" to the same bucket. But I don't think, say, date_histogram will. I think once it's finished collecting some date for the time series it'll never come back.

I think the case we know there isn't a parent aggregation then we can return a leaf bucket collector that only does the currentTsidOrd == aggCtx.getTsidOrd() check.

That's what the CardinalityUpperBound bucketCardinality argument to the ctor is for. If it's exactly one you know you don't have a parent agg or it always makes a single bucket.

Actually things should be fine with a terms aggregation as a parent too, because almost all the time this terms aggregation would run on a field that is a dimension or a tag. So bucket would only change when tsidOrd changes too?

Yes, I think that is true if the field of the terms agg is a dimension field. But maybe not in the case if the field is a non dimension field (label field)?

Right. My intuition is that we don't have to care much about this case of labels that could take different values for the same TSID (as long as the aggregation output is correct, just less optimized), do we?

My intuition is that we don't have to care much about this case of labels that could take different values for the same TSID (as long as the aggregation output is correct, just less optimized), do we?

This is what I think as well. This change favours parent aggs based on dimension fields, @timestamp or no parent aggregation. For the other cases, this change doesn't help, but I think doesn't hurt either. We are always correct, since we then fall back to checking whether we have seen tsid & bucket combination in BytesKeyedBucketOrds.

I added a comment: f23de49

nik9000 · 2022-11-22T19:41:55Z

...ons/src/main/java/org/elasticsearch/aggregations/bucket/timeseries/TimeSeriesAggregator.java

            @Override
            public void collect(int doc, long bucket) throws IOException {
+                if (currentBucket == bucket && currentTsidOrd == aggCtx.getTsidOrd()) {


What about changing currentTsidOrd and currentBucketOrdinal to arrays that are keyed by bucket, ie. tracking separately the current TSID and bucket ordinal per bucket?

We'd need a BigArray thing because we can have many buckets above us and we'd want to track the memory. It could be useful though. If you ever "come back" to the same bucket. But I don't think, say, date_histogram will. I think once it's finished collecting some date for the time series it'll never come back.

I think the case we know there isn't a parent aggregation then we can return a leaf bucket collector that only does the currentTsidOrd == aggCtx.getTsidOrd() check.

That's what the CardinalityUpperBound bucketCardinality argument to the ctor is for. If it's exactly one you know you don't have a parent agg or it always makes a single bucket.

nik9000 · 2022-11-22T19:44:14Z

...ons/src/main/java/org/elasticsearch/aggregations/bucket/timeseries/TimeSeriesAggregator.java

+                    collectExistingBucket(sub, doc, currentBucketOrdinal);
+                    return;
+                }
+
                long bucketOrdinal = bucketOrds.add(bucket, aggCtx.getTsid());


If we have a tsidOrd can we use that as a key instead of the bytes?

So we need the tsid as bytes in the buildAggregations() method and I don't think we can use this tsidord that TimeSeriesIndexSearcher generates as a lookup to get the tsid?

Indeed this tsidord is neither a segment ordinal nor a global ordinal but an ordinal for TSIDs that intersect with the query on this shard. So it can't be used as a key to retrieve TSIDs from the terms dictionary. (Separately, maybe we should rename this method to reduce chances that someone mistakenly uses it as an ordinal for the _tsid terms dict? I worry a bit that it would look like it works for common cases like the query only consists of a range query on the @timestamp and this time range happens to match every unique TSID, so tests might not even catch the problem if someone started doing this?)

Let's rename the method in a follow up change?

…nhencement

jpountz

Looks great.

…nhencement

martijnvg · 2022-11-30T07:47:14Z

Benchmarking this change to time series aggregation by using the query added in elastic/rally-tracks#348 results in a good improvement:

|                                                Min Throughput | date-histo-with-time-series-1h |     0.036549    |     0.0854548   |      0.04891 |  ops/s | +133.81% |
|                                               Mean Throughput | date-histo-with-time-series-1h |     0.0366182   |     0.0855126   |      0.04889 |  ops/s | +133.53% |
|                                             Median Throughput | date-histo-with-time-series-1h |     0.0366221   |     0.0854992   |      0.04888 |  ops/s | +133.46% |
|                                                Max Throughput | date-histo-with-time-series-1h |     0.0366767   |     0.0856078   |      0.04893 |  ops/s | +133.41% |
|                                       50th percentile latency | date-histo-with-time-series-1h | 27112.2         | 11575           | -15537.2     |     ms |  -57.31% |
|                                       90th percentile latency | date-histo-with-time-series-1h | 27310.6         | 11748.9         | -15561.7     |     ms |  -56.98% |
|                                       99th percentile latency | date-histo-with-time-series-1h | 27600           | 11901.7         | -15698.4     |     ms |  -56.88% |
|                                      100th percentile latency | date-histo-with-time-series-1h | 27797.9         | 12036.8         | -15761.1     |     ms |  -56.70% |
|                                  50th percentile service time | date-histo-with-time-series-1h | 27112.2         | 11575           | -15537.2     |     ms |  -57.31% |
|                                  90th percentile service time | date-histo-with-time-series-1h | 27310.6         | 11748.9         | -15561.7     |     ms |  -56.98% |
|                                  99th percentile service time | date-histo-with-time-series-1h | 27600           | 11901.7         | -15698.4     |     ms |  -56.88% |
|                                 100th percentile service time | date-histo-with-time-series-1h | 27797.9         | 12036.8         | -15761.1     |     ms |  -56.70% |
|                                                    error rate | date-histo-with-time-series-1h |     0           |     0           |      0       |      % |    0.00% |

* upstream/main: (209 commits) Remove unused methods and classes from HLRC (elastic#92030) Clean up on exception while chunking XContent (elastic#92024) Add profiling plugin (elastic#91640) Remove unused methods and classes from HLRC (elastic#92012) Remove IndexerState from HLRC (elastic#92023) Ensure cached time elapses in ClusterServiceIT (elastic#91986) Chunked encoding for RestGetIndicesAction (elastic#92016) Simplify shardsWithState (elastic#91991) [DOCS] Updates ML decider docs by mentioning CPU as scaling criterion (elastic#92018) Add chunking to ClusterState.Custom impls (elastic#91963) Speedup time_series agg by caching current tsid ordinal, parent bucket ordinal and buck ordinal (elastic#91784) Drop the ingest listener call count tracking (elastic#92003) [DOCS] fixes issue number 91889 - missing [discrete] header (elastic#91976) Fix PersistentTasksClusterServiceTests (elastic#92002) [docs] Update search-settings documentation to reflect the fact that the indices.query.bool.max_clause_count setting has been deprecated (elastic#91811) Clarify writability in Netty4HttpPipeliningHandler (elastic#91982) Load stable plugins as synthetic modules (elastic#91869) Handle any exception thrown while generating source for an IngestDocument (elastic#91981) fixing Apache HttpHost url on java-rest doc (elastic#91945) Implement repair functionality for aliases colliding with indices bug (elastic#91887) ...

Speedup time_series agg by caching current tsid ordinal, parent bucke…

7157974

…t ordinal and buck ordinal. This avoids needlessly adding the same parent bucket ordinal or TSIDs to `BytesKeyedBucketOrds`. Relates to elastic#74660

elasticsearchmachine added the v8.7.0 label Nov 22, 2022

martijnvg marked this pull request as ready for review November 22, 2022 09:45

martijnvg added >non-issue :StorageEngine/TSDB You know, for Metrics labels Nov 22, 2022

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Nov 22, 2022

martijnvg assigned jpountz Nov 22, 2022

jpountz reviewed Nov 22, 2022

View reviewed changes

nik9000 reviewed Nov 22, 2022

View reviewed changes

martijnvg added 2 commits November 23, 2022 13:31

Merge remote-tracking branch 'es/main' into time_series_aggregation_e…

5cfd652

…nhencement

added comment

f23de49

jpountz approved these changes Nov 23, 2022

View reviewed changes

martijnvg added 2 commits November 25, 2022 17:31

Merge remote-tracking branch 'es/main' into time_series_aggregation_e…

f3edf86

…nhencement

Merge remote-tracking branch 'es/main' into time_series_aggregation_e…

d395a01

…nhencement

martijnvg merged commit da119b0 into elastic:main Nov 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speedup time_series agg by caching current tsid ordinal, parent bucket ordinal and buck ordinal #91784

Speedup time_series agg by caching current tsid ordinal, parent bucket ordinal and buck ordinal #91784

martijnvg commented Nov 22, 2022

elasticsearchmachine commented Nov 22, 2022

jpountz Nov 22, 2022

martijnvg Nov 22, 2022

jpountz Nov 22, 2022

martijnvg Nov 22, 2022

nik9000 Nov 22, 2022

jpountz Nov 23, 2022

martijnvg Nov 23, 2022

jpountz Nov 23, 2022

martijnvg Nov 23, 2022

martijnvg Nov 23, 2022

nik9000 Nov 22, 2022

nik9000 Nov 22, 2022

martijnvg Nov 23, 2022

jpountz Nov 23, 2022

martijnvg Nov 23, 2022

jpountz left a comment

martijnvg commented Nov 30, 2022

Speedup time_series agg by caching current tsid ordinal, parent bucket ordinal and buck ordinal #91784

Speedup time_series agg by caching current tsid ordinal, parent bucket ordinal and buck ordinal #91784

Conversation

martijnvg commented Nov 22, 2022

elasticsearchmachine commented Nov 22, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jpountz left a comment

Choose a reason for hiding this comment

martijnvg commented Nov 30, 2022