fix: ordering terms aggregation on top metrics null values #85774

salvatore-campagna · 2022-04-11T08:39:56Z

If top_metrics buckets are never collected (i.e. a filter never
passing data to a top_metrics nested aggregator) sorting results
in an index out of bounds exception. The data array, filled at
collection time, is actually empty and we try to access it to do
the sorting using ordinals as indices. To be more precise, we do the
comparison when filling the priority queue used to build aggregation
results. The first insertion succeeds because the priority queue is
empty and there is nothing to compare against. However, as soon as we
insert the second element, the comparator is invoked and tries to read
from the empty array.
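
A minimal sketch of the failure mode described above, not the actual aggregator code. It uses `java.util.PriorityQueue` as a stand-in for the queue used to build aggregation results; the class name, the `DATA` array and the ordinals are hypothetical and only illustrate a comparator that indexes an array which was never filled.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class EmptyDataComparatorSketch {
    // Hypothetical stand-in for the per-bucket metric values.
    // It stays empty because no bucket was ever collected.
    static final double[] DATA = new double[0];

    // Returns true if inserting the second ordinal triggers the out-of-bounds access.
    static boolean secondInsertFails() {
        // The comparator uses bucket ordinals as indices into DATA.
        Comparator<Integer> byMetric = (a, b) -> Double.compare(DATA[a], DATA[b]);
        PriorityQueue<Integer> queue = new PriorityQueue<>(byMetric);

        // First insertion: the queue is empty, so the comparator is not invoked.
        queue.add(0);
        try {
            // Second insertion: the comparator runs and reads from the empty array.
            queue.add(1);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second insert fails: " + secondInsertFails());
    }
}
```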

Here we use the special Double.NaN value to report missing data
for the top_metrics metric field. This happens for empty or null
fields. The comparator we use is a 'NaN-aware' comparator, so
buckets whose metric is missing cannot be ordered by that value.
As a consequence, data is sorted using another comparator,
the default '_key' comparator added when creating the compound
comparator.
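
The fallback to '_key' can be sketched roughly as follows. This is an illustration under assumptions, not the Elasticsearch implementation: `Bucket`, `compareNanLast` and `sortedKeys` are hypothetical names, and placing NaN after real values is a choice made here only so that the comparator is a consistent total order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class NanAwareOrderSketch {
    // Hypothetical bucket: a key plus the metric used for ordering (NaN when missing).
    static final class Bucket {
        final String key;
        final double metric;

        Bucket(String key, double metric) {
            this.key = key;
            this.metric = metric;
        }
    }

    // NaN-aware comparison: two NaNs are equal, a NaN sorts after any real value.
    static int compareNanLast(double a, double b) {
        if (Double.isNaN(a)) {
            return Double.isNaN(b) ? 0 : 1;
        }
        if (Double.isNaN(b)) {
            return -1;
        }
        return Double.compare(a, b);
    }

    static final Comparator<Bucket> BY_METRIC = (x, y) -> compareNanLast(x.metric, y.metric);
    static final Comparator<Bucket> BY_KEY = Comparator.comparing(b -> b.key);
    // Compound comparator: metric first, then the key as tie-breaker.
    static final Comparator<Bucket> ORDER = BY_METRIC.thenComparing(BY_KEY);

    // Sorts a copy of the buckets and returns their keys in order.
    static List<String> sortedKeys(List<Bucket> buckets) {
        List<Bucket> copy = new ArrayList<>(buckets);
        copy.sort(ORDER);
        List<String> keys = new ArrayList<>();
        for (Bucket b : copy) {
            keys.add(b.key);
        }
        return keys;
    }
}
```

When every bucket reports Double.NaN, the metric comparison always returns 0, so the resulting order is decided entirely by the key comparator.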

This is a fix for #85127.

If top metrics buckets are never collected (i.e. a filter never passing data to a top metrics aggregator), ordering results in an index out of bounds exception. This is because the data array expected to include data is actually empty. Here we use the special Double.NaN value to report missing data to sort on for the top metrics metric field. This results in the comparator ordering data using another comparator (the default '_key' comparator).

elasticmachine · 2022-04-11T08:40:46Z

Pinging @elastic/es-analytics-geo (Team:Analytics)

elasticsearchmachine · 2022-04-11T08:41:21Z

Hi @salvatore-campagna, I've created a changelog YAML for you.

salvatore-campagna · 2022-04-11T10:07:04Z

...k/plugin/src/yamlRestTest/resources/rest-api-spec/test/analytics/nested_top_metrics_sort.yml

+  - match: { aggregations.name.buckets.4.field_exists.top_metrics.top.0.sort.0: "2021-11-09T17:29:00.000Z" }
+  - match: { aggregations.name.buckets.4.field_exists.top_metrics.top.0.metrics.version: 1 }
+
+---


The last two tests are here just to highlight that the behaviour of the top_metrics aggregation changes with the value of rewrite_to_filter_by_filter. This is not correct: we would like to have consistent behaviour. I will submit an issue to track this and point it to these two tests (and their counterparts where the setting is true). I am not going to fix this in this PR.

I will create the issue after merging this PR so that I can reference the two tests in the ticket.

salvatore-campagna · 2022-04-11T10:28:32Z

@elasticmachine update branch

elasticmachine · 2022-04-11T11:44:08Z

Pinging @elastic/clients-team (Team:Clients)

salvatore-campagna · 2022-04-11T13:12:21Z

...k/plugin/src/yamlRestTest/resources/rest-api-spec/test/analytics/nested_top_metrics_sort.yml

+              terms:
+                field: name
+                order:
+                  "field_exists>top_metrics[unknown_metric]": asc


Will change this to actually use the unknown_metric in the top_metrics aggregations rather than the filter.

salvatore-campagna · 2022-04-11T13:12:53Z

...k/plugin/src/yamlRestTest/resources/rest-api-spec/test/analytics/nested_top_metrics_sort.yml

+              terms:
+                field: name
+                order:
+                  "field_exists>top_metrics[unknown_metric]": asc


salvatore-campagna · 2022-04-11T13:50:54Z

@elasticmachine update branch

salvatore-campagna · 2022-04-12T07:57:13Z

@elasticmachine update branch

not-napoleon

LGTM

* master: (104 commits)
  fix: ordering terms aggregation on top metrics null values (elastic#85774)
  Fix up whitespace error introduced in elastic#85948
  More docs re. removing cluster.initial_master_nodes (elastic#85948)
  [Test] Remove API key methods from HLRC (elastic#85802)
  Remove references to bootstrap.system_call_filter (elastic#85964)
  Move docker cgroup override to SystemJvmOptions (elastic#85960)
  Add connection accounting tests (elastic#85966)
  Remove MacOS from platform support testing matrix
  Remove custom KnnVectorFieldExistsQuery (elastic#85945)
  Relax data path deprecations from critical to warn (elastic#85952)
  Remove hppc from some "common" classes (elastic#85957)
  Move docker env var settings handling out of bash (elastic#85913)
  Remove hppc from task manager (elastic#85889)
  [ML] rename trained model allocations to assignments (elastic#85503)
  Remove hppc from multi*shard request and responses (elastic#85888)
  Consolidating logging initialization in cli launcher (elastic#85920)
  Convert license tools to use unified cli entrypoint (elastic#85919)
  Add noop detection to node shutdown actions (elastic#85914)
  Adjust SQL expended test output
  TSDB: Add timestamp provider to AggregationExecutionContext (elastic#85850)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/search/aggregations/AggregationExecutionContext.java

elasticsearchmachine added the v8.3.0 label Apr 11, 2022

salvatore-campagna added the :Analytics/Aggregations Aggregations label Apr 11, 2022

elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 11, 2022

salvatore-campagna added >bug and removed Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Apr 11, 2022

Update docs/changelog/85774.yaml

73269ab

salvatore-campagna added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 11, 2022

salvatore-campagna added 4 commits April 11, 2022 10:43

test: remove test as a result of using existing yaml tests

c0990b0

fix: move top_metrics tests to x-pack

0b93430

fix: duplicate yaml test name

60c41a3

fix: duplicate yaml test name

abb5fe2

salvatore-campagna commented Apr 11, 2022


Merge branch 'master' into fix/85127-top-metrics-index-out-of-bounds

5cb7f45

sethmlarson added the Team:Clients Meta label for clients team label Apr 11, 2022

fix: move index out of body inside search

f7c0df8

salvatore-campagna commented Apr 11, 2022


fix: use the unknown_metric in top_metrics agg

db8d5e0

elasticmachine and others added 3 commits April 11, 2022 23:20

Merge branch 'master' into fix/85127-top-metrics-index-out-of-bounds

c61d11a

fix: use the unknown_metric in top_metrics agg

c6d9edb

fix: use actual field names for non existing metrics

bbbcf94

elasticmachine and others added 3 commits April 12, 2022 17:27

Merge branch 'master' into fix/85127-top-metrics-index-out-of-bounds

440b2c2

fix: code format

fc02ec5

fix: escape square brakets

6d6e019

salvatore-campagna requested review from nik9000 and not-napoleon April 12, 2022 09:53

not-napoleon approved these changes Apr 15, 2022


salvatore-campagna merged commit 3b78a4c into elastic:master Apr 19, 2022

yan4105 mentioned this pull request Apr 29, 2022

Better message in guessRootCauses #86280

Closed
