Add setting to disable aggs optimization #73589

nik9000 · 2021-06-01T12:39:10Z

Sometimes our fancy "run this agg as a Query" optimizations end up
slower than running the aggregation the old way. We know that and use
heuristics to disable the optimization in that case. But it turns out
that the process of running the heuristics itself can be slow, depending
on the query. Worse, changing the heuristics requires an upgrade, which
means waiting. If the heuristics make a terrible choice, folks need a
quick way out. This adds such a way: a cluster-level setting that
contains a list of queries that are considered "too expensive" to try
and optimize. If the top-level query contains any of those queries we'll
disable the "run as Query" optimization.

The default for this setting is wildcard and term-in-set queries, which
is fairly conservative. There are certainly wildcard and term-in-set
queries that the optimization works well with, but there are other queries
of that type that it works very badly with. So we're being careful.

Better, you can modify this setting in a running cluster to disable the
optimization if we find a new type of query that doesn't work well.
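
To illustrate the check described above, here is a minimal sketch, not the code in this PR. `QueryNode`, `containsExpensive`, `canRunAsQuery`, and the type labels `"wildcard"` and `"term_in_set"` are all hypothetical stand-ins; the real change would walk Lucene queries and read the list from the cluster setting.

```java
import java.util.List;
import java.util.Set;

class ExpensiveQueryCheck {
    /** Hypothetical stand-in for a node in a query tree; not a Lucene or Elasticsearch class. */
    static final class QueryNode {
        final String type;
        final List<QueryNode> children;

        QueryNode(String type, List<QueryNode> children) {
            this.type = type;
            this.children = children;
        }

        static QueryNode leaf(String type) {
            return new QueryNode(type, List.of());
        }
    }

    /** Returns true if this node or any descendant has a type in the "too expensive" list. */
    static boolean containsExpensive(QueryNode query, Set<String> tooExpensive) {
        if (tooExpensive.contains(query.type)) {
            return true;
        }
        for (QueryNode child : query.children) {
            if (containsExpensive(child, tooExpensive)) {
                return true;
            }
        }
        return false;
    }

    /** The "run as Query" optimization is attempted only when nothing in the tree is listed. */
    static boolean canRunAsQuery(QueryNode topLevel, Set<String> tooExpensive) {
        return containsExpensive(topLevel, tooExpensive) == false;
    }

    public static void main(String[] args) {
        // Assumed labels for the default list (wildcard and term-in-set queries).
        Set<String> tooExpensive = Set.of("wildcard", "term_in_set");
        QueryNode query = new QueryNode("bool", List.of(QueryNode.leaf("term"), QueryNode.leaf("wildcard")));
        System.out.println("optimize: " + canRunAsQuery(query, tooExpensive));
    }
}
```

Because the list is just data, adding a new query type to it at runtime would change the outcome of the check without touching the heuristics.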

Closes #73426

nik9000 · 2021-06-01T12:39:41Z

I need to reread this again and add a few more tests, but I'd like to get this out there.

nik9000 · 2021-06-01T12:48:38Z

I'd like to add an integration test for this in the terms aggregator - that'd let us make sure the setting works.

iverase · 2021-06-01T13:31:55Z

...rnalClusterTest/java/org/elasticsearch/search/profile/aggregation/AggregationProfilerIT.java

 import java.util.stream.Collectors;

+import static io.github.nik9000.mapmatcher.MapMatcher.assertMap;
+import static io.github.nik9000.mapmatcher.MapMatcher.matchesMap;


Oops, that should not be here

nik9000 · 2021-06-01T13:52:21Z

It looks like there is another class of issue around these optimizations that comes from the query being very, very large. We don't copy the query, but we do create its weight many times. I'd expected that these weights would be cached, but that's not always guaranteed. I think in a follow-up change we should also skip this optimization based on the size of the query. We should also do more to reduce the re-preparation of the weights. That's a little more complicated, though. Lucene has some hard-won heuristics around what to cache when, and we'd have to be careful there. I think it's best to do the simplest thing.

nik9000 · 2021-06-01T14:30:22Z

After talking with some folks it turns out that the setting is too precise and difficult to use. And might cause issues on upgrade. And who knows what else. It was clever, but not worth it.

nik9000 force-pushed the disable_filters_opt_on_some branch from 0775318 to 810b282 · June 1, 2021 12:41

810b282

iverase reviewed Jun 1, 2021

Merge branch 'master' into disable_filters_opt_on_some

55a31cf

nik9000 closed this Jun 1, 2021
