From 68a2cb255979d02e181108e6d12475837744b9c8 Mon Sep 17 00:00:00 2001 From: Nik Everett <nik9000@gmail.com> Date: Mon, 1 Nov 2021 17:07:31 -0400 Subject: [PATCH] Rework docs for the `size` of `terms` agg (#79205) The `terms` agg picks the top `size` terms in a single scatter/gather pass across all the shards. For the default `order` and if you `order` by `_key` this works quite well. Some errors creep in, but it's fairly easy to point to them and understand them. But ordering by doc count ascending is like inviting the error vampire into your agg. It's super easy to get inaccurate results. This updates the docs to be more stark about it. Closes #72684 --- .../bucket/terms-aggregation.asciidoc | 118 ++++++++++-------- .../percolator/QueryBuilderStoreTests.java | 2 +- .../query/CoordinatorRewriteContext.java | 6 +- .../CoordinatorRewriteContextProvider.java | 18 +-- .../index/query/QueryRewriteContext.java | 12 +- .../index/query/SearchExecutionContext.java | 11 +- .../index/query/WrapperQueryBuilder.java | 6 +- .../elasticsearch/indices/IndicesService.java | 8 +- .../CompletionSuggestionBuilder.java | 16 +-- .../suggest/phrase/PhraseSuggester.java | 3 +- .../CanMatchPreFilterSearchPhaseTests.java | 4 +- .../index/mapper/DateFieldTypeTests.java | 6 +- .../AggregatorFactoriesTests.java | 8 +- .../aggregations/bucket/FiltersTests.java | 14 +-- .../builder/SearchSourceBuilderTests.java | 2 +- .../CompletionSuggesterBuilderTests.java | 2 +- .../aggregations/AggregatorTestCase.java | 2 +- .../authz/permission/DocumentPermissions.java | 4 +- .../action/TransportTermsEnumAction.java | 2 +- .../permission/DocumentPermissionsTests.java | 2 +- .../RewriteCachingDirectoryReaderTests.java | 2 +- 21 files changed, 121 insertions(+), 127 deletions(-) diff --git a/docs/reference/aggregations/bucket/terms-aggregation.asciidoc b/docs/reference/aggregations/bucket/terms-aggregation.asciidoc index c8fcb8d4a6bd7..465acde1a1ab2 100644 --- a/docs/reference/aggregations/bucket/terms-aggregation.asciidoc +++ b/docs/reference/aggregations/bucket/terms-aggregation.asciidoc @@ -101,9 +101,6 @@ Response: <2> when there are lots of unique terms, Elasticsearch only returns the top terms; this number is the sum of the document counts for all buckets that are not part of the response <3> the list of the top buckets, the meaning of `top` being defined by the <<search-aggregations-bucket-terms-aggregation-order,order>> -By default, the `terms` aggregation will return the buckets for the top ten terms ordered by the `doc_count`. One can -change this default behaviour by setting the `size` parameter. - [[search-aggregations-bucket-terms-aggregation-types]] The `field` can be <<keyword>>, <<number>>, <<ip, `ip`>>, <<boolean, `boolean`>>, or <<binary, `binary`>>. @@ -117,59 +114,68 @@ memory usage. [[search-aggregations-bucket-terms-aggregation-size]] ==== Size -The `size` parameter can be set to define how many term buckets should be returned out of the overall terms list. By -default, the node coordinating the search process will request each shard to provide its own top `size` term buckets -and once all shards respond, it will reduce the results to the final list that will then be returned to the client. -This means that if the number of unique terms is greater than `size`, the returned list is slightly off and not accurate -(it could be that the term counts are slightly off and it could even be that a term that should have been in the top -size buckets was not returned). +By default, the `terms` aggregation returns the top ten terms with the most +documents. Use the `size` parameter to return more terms, up to the +<<search-settings-max-buckets,search.max_buckets>> limit. + +If your data contains 100 or 1000 unique terms, you can increase the `size` of +the `terms` aggregation to return them all. If you have more unique terms and +you need them all, use the +<<search-aggregations-bucket-composite-aggregation,composite aggregation>> +instead. -NOTE: If you want to retrieve **all** terms or all combinations of terms in a nested `terms` aggregation - you should use the <<search-aggregations-bucket-composite-aggregation,Composite>> aggregation which - allows to paginate over all possible terms rather than setting a size greater than the cardinality of the field in the - `terms` aggregation. The `terms` aggregation is meant to return the `top` terms and does not allow pagination. +Larger values of `size` use more memory to compute and, push the whole +aggregation close to the `max_buckets` limit. You'll know you've gone too large +if the request fails with a message about `max_buckets`. +[[search-aggregations-bucket-terms-aggregation-shard-size]] ==== Shard Size -The higher the requested `size` is, the more accurate the results will be, but also, the more expensive it will be to -compute the final results (both due to bigger priority queues that are managed on a shard level and due to bigger data -transfers between the nodes and the client). +To get more accurate results, the `terms` agg fetches more than +the top `size` terms from each shard. It fetches the top `shard_size` terms, +which defaults to `size * 1.5 + 10`. + +This is to handle the case when one term has many documents on one shard but is +just below the `size` threshold on all other shards. If each shard only +returned `size` terms, the aggregation would return an partial doc count for +the term. So `terms` returns more terms in an attempt to catch the missing +terms. This helps, but it's still quite possible to return a partial doc +count for a term. It just takes a term with more disparate per-shard doc counts. -The `shard_size` parameter can be used to minimize the extra work that comes with bigger requested `size`. When defined, -it will determine how many terms the coordinating node will request from each shard. Once all the shards responded, the -coordinating node will then reduce them to a final result which will be based on the `size` parameter - this way, -one can increase the accuracy of the returned terms and avoid the overhead of streaming a big list of buckets back to -the client. +You can increase `shard_size` to better account for these disparate doc counts +and improve the accuracy of the selection of top terms. It is much cheaper to increase +the `shard_size` than to increase the `size`. However, it still takes more +bytes over the wire and waiting in memory on the coordinating node. +IMPORTANT: This guidance only applies if you're using the `terms` aggregation's +default sort `order`. If you're sorting by anything other than document count in +descending order, see <<search-aggregations-bucket-terms-aggregation-order>>. NOTE: `shard_size` cannot be smaller than `size` (as it doesn't make much sense). When it is, Elasticsearch will override it and reset it to be equal to `size`. - -The default `shard_size` is `(size * 1.5 + 10)`. - [[terms-agg-doc-count-error]] ==== Document count error -`doc_count` values for a `terms` aggregation may be approximate. As a result, -any sub-aggregations on the `terms` aggregation may also be approximate. - -To calculate `doc_count` values, each shard provides its own top terms and -document counts. The aggregation combines these shard-level results to calculate -its final `doc_count` values. To measure the accuracy of `doc_count` values, the -aggregation results include the following properties: - -`sum_other_doc_count`:: -(integer) The total document count for any terms not included in the results. - -`doc_count_error_upper_bound`:: -(integer) The highest possible document count for any single term not included -in the results. If `0`, `doc_count` values are accurate. +`sum_other_doc_count` is the number of documents that didn't make it into the +the top `size` terms. If this is greater than `0`, you can be sure that the +`terms` agg had to throw away some buckets, either because they didn't fit into +`size` on the coordinating node or they didn't fit into `shard_size` on the +data node. ==== Per bucket document count error -To get the `doc_count_error_upper_bound` for each term, set -`show_term_doc_count_error` to `true`: +If you set the `show_term_doc_count_error` parameter to `true`, the `terms` +aggregation will include `doc_count_error_upper_bound`, which is an upper bound +to the error on the `doc_count` returned by each shard. It's the +sum of the size of the largest bucket on each shard that didn't fit into +`shard_size`. + +In more concrete terms, imagine there is one bucket that is very large on one +shard and just outside the `shard_size` on all the other shards. In that case, +the `terms` agg will return the bucket because it is large, but it'll be missing +data from many documents on the shards where the term fell below the `shard_size` threshold. +`doc_count_error_upper_bound` is the maximum number of those missing documents. [source,console,id=terms-aggregation-doc-count-error-example] -------------------------------------------------- @@ -189,10 +195,6 @@ GET /_search // TEST[s/_search/_search\?filter_path=aggregations/] -This shows an error value for each term returned by the aggregation which represents the 'worst case' error in the document count -and can be useful when deciding on a value for the `shard_size` parameter. This is calculated by summing the document counts for -the last term returned by all shards which did not return the term. - These errors can only be calculated in this way when the terms are ordered by descending document count. When the aggregation is ordered by the terms values themselves (either ascending or descending) there is no error in the document count since if a shard does not return a particular term which appears in the results from another shard, it must not have that term in its index. When the @@ -202,19 +204,17 @@ determined and is given a value of -1 to indicate this. [[search-aggregations-bucket-terms-aggregation-order]] ==== Order -The order of the buckets can be customized by setting the `order` parameter. By default, the buckets are ordered by -their `doc_count` descending. It is possible to change this behaviour as documented below: +By default, the `terms` aggregation orders terms by descending document `_count`. +Use the `order` parameter to specify a different sort order. -WARNING: Sorting by ascending `_count` or by sub aggregation is discouraged as it increases the -<<terms-agg-doc-count-error,error>> on document counts. -It is fine when a single shard is queried, or when the field that is being aggregated was used -as a routing key at index time: in these cases results will be accurate since shards have disjoint -values. However otherwise, errors are unbounded. One particular case that could still be useful -is sorting by <<search-aggregations-metrics-min-aggregation,`min`>> or -<<search-aggregations-metrics-max-aggregation,`max`>> aggregation: counts will not be accurate -but at least the top buckets will be correctly picked. +WARNING: Avoid using `"order": { "_count": "asc" }`. If you need to find rare +terms, use the +<<search-aggregations-bucket-rare-terms-aggregation,`rare_terms`>> aggregation +instead. Due to the way the `terms` aggregation +<<search-aggregations-bucket-terms-aggregation-shard-size,gets terms from +shards>>, sorting by ascending doc count often produces inaccurate results. -Ordering the buckets by their doc `_count` in an ascending manner: +If you really must, this is how you sort on doc count ascending: [source,console,id=terms-aggregation-count-example] -------------------------------------------------- @@ -248,6 +248,14 @@ GET /_search } -------------------------------------------------- +WARNING: Test any sorts on sub-aggregations before using them in production. +Sorting on a sub-aggregation may return errors or inaccurate results. For +example, due to the way the `terms` aggregation +<<search-aggregations-bucket-terms-aggregation-shard-size,gets results from +shards>>, sorting on a `max` sub-aggregation in _ascending_ order often produces +inaccurate results. However, sorting on a `max` sub-aggregation in _descending_ +order is typically safe. + Ordering the buckets by single value metrics sub-aggregation (identified by the aggregation name): [source,console,id=terms-aggregation-subaggregation-example] diff --git a/modules/percolator/src/test/java/org/elasticsearch/percolator/QueryBuilderStoreTests.java b/modules/percolator/src/test/java/org/elasticsearch/percolator/QueryBuilderStoreTests.java index bb58f919602ed..e2d3ffe259e47 100644 --- a/modules/percolator/src/test/java/org/elasticsearch/percolator/QueryBuilderStoreTests.java +++ b/modules/percolator/src/test/java/org/elasticsearch/percolator/QueryBuilderStoreTests.java @@ -75,7 +75,7 @@ public void testStoringQueryBuilders() throws IOException { SearchExecutionContext searchExecutionContext = mock(SearchExecutionContext.class); when(searchExecutionContext.indexVersionCreated()).thenReturn(version); when(searchExecutionContext.getWriteableRegistry()).thenReturn(writableRegistry()); - when(searchExecutionContext.getXContentRegistry()).thenReturn(xContentRegistry()); + when(searchExecutionContext.getParserConfig()).thenReturn(parserConfig()); when(searchExecutionContext.getForField(fieldMapper.fieldType())).thenReturn( new BytesBinaryIndexFieldData(fieldMapper.name(), CoreValuesSourceType.KEYWORD) ); diff --git a/server/src/main/java/org/elasticsearch/index/query/CoordinatorRewriteContext.java b/server/src/main/java/org/elasticsearch/index/query/CoordinatorRewriteContext.java index a1a29af11e2da..1a2e6d0a8f979 100644 --- a/server/src/main/java/org/elasticsearch/index/query/CoordinatorRewriteContext.java +++ b/server/src/main/java/org/elasticsearch/index/query/CoordinatorRewriteContext.java @@ -15,7 +15,7 @@ import org.elasticsearch.index.mapper.DateFieldMapper; import org.elasticsearch.index.mapper.MappedFieldType; import org.elasticsearch.index.shard.IndexLongFieldRange; -import org.elasticsearch.xcontent.NamedXContentRegistry; +import org.elasticsearch.xcontent.XContentParserConfiguration; import java.util.function.LongSupplier; @@ -32,7 +32,7 @@ public class CoordinatorRewriteContext extends QueryRewriteContext { private final DateFieldMapper.DateFieldType timestampFieldType; public CoordinatorRewriteContext( - NamedXContentRegistry xContentRegistry, + XContentParserConfiguration parserConfig, NamedWriteableRegistry writeableRegistry, Client client, LongSupplier nowInMillis, @@ -40,7 +40,7 @@ public CoordinatorRewriteContext( IndexLongFieldRange indexLongFieldRange, DateFieldMapper.DateFieldType timestampFieldType ) { - super(xContentRegistry, writeableRegistry, client, nowInMillis); + super(parserConfig, writeableRegistry, client, nowInMillis); this.index = index; this.indexLongFieldRange = indexLongFieldRange; this.timestampFieldType = timestampFieldType; diff --git a/server/src/main/java/org/elasticsearch/index/query/CoordinatorRewriteContextProvider.java b/server/src/main/java/org/elasticsearch/index/query/CoordinatorRewriteContextProvider.java index ae139f2e8a5d2..6f9260dd8a943 100644 --- a/server/src/main/java/org/elasticsearch/index/query/CoordinatorRewriteContextProvider.java +++ b/server/src/main/java/org/elasticsearch/index/query/CoordinatorRewriteContextProvider.java @@ -16,14 +16,14 @@ import org.elasticsearch.index.Index; import org.elasticsearch.index.mapper.DateFieldMapper; import org.elasticsearch.index.shard.IndexLongFieldRange; -import org.elasticsearch.xcontent.NamedXContentRegistry; +import org.elasticsearch.xcontent.XContentParserConfiguration; import java.util.function.Function; import java.util.function.LongSupplier; import java.util.function.Supplier; public class CoordinatorRewriteContextProvider { - private final NamedXContentRegistry xContentRegistry; + private final XContentParserConfiguration parserConfig; private final NamedWriteableRegistry writeableRegistry; private final Client client; private final LongSupplier nowInMillis; @@ -31,14 +31,14 @@ public class CoordinatorRewriteContextProvider { private final Function<Index, DateFieldMapper.DateFieldType> mappingSupplier; public CoordinatorRewriteContextProvider( - NamedXContentRegistry xContentRegistry, + XContentParserConfiguration parserConfig, NamedWriteableRegistry writeableRegistry, Client client, LongSupplier nowInMillis, Supplier<ClusterState> clusterStateSupplier, Function<Index, DateFieldMapper.DateFieldType> mappingSupplier ) { - this.xContentRegistry = xContentRegistry; + this.parserConfig = parserConfig; this.writeableRegistry = writeableRegistry; this.client = client; this.nowInMillis = nowInMillis; @@ -62,14 +62,6 @@ public CoordinatorRewriteContext getCoordinatorRewriteContext(Index index) { } IndexLongFieldRange timestampRange = indexMetadata.getTimestampRange(); - return new CoordinatorRewriteContext( - xContentRegistry, - writeableRegistry, - client, - nowInMillis, - index, - timestampRange, - dateFieldType - ); + return new CoordinatorRewriteContext(parserConfig, writeableRegistry, client, nowInMillis, index, timestampRange, dateFieldType); } } diff --git a/server/src/main/java/org/elasticsearch/index/query/QueryRewriteContext.java b/server/src/main/java/org/elasticsearch/index/query/QueryRewriteContext.java index 930a572e081cb..20e90755fd9e9 100644 --- a/server/src/main/java/org/elasticsearch/index/query/QueryRewriteContext.java +++ b/server/src/main/java/org/elasticsearch/index/query/QueryRewriteContext.java @@ -11,8 +11,8 @@ import org.elasticsearch.client.Client; import org.elasticsearch.common.io.stream.NamedWriteableRegistry; import org.elasticsearch.common.util.concurrent.CountDown; -import org.elasticsearch.xcontent.NamedXContentRegistry; import org.elasticsearch.xcontent.XContentParser; +import org.elasticsearch.xcontent.XContentParserConfiguration; import java.util.ArrayList; import java.util.List; @@ -23,20 +23,20 @@ * Context object used to rewrite {@link QueryBuilder} instances into simplified version. */ public class QueryRewriteContext { - private final NamedXContentRegistry xContentRegistry; + private final XContentParserConfiguration parserConfiguration; private final NamedWriteableRegistry writeableRegistry; protected final Client client; protected final LongSupplier nowInMillis; private final List<BiConsumer<Client, ActionListener<?>>> asyncActions = new ArrayList<>(); public QueryRewriteContext( - NamedXContentRegistry xContentRegistry, + XContentParserConfiguration parserConfiguration, NamedWriteableRegistry writeableRegistry, Client client, LongSupplier nowInMillis ) { - this.xContentRegistry = xContentRegistry; + this.parserConfiguration = parserConfiguration; this.writeableRegistry = writeableRegistry; this.client = client; this.nowInMillis = nowInMillis; @@ -45,8 +45,8 @@ public QueryRewriteContext( /** * The registry used to build new {@link XContentParser}s. Contains registered named parsers needed to parse the query. */ - public NamedXContentRegistry getXContentRegistry() { - return xContentRegistry; + public XContentParserConfiguration getParserConfig() { + return parserConfiguration; } /** diff --git a/server/src/main/java/org/elasticsearch/index/query/SearchExecutionContext.java b/server/src/main/java/org/elasticsearch/index/query/SearchExecutionContext.java index ec0f1eb77a051..8e0ed6cc9ff76 100644 --- a/server/src/main/java/org/elasticsearch/index/query/SearchExecutionContext.java +++ b/server/src/main/java/org/elasticsearch/index/query/SearchExecutionContext.java @@ -24,6 +24,7 @@ import org.elasticsearch.common.io.stream.NamedWriteableRegistry; import org.elasticsearch.common.lucene.search.Queries; import org.elasticsearch.common.regex.Regex; +import org.elasticsearch.common.xcontent.LoggingDeprecationHandler; import org.elasticsearch.core.CheckedFunction; import org.elasticsearch.core.Nullable; import org.elasticsearch.index.Index; @@ -59,6 +60,7 @@ import org.elasticsearch.transport.RemoteClusterAware; import org.elasticsearch.xcontent.NamedXContentRegistry; import org.elasticsearch.xcontent.XContentParser; +import org.elasticsearch.xcontent.XContentParserConfiguration; import java.io.IOException; import java.util.Collections; @@ -144,7 +146,8 @@ public SearchExecutionContext( mappingLookup, similarityService, scriptService, - xContentRegistry, + // TODO pass the parser configuration into this method? + XContentParserConfiguration.EMPTY.withDeprecationHandler(LoggingDeprecationHandler.INSTANCE).withRegistry(xContentRegistry), namedWriteableRegistry, client, searcher, @@ -172,7 +175,7 @@ public SearchExecutionContext(SearchExecutionContext source) { source.mappingLookup, source.similarityService, source.scriptService, - source.getXContentRegistry(), + source.getParserConfig(), source.getWriteableRegistry(), source.client, source.searcher, @@ -196,7 +199,7 @@ private SearchExecutionContext( MappingLookup mappingLookup, SimilarityService similarityService, ScriptService scriptService, - NamedXContentRegistry xContentRegistry, + XContentParserConfiguration parserConfig, NamedWriteableRegistry namedWriteableRegistry, Client client, IndexSearcher searcher, @@ -208,7 +211,7 @@ private SearchExecutionContext( Map<String, MappedFieldType> runtimeMappings, Predicate<String> allowedFields ) { - super(xContentRegistry, namedWriteableRegistry, client, nowInMillis); + super(parserConfig, namedWriteableRegistry, client, nowInMillis); this.shardId = shardId; this.shardRequestIndex = shardRequestIndex; this.similarityService = similarityService; diff --git a/server/src/main/java/org/elasticsearch/index/query/WrapperQueryBuilder.java b/server/src/main/java/org/elasticsearch/index/query/WrapperQueryBuilder.java index 1605e25191ee4..1fa93ee9904a6 100644 --- a/server/src/main/java/org/elasticsearch/index/query/WrapperQueryBuilder.java +++ b/server/src/main/java/org/elasticsearch/index/query/WrapperQueryBuilder.java @@ -15,7 +15,6 @@ import org.elasticsearch.common.bytes.BytesReference; import org.elasticsearch.common.io.stream.StreamInput; import org.elasticsearch.common.io.stream.StreamOutput; -import org.elasticsearch.common.xcontent.LoggingDeprecationHandler; import org.elasticsearch.xcontent.ParseField; import org.elasticsearch.xcontent.XContentBuilder; import org.elasticsearch.xcontent.XContentFactory; @@ -148,10 +147,7 @@ protected boolean doEquals(WrapperQueryBuilder other) { @Override protected QueryBuilder doRewrite(QueryRewriteContext context) throws IOException { - try ( - XContentParser qSourceParser = XContentFactory.xContent(source) - .createParser(context.getXContentRegistry(), LoggingDeprecationHandler.INSTANCE, source) - ) { + try (XContentParser qSourceParser = XContentFactory.xContent(source).createParser(context.getParserConfig(), source)) { final QueryBuilder queryBuilder = parseInnerQueryBuilder(qSourceParser).rewrite(context); if (boost() != DEFAULT_BOOST || queryName() != null) { diff --git a/server/src/main/java/org/elasticsearch/indices/IndicesService.java b/server/src/main/java/org/elasticsearch/indices/IndicesService.java index 47c9d359ce8b1..2a8c7178bde0b 100644 --- a/server/src/main/java/org/elasticsearch/indices/IndicesService.java +++ b/server/src/main/java/org/elasticsearch/indices/IndicesService.java @@ -131,6 +131,7 @@ import org.elasticsearch.xcontent.NamedXContentRegistry; import org.elasticsearch.xcontent.XContentFactory; import org.elasticsearch.xcontent.XContentParser; +import org.elasticsearch.xcontent.XContentParserConfiguration; import org.elasticsearch.xcontent.XContentType; import java.io.Closeable; @@ -203,6 +204,7 @@ public class IndicesService extends AbstractLifecycleComponent private final PluginsService pluginsService; private final NodeEnvironment nodeEnv; private final NamedXContentRegistry xContentRegistry; + private final XContentParserConfiguration parserConfig; private final TimeValue shardsClosedTimeout; private final AnalysisRegistry analysisRegistry; private final IndexNameExpressionResolver indexNameExpressionResolver; @@ -283,6 +285,8 @@ public IndicesService( this.pluginsService = pluginsService; this.nodeEnv = nodeEnv; this.xContentRegistry = xContentRegistry; + this.parserConfig = XContentParserConfiguration.EMPTY.withDeprecationHandler(LoggingDeprecationHandler.INSTANCE) + .withRegistry(xContentRegistry); this.valuesSourceRegistry = valuesSourceRegistry; this.shardsClosedTimeout = settings.getAsTime(INDICES_SHARDS_CLOSED_TIMEOUT, new TimeValue(1, TimeUnit.DAYS)); this.analysisRegistry = analysisRegistry; @@ -1683,12 +1687,12 @@ public AliasFilter buildAliasFilter(ClusterState state, String index, Set<String * Returns a new {@link QueryRewriteContext} with the given {@code now} provider */ public QueryRewriteContext getRewriteContext(LongSupplier nowInMillis) { - return new QueryRewriteContext(xContentRegistry, namedWriteableRegistry, client, nowInMillis); + return new QueryRewriteContext(parserConfig, namedWriteableRegistry, client, nowInMillis); } public CoordinatorRewriteContextProvider getCoordinatorRewriteContextProvider(LongSupplier nowInMillis) { return new CoordinatorRewriteContextProvider( - xContentRegistry, + parserConfig, namedWriteableRegistry, client, nowInMillis, diff --git a/server/src/main/java/org/elasticsearch/search/suggest/completion/CompletionSuggestionBuilder.java b/server/src/main/java/org/elasticsearch/search/suggest/completion/CompletionSuggestionBuilder.java index 92faac0f61c58..ee39fbbbf340c 100644 --- a/server/src/main/java/org/elasticsearch/search/suggest/completion/CompletionSuggestionBuilder.java +++ b/server/src/main/java/org/elasticsearch/search/suggest/completion/CompletionSuggestionBuilder.java @@ -12,7 +12,6 @@ import org.elasticsearch.common.io.stream.StreamInput; import org.elasticsearch.common.io.stream.StreamOutput; import org.elasticsearch.common.unit.Fuzziness; -import org.elasticsearch.common.xcontent.LoggingDeprecationHandler; import org.elasticsearch.common.xcontent.XContentHelper; import org.elasticsearch.index.mapper.CompletionFieldMapper; import org.elasticsearch.index.mapper.MappedFieldType; @@ -21,13 +20,13 @@ import org.elasticsearch.search.suggest.SuggestionSearchContext.SuggestionContext; import org.elasticsearch.search.suggest.completion.context.ContextMapping; import org.elasticsearch.search.suggest.completion.context.ContextMappings; -import org.elasticsearch.xcontent.NamedXContentRegistry; import org.elasticsearch.xcontent.ObjectParser; import org.elasticsearch.xcontent.ParseField; import org.elasticsearch.xcontent.ToXContent; import org.elasticsearch.xcontent.XContentBuilder; import org.elasticsearch.xcontent.XContentFactory; import org.elasticsearch.xcontent.XContentParser; +import org.elasticsearch.xcontent.XContentParserConfiguration; import org.elasticsearch.xcontent.XContentType; import java.io.IOException; @@ -288,7 +287,7 @@ public SuggestionContext build(SearchExecutionContext context) throws IOExceptio if (type.hasContextMappings() && contextBytes != null) { Map<String, List<ContextMapping.InternalQueryContext>> queryContexts = parseContextBytes( contextBytes, - context.getXContentRegistry(), + context.getParserConfig(), type.getContextMappings() ); suggestionContext.setQueryContexts(queryContexts); @@ -301,17 +300,10 @@ public SuggestionContext build(SearchExecutionContext context) throws IOExceptio static Map<String, List<ContextMapping.InternalQueryContext>> parseContextBytes( BytesReference contextBytes, - NamedXContentRegistry xContentRegistry, + XContentParserConfiguration parserConfig, ContextMappings contextMappings ) throws IOException { - try ( - XContentParser contextParser = XContentHelper.createParser( - xContentRegistry, - LoggingDeprecationHandler.INSTANCE, - contextBytes, - CONTEXT_BYTES_XCONTENT_TYPE - ) - ) { + try (XContentParser contextParser = XContentHelper.createParser(parserConfig, contextBytes, CONTEXT_BYTES_XCONTENT_TYPE)) { contextParser.nextToken(); Map<String, List<ContextMapping.InternalQueryContext>> queryContexts = new HashMap<>(contextMappings.size()); assert contextParser.currentToken() == XContentParser.Token.START_OBJECT; diff --git a/server/src/main/java/org/elasticsearch/search/suggest/phrase/PhraseSuggester.java b/server/src/main/java/org/elasticsearch/search/suggest/phrase/PhraseSuggester.java index bfadd7bb3f91e..e0a6b11dbda9e 100644 --- a/server/src/main/java/org/elasticsearch/search/suggest/phrase/PhraseSuggester.java +++ b/server/src/main/java/org/elasticsearch/search/suggest/phrase/PhraseSuggester.java @@ -19,7 +19,6 @@ import org.apache.lucene.util.CharsRefBuilder; import org.elasticsearch.common.lucene.Lucene; import org.elasticsearch.common.text.Text; -import org.elasticsearch.common.xcontent.LoggingDeprecationHandler; import org.elasticsearch.index.query.AbstractQueryBuilder; import org.elasticsearch.index.query.ParsedQuery; import org.elasticsearch.index.query.QueryBuilder; @@ -132,7 +131,7 @@ public Suggestion<? extends Entry<? extends Option>> innerExecute( final String querySource = scriptFactory.newInstance(vars).execute(); try ( XContentParser parser = XContentFactory.xContent(querySource) - .createParser(searchExecutionContext.getXContentRegistry(), LoggingDeprecationHandler.INSTANCE, querySource) + .createParser(searchExecutionContext.getParserConfig(), querySource) ) { QueryBuilder innerQueryBuilder = AbstractQueryBuilder.parseInnerQueryBuilder(parser); final ParsedQuery parsedQuery = searchExecutionContext.toQuery(innerQueryBuilder); diff --git a/server/src/test/java/org/elasticsearch/action/search/CanMatchPreFilterSearchPhaseTests.java b/server/src/test/java/org/elasticsearch/action/search/CanMatchPreFilterSearchPhaseTests.java index 023dc9e9d4bf2..8d7f876975f7f 100644 --- a/server/src/test/java/org/elasticsearch/action/search/CanMatchPreFilterSearchPhaseTests.java +++ b/server/src/test/java/org/elasticsearch/action/search/CanMatchPreFilterSearchPhaseTests.java @@ -45,7 +45,7 @@ import org.elasticsearch.threadpool.TestThreadPool; import org.elasticsearch.threadpool.ThreadPool; import org.elasticsearch.transport.Transport; -import org.elasticsearch.xcontent.NamedXContentRegistry; +import org.elasticsearch.xcontent.XContentParserConfiguration; import java.io.IOException; import java.util.ArrayList; @@ -822,7 +822,7 @@ private void addIndexMinMaxTimestamps(Index index, String fieldName, long minTim public CoordinatorRewriteContextProvider build() { return new CoordinatorRewriteContextProvider( - NamedXContentRegistry.EMPTY, + XContentParserConfiguration.EMPTY, mock(NamedWriteableRegistry.class), mock(Client.class), System::currentTimeMillis, diff --git a/server/src/test/java/org/elasticsearch/index/mapper/DateFieldTypeTests.java b/server/src/test/java/org/elasticsearch/index/mapper/DateFieldTypeTests.java index e94095cfd3f18..04eda92fe9216 100644 --- a/server/src/test/java/org/elasticsearch/index/mapper/DateFieldTypeTests.java +++ b/server/src/test/java/org/elasticsearch/index/mapper/DateFieldTypeTests.java @@ -53,7 +53,7 @@ public class DateFieldTypeTests extends FieldTypeTestCase { private static final long nowInMillis = 0; public void testIsFieldWithinRangeEmptyReader() throws IOException { - QueryRewriteContext context = new QueryRewriteContext(xContentRegistry(), writableRegistry(), null, () -> nowInMillis); + QueryRewriteContext context = new QueryRewriteContext(parserConfig(), writableRegistry(), null, () -> nowInMillis); IndexReader reader = new MultiReader(); DateFieldType ft = new DateFieldType("my_date"); assertEquals( @@ -90,7 +90,7 @@ public void isFieldWithinRangeTestCase(DateFieldType ft) throws IOException { doTestIsFieldWithinQuery(ft, reader, ZoneOffset.UTC, null); doTestIsFieldWithinQuery(ft, reader, ZoneOffset.UTC, alternateFormat); - QueryRewriteContext context = new QueryRewriteContext(xContentRegistry(), writableRegistry(), null, () -> nowInMillis); + QueryRewriteContext context = new QueryRewriteContext(parserConfig(), writableRegistry(), null, () -> nowInMillis); // Fields with no value indexed. DateFieldType ft2 = new DateFieldType("my_date2"); @@ -102,7 +102,7 @@ public void isFieldWithinRangeTestCase(DateFieldType ft) throws IOException { private void doTestIsFieldWithinQuery(DateFieldType ft, DirectoryReader reader, ZoneId zone, DateMathParser alternateFormat) throws IOException { - QueryRewriteContext context = new QueryRewriteContext(xContentRegistry(), writableRegistry(), null, () -> nowInMillis); + QueryRewriteContext context = new QueryRewriteContext(parserConfig(), writableRegistry(), null, () -> nowInMillis); assertEquals( Relation.INTERSECTS, ft.isFieldWithinQuery(reader, "2015-10-09", "2016-01-02", randomBoolean(), randomBoolean(), zone, null, context) diff --git a/server/src/test/java/org/elasticsearch/search/aggregations/AggregatorFactoriesTests.java b/server/src/test/java/org/elasticsearch/search/aggregations/AggregatorFactoriesTests.java index ee7d8c5e393bf..6af3a3d06f047 100644 --- a/server/src/test/java/org/elasticsearch/search/aggregations/AggregatorFactoriesTests.java +++ b/server/src/test/java/org/elasticsearch/search/aggregations/AggregatorFactoriesTests.java @@ -231,7 +231,7 @@ public void testRewriteAggregation() throws Exception { BucketScriptPipelineAggregationBuilder pipelineAgg = new BucketScriptPipelineAggregationBuilder("const", new Script("1")); AggregatorFactories.Builder builder = new AggregatorFactories.Builder().addAggregator(filterAggBuilder) .addPipelineAggregator(pipelineAgg); - AggregatorFactories.Builder rewritten = builder.rewrite(new QueryRewriteContext(xContentRegistry, null, null, () -> 0L)); + AggregatorFactories.Builder rewritten = builder.rewrite(new QueryRewriteContext(parserConfig(), null, null, () -> 0L)); assertNotSame(builder, rewritten); Collection<AggregationBuilder> aggregatorFactories = rewritten.getAggregatorFactories(); assertEquals(1, aggregatorFactories.size()); @@ -244,7 +244,7 @@ public void testRewriteAggregation() throws Exception { assertThat(rewrittenFilter, instanceOf(TermsQueryBuilder.class)); // Check that a further rewrite returns the same aggregation factories builder - AggregatorFactories.Builder secondRewritten = rewritten.rewrite(new QueryRewriteContext(xContentRegistry, null, null, () -> 0L)); + AggregatorFactories.Builder secondRewritten = rewritten.rewrite(new QueryRewriteContext(parserConfig(), null, null, () -> 0L)); assertSame(rewritten, secondRewritten); } @@ -253,7 +253,7 @@ public void testRewritePipelineAggregationUnderAggregation() throws Exception { new RewrittenPipelineAggregationBuilder() ); AggregatorFactories.Builder builder = new AggregatorFactories.Builder().addAggregator(filterAggBuilder); - QueryRewriteContext context = new QueryRewriteContext(xContentRegistry, null, null, () -> 0L); + QueryRewriteContext context = new QueryRewriteContext(parserConfig(), null, null, () -> 0L); AggregatorFactories.Builder rewritten = builder.rewrite(context); CountDownLatch latch = new CountDownLatch(1); context.executeAsyncActions(new ActionListener<Object>() { @@ -280,7 +280,7 @@ public void testRewriteAggregationAtTopLevel() throws Exception { FilterAggregationBuilder filterAggBuilder = new FilterAggregationBuilder("titles", new MatchAllQueryBuilder()); AggregatorFactories.Builder builder = new AggregatorFactories.Builder().addAggregator(filterAggBuilder) .addPipelineAggregator(new RewrittenPipelineAggregationBuilder()); - QueryRewriteContext context = new QueryRewriteContext(xContentRegistry, null, null, () -> 0L); + QueryRewriteContext context = new QueryRewriteContext(parserConfig(), null, null, () -> 0L); AggregatorFactories.Builder rewritten = builder.rewrite(context); CountDownLatch latch = new CountDownLatch(1); context.executeAsyncActions(new ActionListener<Object>() { diff --git a/server/src/test/java/org/elasticsearch/search/aggregations/bucket/FiltersTests.java b/server/src/test/java/org/elasticsearch/search/aggregations/bucket/FiltersTests.java index 7b96f9c8a4e2f..d1289ec75bba7 100644 --- a/server/src/test/java/org/elasticsearch/search/aggregations/bucket/FiltersTests.java +++ b/server/src/test/java/org/elasticsearch/search/aggregations/bucket/FiltersTests.java @@ -123,12 +123,12 @@ public void testRewrite() throws IOException { // test non-keyed filter that doesn't rewrite AggregationBuilder original = new FiltersAggregationBuilder("my-agg", new MatchAllQueryBuilder()); original.setMetadata(Collections.singletonMap(randomAlphaOfLengthBetween(1, 20), randomAlphaOfLengthBetween(1, 20))); - AggregationBuilder rewritten = original.rewrite(new QueryRewriteContext(xContentRegistry(), null, null, () -> 0L)); + AggregationBuilder rewritten = original.rewrite(new QueryRewriteContext(parserConfig(), null, null, () -> 0L)); assertSame(original, rewritten); // test non-keyed filter that does rewrite original = new FiltersAggregationBuilder("my-agg", new BoolQueryBuilder()); - rewritten = original.rewrite(new QueryRewriteContext(xContentRegistry(), null, null, () -> 0L)); + rewritten = original.rewrite(new QueryRewriteContext(parserConfig(), null, null, () -> 0L)); assertNotSame(original, rewritten); assertThat(rewritten, instanceOf(FiltersAggregationBuilder.class)); assertEquals("my-agg", ((FiltersAggregationBuilder) rewritten).getName()); @@ -139,12 +139,12 @@ public void testRewrite() throws IOException { // test keyed filter that doesn't rewrite original = new FiltersAggregationBuilder("my-agg", new KeyedFilter("my-filter", new MatchAllQueryBuilder())); - rewritten = original.rewrite(new QueryRewriteContext(xContentRegistry(), null, null, () -> 0L)); + rewritten = original.rewrite(new QueryRewriteContext(parserConfig(), null, null, () -> 0L)); assertSame(original, rewritten); // test non-keyed filter that does rewrite original = new FiltersAggregationBuilder("my-agg", new KeyedFilter("my-filter", new BoolQueryBuilder())); - rewritten = original.rewrite(new QueryRewriteContext(xContentRegistry(), null, null, () -> 0L)); + rewritten = original.rewrite(new QueryRewriteContext(parserConfig(), null, null, () -> 0L)); assertNotSame(original, rewritten); assertThat(rewritten, instanceOf(FiltersAggregationBuilder.class)); assertEquals("my-agg", ((FiltersAggregationBuilder) rewritten).getName()); @@ -156,7 +156,7 @@ public void testRewrite() throws IOException { // test sub-agg filter that does rewrite original = new TermsAggregationBuilder("terms").userValueTypeHint(ValueType.BOOLEAN) .subAggregation(new FiltersAggregationBuilder("my-agg", new KeyedFilter("my-filter", new BoolQueryBuilder()))); - rewritten = original.rewrite(new QueryRewriteContext(xContentRegistry(), null, null, () -> 0L)); + rewritten = original.rewrite(new QueryRewriteContext(parserConfig(), null, null, () -> 0L)); assertNotSame(original, rewritten); assertNotEquals(original, rewritten); assertThat(rewritten, instanceOf(TermsAggregationBuilder.class)); @@ -165,7 +165,7 @@ public void testRewrite() throws IOException { assertThat(subAgg, instanceOf(FiltersAggregationBuilder.class)); assertNotSame(original.getSubAggregations().iterator().next(), subAgg); assertEquals("my-agg", subAgg.getName()); - assertSame(rewritten, rewritten.rewrite(new QueryRewriteContext(xContentRegistry(), null, null, () -> 0L))); + assertSame(rewritten, rewritten.rewrite(new QueryRewriteContext(parserConfig(), null, null, () -> 0L))); } public void testRewritePreservesOtherBucket() throws IOException { @@ -173,7 +173,7 @@ public void testRewritePreservesOtherBucket() throws IOException { originalFilters.otherBucket(randomBoolean()); originalFilters.otherBucketKey(randomAlphaOfLength(10)); - AggregationBuilder rewritten = originalFilters.rewrite(new QueryRewriteContext(xContentRegistry(), null, null, () -> 0L)); + AggregationBuilder rewritten = originalFilters.rewrite(new QueryRewriteContext(parserConfig(), null, null, () -> 0L)); assertThat(rewritten, instanceOf(FiltersAggregationBuilder.class)); FiltersAggregationBuilder rewrittenFilters = (FiltersAggregationBuilder) rewritten; diff --git a/server/src/test/java/org/elasticsearch/search/builder/SearchSourceBuilderTests.java b/server/src/test/java/org/elasticsearch/search/builder/SearchSourceBuilderTests.java index a3c28ce2a5bb1..20cfeb295fd12 100644 --- a/server/src/test/java/org/elasticsearch/search/builder/SearchSourceBuilderTests.java +++ b/server/src/test/java/org/elasticsearch/search/builder/SearchSourceBuilderTests.java @@ -534,7 +534,7 @@ private void assertIndicesBoostParseErrorMessage(String restContent, String expe private SearchSourceBuilder rewrite(SearchSourceBuilder searchSourceBuilder) throws IOException { return Rewriteable.rewrite( searchSourceBuilder, - new QueryRewriteContext(xContentRegistry(), writableRegistry(), null, Long.valueOf(1)::longValue) + new QueryRewriteContext(parserConfig(), writableRegistry(), null, Long.valueOf(1)::longValue) ); } } diff --git a/server/src/test/java/org/elasticsearch/search/suggest/completion/CompletionSuggesterBuilderTests.java b/server/src/test/java/org/elasticsearch/search/suggest/completion/CompletionSuggesterBuilderTests.java index 0df26e63804eb..cc29c6850ff6e 100644 --- a/server/src/test/java/org/elasticsearch/search/suggest/completion/CompletionSuggesterBuilderTests.java +++ b/server/src/test/java/org/elasticsearch/search/suggest/completion/CompletionSuggesterBuilderTests.java @@ -175,7 +175,7 @@ protected void assertSuggestionContext(CompletionSuggestionBuilder builder, Sugg Map<String, List<InternalQueryContext>> parsedContextBytes; parsedContextBytes = CompletionSuggestionBuilder.parseContextBytes( builder.contextBytes, - xContentRegistry(), + parserConfig(), new ContextMappings(contextMappings) ); Map<String, List<InternalQueryContext>> queryContexts = completionSuggestionCtx.getQueryContexts(); diff --git a/test/framework/src/main/java/org/elasticsearch/search/aggregations/AggregatorTestCase.java b/test/framework/src/main/java/org/elasticsearch/search/aggregations/AggregatorTestCase.java index 83a73615ee95a..533e56e6936ef 100644 --- a/test/framework/src/main/java/org/elasticsearch/search/aggregations/AggregatorTestCase.java +++ b/test/framework/src/main/java/org/elasticsearch/search/aggregations/AggregatorTestCase.java @@ -224,7 +224,7 @@ protected <A extends Aggregator> A createAggregator( protected <A extends Aggregator> A createAggregator(AggregationBuilder builder, AggregationContext context) throws IOException { QueryRewriteContext rewriteContext = new QueryRewriteContext( - xContentRegistry(), + parserConfig(), new NamedWriteableRegistry(List.of()), null, context::nowInMillis diff --git a/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authz/permission/DocumentPermissions.java b/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authz/permission/DocumentPermissions.java index fecb9231e4c9b..69b5783c520b2 100644 --- a/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authz/permission/DocumentPermissions.java +++ b/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/authz/permission/DocumentPermissions.java @@ -165,7 +165,7 @@ private static void buildRoleQuery( ) throws IOException { for (String query : queries) { SearchExecutionContext context = searchExecutionContextProvider.apply(shardId); - QueryBuilder queryBuilder = DLSRoleQueryValidator.evaluateAndVerifyRoleQuery(query, context.getXContentRegistry()); + QueryBuilder queryBuilder = DLSRoleQueryValidator.evaluateAndVerifyRoleQuery(query, context.getParserConfig().registry()); if (queryBuilder != null) { failIfQueryUsesClient(queryBuilder, context); Query roleQuery = context.toQuery(queryBuilder).query(); @@ -196,7 +196,7 @@ private static void buildRoleQuery( */ static void failIfQueryUsesClient(QueryBuilder queryBuilder, QueryRewriteContext original) throws IOException { QueryRewriteContext copy = new QueryRewriteContext( - original.getXContentRegistry(), + original.getParserConfig(), original.getWriteableRegistry(), null, original::nowInMillis diff --git a/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/termsenum/action/TransportTermsEnumAction.java b/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/termsenum/action/TransportTermsEnumAction.java index b67bbcd251f9f..ed1a4964c7cce 100644 --- a/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/termsenum/action/TransportTermsEnumAction.java +++ b/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/termsenum/action/TransportTermsEnumAction.java @@ -436,7 +436,7 @@ private boolean canAccess( QueryBuilder queryBuilder = DLSRoleQueryValidator.evaluateAndVerifyRoleQuery( querySource, scriptService, - queryShardContext.getXContentRegistry(), + queryShardContext.getParserConfig().registry(), securityContext.getUser() ); QueryBuilder rewrittenQueryBuilder = Rewriteable.rewrite(queryBuilder, queryShardContext); diff --git a/x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/security/authz/permission/DocumentPermissionsTests.java b/x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/security/authz/permission/DocumentPermissionsTests.java index 7278879413804..efaac90896ebb 100644 --- a/x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/security/authz/permission/DocumentPermissionsTests.java +++ b/x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/security/authz/permission/DocumentPermissionsTests.java @@ -68,7 +68,7 @@ public void testFailIfQueryUsesClient() throws Exception { Client client = mock(Client.class); when(client.settings()).thenReturn(Settings.EMPTY); final long nowInMillis = randomNonNegativeLong(); - QueryRewriteContext context = new QueryRewriteContext(xContentRegistry(), writableRegistry(), client, () -> nowInMillis); + QueryRewriteContext context = new QueryRewriteContext(parserConfig(), writableRegistry(), client, () -> nowInMillis); QueryBuilder queryBuilder1 = new TermsQueryBuilder("field", "val1", "val2"); DocumentPermissions.failIfQueryUsesClient(queryBuilder1, context); diff --git a/x-pack/plugin/frozen-indices/src/test/java/org/elasticsearch/index/engine/frozen/RewriteCachingDirectoryReaderTests.java b/x-pack/plugin/frozen-indices/src/test/java/org/elasticsearch/index/engine/frozen/RewriteCachingDirectoryReaderTests.java index 6bc1a462a8da4..783f0ba76089b 100644 --- a/x-pack/plugin/frozen-indices/src/test/java/org/elasticsearch/index/engine/frozen/RewriteCachingDirectoryReaderTests.java +++ b/x-pack/plugin/frozen-indices/src/test/java/org/elasticsearch/index/engine/frozen/RewriteCachingDirectoryReaderTests.java @@ -98,7 +98,7 @@ public void testIsWithinQuery() throws IOException { try (DirectoryReader reader = DirectoryReader.open(writer)) { RewriteCachingDirectoryReader cachingDirectoryReader = new RewriteCachingDirectoryReader(dir, reader.leaves(), null); DateFieldMapper.DateFieldType dateFieldType = new DateFieldMapper.DateFieldType("test"); - QueryRewriteContext context = new QueryRewriteContext(xContentRegistry(), writableRegistry(), null, () -> 0); + QueryRewriteContext context = new QueryRewriteContext(parserConfig(), writableRegistry(), null, () -> 0); MappedFieldType.Relation relation = dateFieldType.isFieldWithinQuery( cachingDirectoryReader, 0,