indices.query.bool.max_clause_count now limits all query clauses (elastic#75297)

In the upcoming Lucene 9 release, `indices.query.bool.max_clause_count` is
going to apply to the entire query tree rather than per `bool` query. In order
to avoid breaks, the limit has been bumped from 1024 to 4096.

The semantics will effectively change when we upgrade to Lucene 9; this PR
is only about agreeing on a migration strategy and documenting this change.

To avoid further breaks, I am leaning towards keeping the current setting name
even though it contains `bool`. I believe that it still makes sense given that
`bool` queries are typically the main contributors to high numbers of clauses.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
jpountz and jrodewig authored Jul 21, 2021
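
To make the semantic change in the commit message concrete, here is a minimal sketch of the two counting rules. It is a toy model, not Lucene's or Elasticsearch's code: `ClauseCountSketch`, `Node`, `boolOfLeaves`, `maxClausesPerBool`, and `totalLeafClauses` are hypothetical names, and the sketch assumes every leaf of the rewritten query tree counts as exactly one clause.

```java
import java.util.Arrays;
import java.util.List;

/** Toy query tree. These are NOT Lucene classes, only a model of the two counting rules. */
final class ClauseCountSketch {

    /** A node with no children is a leaf clause; any other node is a bool-like container. */
    static final class Node {
        final List<Node> children;

        Node(Node... children) {
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    /** Builds a bool-like node holding n leaf clauses (n is expected to be at least 1). */
    static Node boolOfLeaves(int n) {
        Node[] leaves = new Node[n];
        for (int i = 0; i < n; i++) {
            leaves[i] = new Node();
        }
        return new Node(leaves);
    }

    /** Old rule: the largest number of direct clauses held by any single bool-like node. */
    static int maxClausesPerBool(Node node) {
        int max = node.children.size();
        for (Node child : node.children) {
            max = Math.max(max, maxClausesPerBool(child));
        }
        return max;
    }

    /** New rule (assumed model): every leaf in the whole tree counts once. */
    static int totalLeafClauses(Node node) {
        if (node.isLeaf()) {
            return 1;
        }
        int total = 0;
        for (Node child : node.children) {
            total += totalLeafClauses(child);
        }
        return total;
    }

    public static void main(String[] args) {
        // One outer bool wrapping five inner bools of 1000 leaves each.
        Node query = new Node(
            boolOfLeaves(1000), boolOfLeaves(1000), boolOfLeaves(1000),
            boolOfLeaves(1000), boolOfLeaves(1000));

        System.out.println("per-bool max: " + maxClausesPerBool(query)); // 1000
        System.out.println("whole tree:   " + totalLeafClauses(query));  // 5000
    }
}
```

Under the per-`bool` rule this query stays below the old default of 1024, while under the whole-tree rule it exceeds even the new default of 4096, which is the kind of break the bumped default is meant to make less likely.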

1 parent df5c407 commit feb6620
Showing 8 changed files with 41 additions and 19 deletions.
2 changes: 1 addition & 1 deletion docs/reference/mapping/mapping-settings-limit.asciidoc
@@ -14,7 +14,7 @@ especially in clusters with a high load or few resources.
 If you increase this setting, we recommend you also increase the
 <<search-settings,`indices.query.bool.max_clause_count`>> setting, which
-limits the maximum number of <<query-dsl-bool-query,boolean clauses>> in a query.
+limits the maximum number of clauses in a query.
 ====
 +
 [TIP]
16 changes: 16 additions & 0 deletions docs/reference/migration/migrate_8_0/search.asciidoc
@@ -20,6 +20,22 @@ Aggregating and sorting on `_id` should be avoided. As an alternative, the
 `_id` field's contents can be duplicated into another field with docvalues
 enabled (note that this does not apply to auto-generated IDs).
 ====
+
+[[max_clause_count_change]]
+.The `indices.query.bool.max_clause_count` setting now limits all query clauses.
+[%collapsible]
+====
+*Details* +
+Previously, the `indices.query.bool.max_clause_count` would apply to the number
+of clauses of a single `bool` query. It now applies to the total number of
+clauses of the rewritten query. In order to reduce chances of breaks, its
+default value has been bumped from 1024 to 4096.
+*Impact* +
+Queries with many clauses should be avoided whenever possible. If you had bumped
+this setting already in order to accommodate some heavy queries, you might
+need to bump it further so that these heavy queries keep working.
+====
 //end::notable-breaking-changes[]

 .Search-related REST API endpoints containing mapping types have been removed.
24 changes: 15 additions & 9 deletions docs/reference/modules/indices/search-settings.asciidoc
@@ -7,16 +7,22 @@ limits.
 [[indices-query-bool-max-clause-count]]
 `indices.query.bool.max_clause_count`::
 (<<static-cluster-setting,Static>>, integer)
-Maximum number of clauses a Lucene BooleanQuery can contain. Defaults to `1024`.
+Maximum number of clauses a query can contain. Defaults to `4096`.
 +
-This setting limits the number of clauses a Lucene BooleanQuery can have. The
-default of 1024 is quite high and should normally be sufficient. This limit does
-not only affect Elasticsearchs `bool` query, but many other queries are rewritten to Lucene's
-BooleanQuery internally. The limit is in place to prevent searches from becoming too large
-and taking up too much CPU and memory. In case you're considering increasing this setting,
-make sure you've exhausted all other options to avoid having to do this. Higher values can lead
-to performance degradations and memory issues, especially in clusters with a high load or
-few resources.
+This setting limits the total number of clauses that a query tree can have. The default of 4096
+is quite high and should normally be sufficient. This limit applies to the rewritten query, so
+not only `bool` queries can contribute high numbers of clauses, but also all queries that rewrite
+to `bool` queries internally such as `fuzzy` queries. The limit is in place to prevent searches
+from becoming too large, and taking up too much CPU and memory. In case you're considering
+increasing this setting, make sure you've exhausted all other options to avoid having to do this.
+Higher values can lead to performance degradations and memory issues, especially in clusters with
+a high load or few resources.
+
+Elasticsearch offers some tools to avoid running into issues with regards to the maximum number of
+clauses such as the <<query-dsl-terms-query,`terms`>> query, which allows querying many distinct
+values while still counting as a single clause, or the <<index-prefixes,`index_prefixes`>> option
+of <<text-field-type,`text`>> fields, which allows executing prefix queries that expand to a high
+number of terms as a single term query.

 [[search-settings-max-buckets]]
 `search.max_buckets`::
6 changes: 3 additions & 3 deletions docs/reference/query-dsl/combined-fields-query.asciidoc
@@ -37,9 +37,9 @@ model perfectly.)
 [WARNING]
 .Field number limit
 ===================================================
-There is a limit on the number of fields that can be queried at once. It is
-defined by the `indices.query.bool.max_clause_count` <<search-settings>>
-which defaults to 1024.
+There is a limit on the number of fields times terms that can be queried at
+once. It is defined by the `indices.query.bool.max_clause_count`
+<<search-settings>> which defaults to 4096.
 ===================================================

 ==== Per-field boosting
4 changes: 2 additions & 2 deletions docs/reference/query-dsl/multi-match-query.asciidoc
@@ -67,9 +67,9 @@ index settings, which in turn defaults to `*`. `*` extracts all fields in the ma
 are eligible to term queries and filters the metadata fields. All extracted fields are then
 combined to build a query.

-WARNING: There is a limit on the number of fields that can be queried
+WARNING: There is a limit on the number of fields times terms that can be queried
 at once. It is defined by the `indices.query.bool.max_clause_count` <<search-settings>>
-which defaults to 1024.
+which defaults to 4096.

 [[multi-match-types]]
 [discrete]
4 changes: 2 additions & 2 deletions docs/reference/query-dsl/query-string-query.asciidoc
@@ -77,9 +77,9 @@ documents.
 For mappings with a large number of fields, searching across all eligible fields
 could be expensive.
-There is a limit on the number of fields that can be queried at once.
+There is a limit on the number of fields times terms that can be queried at once.
 It is defined by the `indices.query.bool.max_clause_count`
-<<search-settings,search setting>>, which defaults to 1024.
+<<search-settings,search setting>>, which defaults to 4096.
 ====
 --

2 changes: 1 addition & 1 deletion docs/reference/query-dsl/span-multi-term-query.asciidoc
@@ -39,7 +39,7 @@ GET /_search
 --------------------------------------------------

 WARNING: `span_multi` queries will hit too many clauses failure if the number of terms that match the query exceeds the
-boolean query limit (defaults to 1024).To avoid an unbounded expansion you can set the <<query-dsl-multi-term-rewrite,
+boolean query limit (defaults to 4096). To avoid an unbounded expansion you can set the <<query-dsl-multi-term-rewrite,
 rewrite method>> of the multi term query to `top_terms_*` rewrite. Or, if you use `span_multi` on `prefix` query only,
 you can activate the <<index-prefixes,`index_prefixes`>> field option of the `text` field instead. This will
 rewrite any prefix query on the field to a single term query that matches the indexed prefix.
@@ -263,7 +263,7 @@
  */
 public class SearchModule {
     public static final Setting<Integer> INDICES_MAX_CLAUSE_COUNT_SETTING = Setting.intSetting("indices.query.bool.max_clause_count",
-            1024, 1, Integer.MAX_VALUE, Setting.Property.NodeScope);
+            4096, 1, Integer.MAX_VALUE, Setting.Property.NodeScope);

     public static final Setting<Integer> INDICES_MAX_NESTED_DEPTH_SETTING = Setting.intSetting("indices.query.bool.max_nested_depth",
             20, 1, Integer.MAX_VALUE, Setting.Property.NodeScope);
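
The query DSL docs touched by this commit now describe the limit as applying to "fields times terms". A rough way to reason about that wording is sketched below. `FieldTermEstimate` and its methods are hypothetical helpers rather than Elasticsearch APIs, and the sketch assumes one clause per (field, term) pair; the real count depends on how the query is rewritten, and the exact behaviour at the boundary is not modelled.

```java
/** Back-of-the-envelope estimate for multi-field queries; not part of Elasticsearch. */
final class FieldTermEstimate {

    /** Default of indices.query.bool.max_clause_count after this commit. */
    static final int DEFAULT_MAX_CLAUSE_COUNT = 4096;

    /** Assumed upper bound: one clause per (field, term) pair. Uses long to avoid int overflow. */
    static long estimateClauses(int fields, int terms) {
        return (long) fields * terms;
    }

    /** True when the estimate is strictly above the configured limit. */
    static boolean exceedsLimit(int fields, int terms, int maxClauseCount) {
        return estimateClauses(fields, terms) > maxClauseCount;
    }

    public static void main(String[] args) {
        // 500 fields x 8 terms = 4000 clauses, under the default of 4096.
        System.out.println(exceedsLimit(500, 8, DEFAULT_MAX_CLAUSE_COUNT)); // false
        // 500 fields x 9 terms = 4500 clauses, over the default.
        System.out.println(exceedsLimit(500, 9, DEFAULT_MAX_CLAUSE_COUNT)); // true
    }
}
```

An estimate like this is only meant to flag queries worth restructuring, for example by narrowing the field list, before raising the setting is considered.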
