Add heuristics to compute pre_filter_shard_size when unspecified #53873

Merged
merged 4 commits into from
Mar 23, 2020
@@ -410,7 +410,9 @@ static void addSearchRequestParams(Params params, SearchRequest searchRequest) {
params.withIndicesOptions(searchRequest.indicesOptions());
params.withSearchType(searchRequest.searchType().name().toLowerCase(Locale.ROOT));
params.putParam("ccs_minimize_roundtrips", Boolean.toString(searchRequest.isCcsMinimizeRoundtrips()));
-params.putParam("pre_filter_shard_size", Integer.toString(searchRequest.getPreFilterShardSize()));
+if (searchRequest.getPreFilterShardSize() != null) {
+    params.putParam("pre_filter_shard_size", Integer.toString(searchRequest.getPreFilterShardSize()));
+}
params.withMaxConcurrentShardRequests(searchRequest.getMaxConcurrentShardRequests());
if (searchRequest.requestCache() != null) {
params.withRequestCache(searchRequest.requestCache());
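
The hunk above stops sending `pre_filter_shard_size` when the request leaves it unset, so the server can apply its own heuristics. A minimal sketch of that omit-when-null pattern follows. It assumes a plain `Map` in place of the client's `Params` builder, and the class name `PreFilterParamSketch` and method `toParams` are hypothetical, not part of the actual client code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the request-conversion step; not the real client classes.
class PreFilterParamSketch {

    /**
     * Builds query-string parameters. pre_filter_shard_size is added only when
     * the caller set it; a null value means "let the server decide".
     */
    static Map<String, String> toParams(Integer preFilterShardSize, boolean ccsMinimizeRoundtrips) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ccs_minimize_roundtrips", Boolean.toString(ccsMinimizeRoundtrips));
        if (preFilterShardSize != null) {
            params.put("pre_filter_shard_size", Integer.toString(preFilterShardSize));
        }
        return params;
    }

    public static void main(String[] args) {
        // Unset: the parameter is absent from the map.
        System.out.println(toParams(null, true));
        // Explicitly set: the parameter is sent as a string.
        System.out.println(toParams(1, true));
    }
}
```

Using a nullable `Integer` rather than a sentinel `int` keeps "unspecified" distinguishable from every valid threshold, which is what lets the default move from a fixed number to a heuristic.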

@@ -1875,7 +1875,9 @@ private static void setRandomSearchParams(SearchRequest searchRequest,
if (randomBoolean()) {
searchRequest.setPreFilterShardSize(randomIntBetween(2, Integer.MAX_VALUE));
}
-expectedParams.put("pre_filter_shard_size", Integer.toString(searchRequest.getPreFilterShardSize()));
+if (searchRequest.getPreFilterShardSize() != null) {
+    expectedParams.put("pre_filter_shard_size", Integer.toString(searchRequest.getPreFilterShardSize()));
+}
}

public static void setRandomIndicesOptions(Consumer<IndicesOptions> setter, Supplier<IndicesOptions> getter,

13 changes: 2 additions & 11 deletions docs/reference/frozen-indices.asciidoc
@@ -74,8 +74,8 @@ POST /twitter/_forcemerge?max_num_segments=1
== Searching a frozen index

Frozen indices are throttled in order to limit memory consumptions per node. The number of concurrently loaded frozen indices per node is
-limited by the number of threads in the <<search-throttled,search_throttled>> threadpool, which is `1` by default.
-Search requests will not be executed against frozen indices by default, even if a frozen index is named explicitly. This is
+limited by the number of threads in the <<search-throttled,search_throttled>> threadpool, which is `1` by default.
+Search requests will not be executed against frozen indices by default, even if a frozen index is named explicitly. This is
to prevent accidental slowdowns by targeting a frozen index by mistake. To include frozen indices a search request must be executed with
the query parameter `ignore_throttled=false`.

@@ -85,15 +85,6 @@ GET /twitter/_search?q=user:kimchy&ignore_throttled=false
--------------------------------------------------
// TEST[setup:twitter]

-[IMPORTANT]
-================================
-While frozen indices are slow to search, they can be pre-filtered efficiently. The request parameter `pre_filter_shard_size` specifies
-a threshold that, when exceeded, will enforce a round-trip to pre-filter search shards that cannot possibly match.
-This filter phase can limit the number of shards significantly. For instance, if a date range filter is applied, then all indices (frozen or unfrozen) that do not contain documents within the date range can be skipped efficiently.
-The default value for `pre_filter_shard_size` is `128` but it's recommended to set it to `1` when searching frozen indices. There is no
-significant overhead associated with this pre-filter phase.
Would it make sense to still explain this and make sure that users don't mess with `pre_filter_shard_size` at this point? I would not want them to end up setting it when searching frozen indices. It sounds like there is never a good reason to do so?

-================================

[role="xpack"]
[testenv="basic"]
[[monitoring_frozen_indices]]

35 changes: 20 additions & 15 deletions docs/reference/search/multi-search.asciidoc
@@ -23,7 +23,7 @@ GET twitter/_msearch
==== {api-description-title}

The multi search API executes several searches from a single API request.
-The format of the request is similar to the bulk API format and makes use
+The format of the request is similar to the bulk API format and makes use
of the newline delimited JSON (NDJSON) format.

The structure is as follows:
@@ -85,7 +85,7 @@ Maximum number of concurrent searches the multi search API can execute.
--
(Optional, integer)
Maximum number of concurrent shard requests that each sub-search request
-executes per node. Defaults to `5`.
+executes per node. Defaults to `5`.

You can use this parameter to prevent a request from overloading a cluster. For
example, a default request hits all indices in a cluster. This could cause shard
@@ -104,7 +104,12 @@ shards based on query rewriting if the number of shards the search request
expands to exceeds the threshold. This filter roundtrip can limit the number of
shards significantly if for instance a shard can not match any documents based
on it's rewrite method i.e., if date filters are mandatory to match but the
-shard bounds and the query are disjoint. Defaults to `128`.
+shard bounds and the query are disjoint.
+When unspecified, the pre-filter phase is executed if any of these
+conditions is met:
+ - The request targets more than `128` shards.
+ - The request contains read-only indices.
+ - The primary sort of the query targets an indexed field.

`rest_total_hits_as_int`::
(Optional, boolean)
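
The conditions added to the `pre_filter_shard_size` description in this hunk can be sketched as a single predicate. This is only an illustration of the documented rules, not the code merged in this PR: the class `PreFilterHeuristicSketch`, the method `shouldPreFilter`, and its boolean inputs are all hypothetical, and the real implementation may check further conditions that the documentation text does not mention.

```java
// Hypothetical sketch of the documented conditions; names are illustrative only.
class PreFilterHeuristicSketch {

    // Shard-count threshold stated in the documentation for the unspecified case.
    static final int DEFAULT_SHARD_THRESHOLD = 128;

    /**
     * Decides whether the pre-filter round-trip runs.
     *
     * @param preFilterShardSize        explicit threshold, or null when unspecified
     * @param shardCount                number of shards the request expands to
     * @param hasReadOnlyIndices        true if the request contains read-only indices
     * @param primarySortOnIndexedField true if the primary sort targets an indexed field
     */
    static boolean shouldPreFilter(Integer preFilterShardSize,
                                   int shardCount,
                                   boolean hasReadOnlyIndices,
                                   boolean primarySortOnIndexedField) {
        if (preFilterShardSize != null) {
            // Explicit value: pre-filter only when the shard count exceeds the threshold.
            return shardCount > preFilterShardSize;
        }
        // Unspecified: any one of the three documented conditions is enough.
        return shardCount > DEFAULT_SHARD_THRESHOLD
            || hasReadOnlyIndices
            || primarySortOnIndexedField;
    }

    public static void main(String[] args) {
        // A small request against a read-only index pre-filters without any explicit setting.
        System.out.println(shouldPreFilter(null, 3, true, false));
        // An explicit threshold takes precedence over the heuristics.
        System.out.println(shouldPreFilter(10, 5, true, true));
    }
}
```

Under these rules a request that touches a read-only index pre-filters even with a handful of shards, so no explicit setting is needed for that case.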
@@ -121,7 +126,7 @@ to a specific shard.
--
(Optional, string)
Indicates whether global term and document frequencies should be used when
-scoring returned documents.
+scoring returned documents.

Options are:

@@ -134,7 +139,7 @@ This is usually faster but less accurate.
Documents are scored using global term and document frequencies across all
shards. This is usually slower but more accurate.
--

`typed_keys`::
(Optional, boolean)
Specifies whether aggregation and suggester names should be prefixed by their
@@ -196,7 +201,7 @@ to a specific shard.
--
(Optional, string)
Indicates whether global term and document frequencies should be used when
-scoring returned documents.
+scoring returned documents.

Options are:

@@ -234,18 +239,18 @@ Number of hits to return. Defaults to `10`.
==== {api-response-body-title}

`responses`::
-(array) Includes the search response and status code for each search request
-matching its order in the original multi search request. If there was a
-complete failure for a specific search request, an object with `error` message
-and corresponding status code will be returned in place of the actual search
+(array) Includes the search response and status code for each search request
+matching its order in the original multi search request. If there was a
+complete failure for a specific search request, an object with `error` message
+and corresponding status code will be returned in place of the actual search
response.


[[search-multi-search-api-example]]
==== {api-examples-title}

-The header part includes which index / indices to search on, the `search_type`,
-`preference`, and `routing`. The body includes the typical search body request
+The header part includes which index / indices to search on, the `search_type`,
+`preference`, and `routing`. The body includes the typical search body request
(including the `query`, `aggregations`, `from`, `size`, and so on).

[source,js]
@@ -308,7 +313,7 @@ See <<url-access-control>>
==== Template support

Much like described in <<search-template>> for the _search resource, _msearch
-also provides support for templates. Submit them like follows for inline
+also provides support for templates. Submit them like follows for inline
templates:

[source,console]
@@ -377,6 +382,6 @@ GET _msearch/template
[[multi-search-partial-responses]]
==== Partial responses

-To ensure fast responses, the multi search API will respond with partial results
-if one or more shards fail. See <<shard-failures, Shard failures>> for more
+To ensure fast responses, the multi search API will respond with partial results
+if one or more shards fail. See <<shard-failures, Shard failures>> for more
information.