-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Always rewrite search shard request outside of the search thread pool #51708
Always rewrite search shard request outside of the search thread pool #51708
Conversation
This change ensures that the rewrite of the shard request is executed in the network thread or in the refresh listener when waiting for an active shard. This allows queries that rewrite to match_no_docs to bypass the search thread pool entirely even if the can_match phase was skipped (pre_filter_shard_size > number of shards). Coordinating nodes don't have the ability to create empty responses so this change also ensures that at least one shard creates a full empty response while the other can return null ones. This is needed since creating true empty responses on shards require to create concrete aggregators which would be too costly to build on a network thread. We should move this functionality to aggregation builders in a follow up but that would be a much bigger change. This change is also important for elastic#49601 since we want to add the ability to use the result of other shards to rewrite the request of subsequent ones. For instance if the first M shards have their top N computed, the top worst document in the global queue can be pass to subsequent shards that can then rewrite to match_no_docs if they can guarantee that they don't have any document better than the provided one.
Pinging @elastic/es-search (:Search/Search) |
Pinging @elastic/es-distributed (:Distributed/Engine) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand this PR, and it looks great. Thanks, Jim. I left a question to discuss.
server/src/main/java/org/elasticsearch/action/search/AbstractSearchAsyncAction.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/search/internal/ShardSearchRequest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two minor nits, LGTM.
server/src/main/java/org/elasticsearch/action/search/AbstractSearchAsyncAction.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/frozen-indices/src/test/java/org/elasticsearch/index/engine/FrozenIndexTests.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this change could help a lot in certain cases.
…elastic#51708) This change ensures that the rewrite of the shard request is executed in the network thread or in the refresh listener when waiting for an active shard. This allows queries that rewrite to match_no_docs to bypass the search thread pool entirely even if the can_match phase was skipped (pre_filter_shard_size > number of shards). Coordinating nodes don't have the ability to create empty responses so this change also ensures that at least one shard creates a full empty response while the other can return null ones. This is needed since creating true empty responses on shards require to create concrete aggregators which would be too costly to build on a network thread. We should move this functionality to aggregation builders in a follow up but that would be a much bigger change. This change is also important for elastic#49601 since we want to add the ability to use the result of other shards to rewrite the request of subsequent ones. For instance if the first M shards have their top N computed, the top worst document in the global queue can be pass to subsequent shards that can then rewrite to match_no_docs if they can guarantee that they don't have any document better than the provided one.
…#51708) (#51979) This change ensures that the rewrite of the shard request is executed in the network thread or in the refresh listener when waiting for an active shard. This allows queries that rewrite to match_no_docs to bypass the search thread pool entirely even if the can_match phase was skipped (pre_filter_shard_size > number of shards). Coordinating nodes don't have the ability to create empty responses so this change also ensures that at least one shard creates a full empty response while the other can return null ones. This is needed since creating true empty responses on shards require to create concrete aggregators which would be too costly to build on a network thread. We should move this functionality to aggregation builders in a follow up but that would be a much bigger change. This change is also important for #49601 since we want to add the ability to use the result of other shards to rewrite the request of subsequent ones. For instance if the first M shards have their top N computed, the top worst document in the global queue can be pass to subsequent shards that can then rewrite to match_no_docs if they can guarantee that they don't have any document better than the provided one.
We need to either exclude null responses from the scroll search response or always create a search context for every target shards, although that scroll query can be written to match_no_docs. Otherwise, we won't find search_context for subsequent scroll requests. This commit implements the latter option as it's less error-prone. Relates #51708
We need to either exclude null responses from the scroll search response or always create a search context for every target shards, although that scroll query can be written to match_no_docs. Otherwise, we won't find search_context for subsequent scroll requests. This commit implements the latter option as it's less error-prone. Relates #51708
This commit, built on top of #51708, allows to modify shard search requests based on informations collected on other shards. It is intended to speed up sorted queries on time-based indices. For queries that are only interested in the top documents. This change will rewrite the shard queries to match none if the bottom sort value computed in prior shards is better than all values in the shard. For queries that mix top documents and aggregations this change will reset the size of the top documents to 0 instead of rewriting to match none. This means that we don't need to keep a search context open for this shard since we know in advance that it doesn't contain any competitive hit.
This commit, built on top of #51708, allows to modify shard search requests based on informations collected on other shards. It is intended to speed up sorted queries on time-based indices. For queries that are only interested in the top documents. This change will rewrite the shard queries to match none if the bottom sort value computed in prior shards is better than all values in the shard. For queries that mix top documents and aggregations this change will reset the size of the top documents to 0 instead of rewriting to match none. This means that we don't need to keep a search context open for this shard since we know in advance that it doesn't contain any competitive hit.
Field level security was interacting in bad ways with the can-match phase on frozen tier shards (interaction between FieldSubsetReader and RewriteCachingDirectoryReader). This made can-match phase fail, which in the normal case would result in extra load on the frozen tier, and in the extreme case (in interaction with #51708) made searches fail. This is a bug that was indirectly introduced by #78988. Closes #82044
Field level security was interacting in bad ways with the can-match phase on frozen tier shards (interaction between FieldSubsetReader and RewriteCachingDirectoryReader). This made can-match phase fail, which in the normal case would result in extra load on the frozen tier, and in the extreme case (in interaction with elastic#51708) made searches fail. This is a bug that was indirectly introduced by elastic#78988. Closes elastic#82044
Field level security was interacting in bad ways with the can-match phase on frozen tier shards (interaction between FieldSubsetReader and RewriteCachingDirectoryReader). This made can-match phase fail, which in the normal case would result in extra load on the frozen tier, and in the extreme case (in interaction with elastic#51708) made searches fail. This is a bug that was indirectly introduced by elastic#78988. Closes elastic#82044
Field level security was interacting in bad ways with the can-match phase on frozen tier shards (interaction between FieldSubsetReader and RewriteCachingDirectoryReader). This made can-match phase fail, which in the normal case would result in extra load on the frozen tier, and in the extreme case (in interaction with #51708) made searches fail. This is a bug that was indirectly introduced by #78988. Closes #82044
Field level security was interacting in bad ways with the can-match phase on frozen tier shards (interaction between FieldSubsetReader and RewriteCachingDirectoryReader). This made can-match phase fail, which in the normal case would result in extra load on the frozen tier, and in the extreme case (in interaction with #51708) made searches fail. This is a bug that was indirectly introduced by #78988. Closes #82044 Co-authored-by: Elastic Machine <[email protected]>
This change ensures that the rewrite of the shard request is executed in the network thread or in the refresh listener when waiting for an active shard. This allows queries that rewrite to match_no_docs to bypass the search thread pool entirely even if the can_match phase was skipped (pre_filter_shard_size > number of shards).
Coordinating nodes don't have the ability to create empty responses so this change also ensures that at least one shard creates a full empty response while the other can return null ones. This is needed since creating true empty responses on shards require to create concrete aggregators which would be too costly to build on a network thread. We should move this functionality to aggregation builders in a follow up but that would be a much bigger change.
This change is also important for #49601 since we want to add the ability to use the result of other shards to rewrite the request of subsequent ones. For instance if the first M shards have their top N computed, the top worst document in the global queue can be pass to subsequent shards that can then rewrite to match_no_docs if they can guarantee that they don't have any document better than the provided one.