Optimize distributed numeric sort for time-based indices #49601

jimczi · 2019-11-26T15:06:25Z

Today, sorting by timestamp on a top-hits query that targets time-based indices doesn't take into account that the timestamp ranges of the indices don't overlap. The query phase computes the top N in each shard independently of the results returned by shards that contain earlier or later data. Since searches are now throttled by default and partial merges are performed efficiently, it should be possible to record the bottom hit of the top hits after a partial merge and use it as a hint for any subsequent shard search. Each shard could then compare the sort values of the bottom hit with the range of values that it contains, using the indexed BKD-tree, and shortcut the query if the global bottom values are greater/smaller than the values contained in the shard.
In a sense, this is the opposite of search_after.
There are multiple benefits if we apply this strategy:

Most/least recent top-hits queries on time-based indices would be considerably faster when they don't need to compute aggregations, especially now that shards are pre-sorted by the primary sort field.
Shards that contain only non-competitive documents would not need to keep their context open, since we'd detect early that the fetch phase is not needed. This would also work if aggregations are needed, since aggregations and top_hits can run independently.
We could automatically set max_concurrent_shard_requests and batched_reduce_size to a low value if we detect that shards have sorted values that don't overlap. This would reduce the impact on the cluster while still providing much faster sorted queries.
We could impose a default sort on timestamp for time-based indices to ensure that we don't run costly queries on this type of pattern by default.
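
The shard-side check described above can be illustrated with a minimal Python sketch. It is not Elasticsearch code: `ShardRange` and `can_skip_shard` are hypothetical names, and the min/max pair stands in for the bounds a shard would read from its BKD-tree. The sketch assumes a single numeric sort field and skips only on a strict comparison, so a shard whose boundary value ties with the bottom hit still runs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShardRange:
    """Min and max of the sort field in one shard (hypothetical stand-in for BKD-tree bounds)."""
    min_value: int
    max_value: int


def can_skip_shard(shard: ShardRange, bottom: Optional[int], descending: bool) -> bool:
    """Return True when no document in the shard can beat the global bottom hit."""
    if bottom is None:
        # The global top N is not full yet, so every shard is still competitive.
        return False
    if descending:
        # Most recent first: the shard needs a value at or above the bottom to compete.
        return shard.max_value < bottom
    # Least recent first: the shard needs a value at or below the bottom to compete.
    return shard.min_value > bottom
```

With a descending sort and a global bottom of 100, a shard covering 0 to 99 can be skipped, while a shard covering 0 to 100 cannot.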


elasticmachine · 2019-11-26T15:06:31Z

Pinging @elastic/es-distributed (:Distributed/Distributed)

elasticmachine · 2019-11-26T15:06:32Z

Pinging @elastic/es-search (:Search/Search)

This change ensures that the rewrite of the shard request is executed in the network thread or in the refresh listener when waiting for an active shard. This allows queries that rewrite to match_no_docs to bypass the search thread pool entirely, even if the can_match phase was skipped (pre_filter_shard_size > number of shards). Coordinating nodes don't have the ability to create empty responses, so this change also ensures that at least one shard creates a full empty response while the others can return null ones. This is needed because creating true empty responses on shards requires creating concrete aggregators, which would be too costly to build on a network thread. We should move this functionality to aggregation builders in a follow-up, but that would be a much bigger change. This change is also important for elastic#49601, since we want to add the ability to use the results of other shards to rewrite the requests of subsequent ones. For instance, if the first M shards have their top N computed, the worst document in the global queue can be passed to subsequent shards, which can then rewrite to match_no_docs if they can guarantee that they don't have any document better than the provided one.
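
The last example in that message (the first M shards fill the top N, later shards receive the worst document and rewrite to match_no_docs) can be modeled with a short Python sketch. This is a toy simulation under stated assumptions, not the coordinating-node implementation: `search_descending` is a hypothetical function, each shard is a plain list of integer timestamps, shards are visited one at a time, and the sort is descending.

```python
import heapq
from typing import List, Optional, Tuple


def search_descending(shards: List[List[int]], n: int) -> Tuple[List[int], int]:
    """Return the global top n values (largest first) and how many shards ran their query."""
    top: List[int] = []  # min-heap holding the current global top n; top[0] is the bottom hit
    executed = 0
    for values in shards:
        # The bottom hit is only a usable hint once the global queue is full.
        bottom: Optional[int] = top[0] if len(top) == n else None
        if bottom is not None and (not values or max(values) < bottom):
            # Models the rewrite to match_no_docs: nothing here can enter the top n.
            continue
        executed += 1
        for v in values:
            if len(top) < n:
                heapq.heappush(top, v)
            elif v > top[0]:
                heapq.heapreplace(top, v)
    return sorted(top, reverse=True), executed
```

In this model, the number of skipped shards depends on the visiting order: if the shard with the most recent data is searched first, every older shard with non-overlapping values is skipped, whereas visiting the oldest shard first skips nothing.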


jimczi added the >enhancement, :Search/Search, and :Distributed Indexing/Distributed labels on Nov 26, 2019

This was referenced Jan 30, 2020

Always rewrite search shard request outside of the search thread pool #51708

Merged

Shortcut query phase using the results of other shards #51852

Merged

jimczi closed this as completed in #51852 Mar 17, 2020

codebrain mentioned this issue Apr 1, 2020

7.7.0 meta ticket (Part 2) elastic/elasticsearch-net#4533

Closed

dnhatn mentioned this issue May 25, 2020

7.7.0 bug with _search and idle shards #57006

Closed
