Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip final reduction if SearchRequest holds a cluster alias #37000

Merged

Conversation

javanna
Copy link
Member

@javanna javanna commented Dec 27, 2018

With #36997 we added the ability to provide a cluster alias with a SearchRequest.

The next step is to disable the final reduction whenever a cluster alias is provided. A cluster alias will be provided when executing a cross-cluster search request with alternate execution mode, where each cluster does its own reduction locally. In order for the CCS node to be able to later perform an additional reduction of the results, we need to make sure that there is all the needed info available. This means that terms aggregations can be reduced but not pruned, and pipeline aggs should not be executed. The final reduction will happen in the CCS coordinating node.

Relates to #36997 & #32125

@javanna javanna added >enhancement :Search/Search Search-related issues that do not fall into other categories v7.0.0 v6.7.0 labels Dec 27, 2018
@javanna javanna requested review from s1monw and jimczi December 27, 2018 11:05
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@javanna javanna force-pushed the enhancement/perform_final_reduce_flag branch 3 times, most recently from 7b2660b to 420ab39 Compare December 28, 2018 11:46
With elastic#36997 we added the ability to provide a cluster alias with a
SearchRequest.

The next step is to disable the final reduction whenever a cluster alias
is provided with the SearchRequest. A cluster alias will be provided
when executing a cross-cluster search request with alternate execution
mode, where each cluster does its own reduction locally. In order for
the CCS node to be able to later perform an additional reduction of the
results, we need to make sure that all the needed info stays available.
This means that terms aggregations can be reduced but not pruned, and
pipeline aggs should not be executed. The final reduction will happen
later in the CCS coordinating node.

Relates to elastic#36997 & elastic#32125
@javanna javanna force-pushed the enhancement/perform_final_reduce_flag branch from 420ab39 to f6a868e Compare December 28, 2018 11:47
@javanna
Copy link
Member Author

javanna commented Dec 28, 2018

@jimczi this is now ready for review ;)

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@javanna javanna merged commit cb6bac3 into elastic:master Dec 28, 2018
javanna added a commit to javanna/elasticsearch that referenced this pull request Dec 28, 2018
With elastic#36997 and elastic#37000 we added the ability to provide a cluster alias with a SearchRequest and to perform non-final reduction when a cluster alias is set to the incoming search request.

With CCS soon supporting local reduction on each remote cluster, we also need to prevent shard_size of terms aggs to be adjusted on the data nodes. We may end up querying a single shard per cluster, but later we will have to reduce results coming from multiple cluster hence we still have to collect more buckets than needed to guarantee some level of precision.

This is done by adding the ability to override the shard_size on the coordinating node as part of the rewrite phase. This will be done only
when searching against multiple clusters, meaning when using CCS with alternate execution mode. Setting the shard_size explicitly on the coord node means that it will not be overridden later on the data nodes. The shard_size will still be set on the data nodes like before in all other cases (local search, or CCS with ordinary execution mode).

 Relates to elastic#32125
javanna added a commit that referenced this pull request Jan 7, 2019
With #36997 we added the ability to provide a cluster alias with a
SearchRequest.

The next step is to disable the final reduction whenever a cluster alias
is provided with the SearchRequest. A cluster alias will be provided
when executing a cross-cluster search request with alternate execution
mode, where each cluster does its own reduction locally. In order for
the CCS node to be able to later perform an additional reduction of the
results, we need to make sure that all the needed info stays available.
This means that terms aggregations can be reduced but not pruned, and
pipeline aggs should not be executed. The final reduction will happen
later in the CCS coordinating node.

Relates to #36997 & #32125
javanna added a commit to javanna/elasticsearch that referenced this pull request Jan 31, 2019
WIth elastic#37000 we made sure that fnial reduction is automatically disabled
whenever a localClusterAlias is provided with a SearchRequest.

While working on elastic#37838, we found a scenario where we do need to set a
localClusterAlias yet we would like to perform a final reduction in the
remote cluster: when searching on a single remote cluster.

This commit adds support for a separate finalReduce flag to
SearchRequest and makes use of it in TransportSearchAction in case we
are searching against a single remote cluster.

This also makes sure that num_reduce_phases is correct when searching
against a single remote cluster: it makes little sense to return
`num_reduce_phases` set to `2`, which looks especially weird in case
the search was performed against a single remote shard. We should
perform one reduction phase only in this case and `num_reduce_phases`
should reflect that.
javanna added a commit that referenced this pull request Feb 1, 2019
With #37000 we made sure that fnial reduction is automatically disabled
whenever a localClusterAlias is provided with a SearchRequest.

While working on #37838, we found a scenario where we do need to set a
localClusterAlias yet we would like to perform a final reduction in the
remote cluster: when searching on a single remote cluster.

Relates to #32125

This commit adds support for a separate finalReduce flag to
SearchRequest and makes use of it in TransportSearchAction in case we
are searching against a single remote cluster.

This also makes sure that num_reduce_phases is correct when searching
against a single remote cluster: it makes little sense to return
`num_reduce_phases` set to `2`, which looks especially weird in case
the search was performed against a single remote shard. We should
perform one reduction phase only in this case and `num_reduce_phases`
should reflect that.

* line length
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories v6.7.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants