Move states of search to coordinating node #52741
Conversation
Pinging @elastic/es-search (:Search/Search)
@elasticmachine test this please
I really like the approach here. The backward compatibility layer is simple and efficient since the main logic is unchanged (we rebuild the search context in all cases). I also find it nice that we send the rewritten shard request back and forth between phases. This effectively moves the state of the search to the coordinating node.
I left one comment that could be addressed in a follow-up. The rest looks good to me.
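To make the flow described above concrete, here is a minimal, self-contained sketch of the pattern (not Elasticsearch code; every class and method name below is made up for illustration): the data node hands the rewritten request back together with the query result, and the coordinating node echoes it in the fetch request, so nothing has to be cached on the shard between phases.

```
import java.util.HashMap;
import java.util.Map;

final class StatelessSearchSketch {

    // What the "data node" returns from the query phase: hit ids plus the rewritten request.
    record QueryResult(int[] topDocIds, String rewrittenRequest) {}

    // What the "coordinating node" sends for the fetch phase: the doc ids to fetch plus the
    // state it was handed back in the query phase.
    record FetchRequest(int[] docIds, String rewrittenRequest) {}

    // A data node that keeps no per-search state between phases.
    static final class DataNode {
        QueryResult executeQuery(String request) {
            String rewritten = request + " [rewritten]";   // pretend query rewriting happened here
            return new QueryResult(new int[] {3, 1, 7}, rewritten);
        }

        Map<Integer, String> executeFetch(FetchRequest fetch) {
            // The search context is rebuilt from fetch.rewrittenRequest(); nothing was cached here.
            Map<Integer, String> docs = new HashMap<>();
            for (int id : fetch.docIds()) {
                docs.put(id, "doc-" + id + " fetched via \"" + fetch.rewrittenRequest() + "\"");
            }
            return docs;
        }
    }

    public static void main(String[] args) {
        DataNode shard = new DataNode();

        // Query phase: the coordinator receives the rewritten request along with the hits.
        QueryResult query = shard.executeQuery("match: elasticsearch");

        // Fetch phase: the coordinator sends that state back instead of relying on the shard
        // having kept it -- this is the "state lives on the coordinating node" idea.
        Map<Integer, String> hits =
            shard.executeFetch(new FetchRequest(query.topDocIds(), query.rewrittenRequest()));

        hits.forEach((id, doc) -> System.out.println(id + " -> " + doc));
    }
}
```

The trade-off, as described in the review above, is a slightly larger request travelling back and forth in exchange for data nodes that no longer have to hold per-search state between phases.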
```
    plugins.add(PercolatorPlugin.class);
    return plugins;
}
```
Is this change needed for this PR? It seems unrelated.
The problem is that MockNioTransport does not register NamedWriteables from the bundled plugins. We need to manually register the percolator plugin in the test since we now de/serialize the PercolateQueryBuilder between search phases. I looked into how to fix this, but it should be done in a follow-up.
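For reference, the registration typically looks like the sketch below. It follows the usual ESIntegTestCase convention; the test class name here is hypothetical, and the exact test touched in this PR may differ.

```
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.elasticsearch.percolator.PercolatorPlugin;
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.test.ESIntegTestCase;

// Hypothetical test class name, shown only to give the override some context.
public class PercolatorSearchStateIT extends ESIntegTestCase {

    @Override
    protected Collection<Class<? extends Plugin>> nodePlugins() {
        // Register the percolator plugin so its NamedWriteables (e.g. PercolateQueryBuilder)
        // can be (de)serialized when the search state travels between phases.
        List<Class<? extends Plugin>> plugins = new ArrayList<>(super.nodePlugins());
        plugins.add(PercolatorPlugin.class);
        return plugins;
    }
}
```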
```
public void setRescoreDocIds(RescoreDocIds rescoreDocIds) {
    this.rescoreDocIds = rescoreDocIds;
}
```

```
@Override
public void writeTo(StreamOutput out) throws IOException {
    // TODO: this seems wrong, SearchPhaseResult should have a writeTo?
```
Could we apply the TODO here and clean up the (de)serialization? This would mean reconciling the logic between the different implementations and moving the serialization of the object accessible here into this function. Happy to do this in a follow-up though.
++. I will do this in a follow-up.
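For illustration, applying the TODO could look roughly like the sketch below. This is not the final implementation: the class name is hypothetical, and RescoreDocIds is assumed here to be Writeable.

```
import java.io.IOException;

import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.io.stream.Writeable;
import org.elasticsearch.search.RescoreDocIds;

// Hypothetical result class: it owns the serialization of its per-phase state in writeTo
// instead of relying on external (de)serialization helpers.
class PhaseResultSketch implements Writeable {

    private final RescoreDocIds rescoreDocIds;   // assumption: RescoreDocIds implements Writeable

    PhaseResultSketch(RescoreDocIds rescoreDocIds) {
        this.rescoreDocIds = rescoreDocIds;
    }

    @Override
    public void writeTo(StreamOutput out) throws IOException {
        rescoreDocIds.writeTo(out);
        // ... any other per-phase state that is currently serialized outside the result class
    }
}
```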
@jimczi Thank you for your review.
This commit introduces a new API that manages point-in-times in x-pack basic. An Elasticsearch PIT (point in time) is a lightweight view into the state of the data as it existed when it was initiated. A search request by default executes against the most recent point in time. In some cases, it is preferred to perform multiple search requests using the same point in time. For example, if refreshes happen between search_after requests, then the results of those requests might not be consistent, as changes happening between searches are only visible to the more recent point in time.

A point in time must be opened before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should keep a point in time around.

```
POST /my_index/_pit?keep_alive=1m
```

The response from the above request includes an `id`, which should be passed to the `id` of the `pit` parameter of search requests.

```
POST /_search
{
    "query": {
        "match" : { "title" : "elasticsearch" }
    },
    "pit": {
        "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
        "keep_alive": "1m"
    }
}
```

Point-in-times are automatically closed when the `keep_alive` has elapsed. However, keeping point-in-times has a cost; hence, point-in-times should be closed as soon as they are no longer used in search requests.

```
DELETE /_pit
{
    "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA="
}
```

#### Notable works in this change:

- Move the search state to the coordinating node: #52741
- Allow searches with a specific reader context: #53989
- Add the ability to acquire readers in IndexShard: #54966

Relates #46523
Relates #26472

Co-authored-by: Jim Ferenczi <jimczi@apache.org>
Previously, the search states were stored in ReaderContext on the data nodes. Since 7.10, we send them to the coordinating node in the QuerySearchResult of a ShardSearchRequest, and the coordinating node then sends them back in the ShardFetchSearchRequest. We must keep the search states on the data nodes unless they will be sent back in the fetch phase. We used the channel version to determine this guarantee. However, that is not correct for CCS requests in mixed clusters:

1. The coordinating node of the local cluster, on the old version, sends a ShardSearchRequest to a proxy node of the remote cluster on the new version. That proxy node delivers the request to the data node. In this case, the channel version between the data node and the proxy node is >= 7.10, but we won't receive the search states in the fetch phase, as they are stripped out on the channel between the old coordinating node and the new proxy.

```
[coordinating node v7.9] --> [proxy node v7.10] --> [data node on v7.10]
```

2. The coordinating node of the local cluster, on the new version, sends a ShardSearchRequest to a proxy node of the remote cluster on the new version. However, the coordinating node sends the ShardFetchSearchRequest to another proxy node of the remote cluster that is still on an old version. The search states are then stripped out and never reach the data node.

```
-> query phase: [coordinating node v7.10] --> [proxy node v7.10] --> [data node on v7.10]
-> fetch phase: [coordinating node v7.10] --> [proxy node v7.9]  --> [data node on v7.10]
```

This commit fixes the first issue by explicitly serializing the channel version in a ShardSearchRequest, and the second by continuing to store the search states in ReaderContext until all nodes are upgraded.

Relates #52741
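As a rough sketch of the version-gating idea described above: this is illustrative, not the actual ShardSearchRequest code, and the field name and exact cutoff version are assumptions.

```
// Write side: coordinating nodes on the new version record the channel version explicitly,
// so it survives being relayed through proxy nodes.
@Override
public void writeTo(StreamOutput out) throws IOException {
    super.writeTo(out);
    if (out.getVersion().onOrAfter(Version.V_7_10_0)) {
        Version.writeVersion(channelVersion, out);
    }
}

// Read side: if the sender is older, fall back to the version of the channel itself. The data
// node then knows the search states may never come back and keeps them in ReaderContext.
ShardSearchRequest(StreamInput in) throws IOException {
    super(in);
    channelVersion = in.getVersion().onOrAfter(Version.V_7_10_0)
        ? Version.min(Version.readVersion(in), in.getVersion())
        : in.getVersion();
    // ... remaining fields
}
```

The read-side fallback to the stream's own version is what lets the data node keep the states in ReaderContext whenever it cannot be sure the coordinating node will send them back.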
This commit moves the search states to the coordinating node instead of keeping them on the data nodes. Changes in this commit:
Relates #46523
I ran a few benchmarks for this change on a single node with http_logs, geonames, and geopoint. There was no regression. I will try to benchmark it with multiple nodes.