Fix search states of CCS requests in mixed cluster #71265

Merged
dnhatn merged 6 commits into elastic:master from fix-ccs-search-state
Apr 4, 2021

@dnhatn dnhatn commented Apr 2, 2021

Forward-port of #70948 to 8.0

Previously, search states were stored in ReaderContext on data nodes. Since 7.10, data nodes send them to the coordinating node in the QuerySearchResult returned for a ShardSearchRequest, and the coordinating node then sends them back in the ShardFetchSearchRequest. A data node must keep the search states in its ReaderContext unless they are guaranteed to be sent back in the fetch phase. We used the channel version to decide whether that guarantee holds. However, this check is not correct for CCS requests in mixed clusters, in two cases:

  1. The coordinating node of the local cluster, on the old version, sends a ShardSearchRequest to a proxy node of the remote cluster on the new version. That proxy node delivers the request to the data node. In this case, the channel version between the data node and the proxy node is >= 7.10, but the data node won't receive the search states in the fetch phase because they are stripped out on the channel between the old coordinating node and the new proxy node.
[coordinating node v7.9] --> [proxy node v7.10] --> [data node on v7.10]
  2. The coordinating node of the local cluster, on the new version, sends a ShardSearchRequest to a proxy node of the remote cluster on the new version. However, the coordinating node then sends the ShardFetchSearchRequest to another proxy node of the remote cluster that is still on the old version. The search states are stripped out on that hop and never reach the data node.
- query phase: [coordinating node v7.10] --> [proxy node v7.10] --> [data node on v7.10]
- fetch phase: [coordinating node v7.10] --> [proxy node v7.9] --> [data node on v7.10]
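
The two scenarios above can be modelled with a small, self-contained sketch. This is not Elasticsearch code: the class, the method names, and the integer version encoding are all hypothetical. It only illustrates why a check on the data node's direct channel version can disagree with what actually happens to the search states across every hop.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only; none of these names exist in Elasticsearch.
final class CcsChannelVersionSketch {
    // Hypothetical numeric encoding of node versions.
    static final int V_7_9 = 70900;
    static final int V_7_10 = 71000;

    /**
     * What a channel-version check can see: only the version negotiated with the
     * node that handed the query request to the data node (the last hop).
     */
    static boolean channelCheckAssumesStatesReturn(int dataNodeVersion, List<Integer> queryPath) {
        int directPeer = queryPath.get(queryPath.size() - 1);
        return Math.min(dataNodeVersion, directPeer) >= V_7_10;
    }

    /**
     * What actually happens in this model: the states come back in the fetch phase
     * only if every node on the query path and on the fetch path can carry them.
     */
    static boolean statesActuallyReturn(List<Integer> queryPath, List<Integer> fetchPath) {
        return queryPath.stream().allMatch(v -> v >= V_7_10)
            && fetchPath.stream().allMatch(v -> v >= V_7_10);
    }

    public static void main(String[] args) {
        // Paths list the coordinating node first and the proxy node last.
        List<Integer> oldCoordinator = Arrays.asList(V_7_9, V_7_10);   // scenario 1, both phases
        List<Integer> upgradedPath = Arrays.asList(V_7_10, V_7_10);    // scenario 2, query phase
        List<Integer> oldFetchProxy = Arrays.asList(V_7_10, V_7_9);    // scenario 2, fetch phase

        System.out.println("scenario 1: check=" + channelCheckAssumesStatesReturn(V_7_10, oldCoordinator)
            + " actual=" + statesActuallyReturn(oldCoordinator, oldCoordinator));
        System.out.println("scenario 2: check=" + channelCheckAssumesStatesReturn(V_7_10, upgradedPath)
            + " actual=" + statesActuallyReturn(upgradedPath, oldFetchProxy));
    }
}
```

In both scenarios the channel check is satisfied while the states are lost on another hop, which is the mismatch described above.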

This commit fixes the first issue by adding an explicit flag, keepSearchStatesInContext, to ShardSearchRequest, and the second by continuing to store the search states in ReaderContext unless all nodes are upgraded.
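
As a rough illustration of how the two parts of the fix could combine, here is a minimal sketch. It is an assumption-laden model, not the code in this PR: only the flag name keepSearchStatesInContext comes from the description, while the class, the helper methods, the version encoding, and the rule for deriving the flag from an old sender are hypothetical.

```java
// Illustrative model only; apart from the flag name, nothing here is taken from Elasticsearch.
final class KeepSearchStatesSketch {
    // Hypothetical numeric encoding of node versions.
    static final int V_7_9 = 70900;
    static final int V_7_10 = 71000;

    /**
     * Assumption: a request that originated on a node older than 7.10 cannot carry
     * the states back, so the flag is treated as set for such requests.
     */
    static boolean keepSearchStatesInContextFor(int originVersion) {
        return originVersion < V_7_10;
    }

    /**
     * Data-node decision in this model: keep the states in the reader context if
     * the request asks for it, or if not every node is upgraded yet.
     */
    static boolean shouldKeepStatesInContext(boolean keepSearchStatesInContext, int minNodeVersion) {
        return keepSearchStatesInContext || minNodeVersion < V_7_10;
    }

    public static void main(String[] args) {
        // Scenario 1: old coordinating node, so the explicit flag forces the states to be kept.
        boolean flag = keepSearchStatesInContextFor(V_7_9);
        System.out.println("scenario 1 keeps states: " + shouldKeepStatesInContext(flag, V_7_9));

        // Scenario 2: new coordinating node, but an old proxy node is still present.
        flag = keepSearchStatesInContextFor(V_7_10);
        System.out.println("scenario 2 keeps states: " + shouldKeepStatesInContext(flag, V_7_9));

        // Fully upgraded: the states can travel with the requests instead.
        System.out.println("upgraded keeps states: " + shouldKeepStatesInContext(flag, V_7_10));
    }
}
```

The design point is that the decision no longer depends on the version of a single channel: the flag travels with the request, and the fallback depends on the oldest node that might handle either phase.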

Relates #52741

dnhatn added a commit that referenced this pull request Apr 3, 2021
@dnhatn dnhatn marked this pull request as ready for review April 3, 2021 19:54
@dnhatn dnhatn added the backport label Apr 3, 2021

dnhatn commented Apr 4, 2021

@elasticmachine test this please

@dnhatn dnhatn merged commit 541cb02 into elastic:master Apr 4, 2021
@dnhatn dnhatn deleted the fix-ccs-search-state branch April 4, 2021 13:47