[Feature Request] Enable data/ coordinator node in the cluster to serve cluster state and its entities (like index, alias etc.) read APIs #12272
Comments
In the first iteration, if the node is not in sync with the active cluster manager's term and version, let it fall back to the active leader to serve the request. Subsequently, we should evaluate how to refresh the state on the node.
Besides the latency benefits, this significantly reduces the read API overhead on the leader cluster manager, thereby reducing its memory usage, CPU usage and transport overhead.
Wondering if this can be done as a part of the same call?
Having it as a separate API that returns only the term and version lets us reuse it in multiple places, so that follower nodes can decide whether they need to fall back to the cluster-manager. How to refresh the local state when it is not in sync still needs to be evaluated. The pull-based refresh should be able to work in the background while the node consumes the ClusterUpdates pushed from the cluster-manager. This will be a follow-up.
Sounds good
Is your feature request related to a problem? Please describe
cluster-state typically grows to hundreds of MB in large clusters with thousands of shards. API requests such as cat/shards, cat/indices and node/_stats require a copy of the cluster-state and fetch it from the cluster-manager. However, the cluster-state is also cached on the node serving the API request and can be consumed locally if the node is in sync with the cluster-manager. This would avoid the serialization and transport overhead that the cluster-manager incurs when serving large cluster-state responses.
Describe the solution you'd like
We propose introducing a new lightweight transport request to the cluster-manager that returns the cluster name, UUID, term and version of the cluster-state. The node serving an API request would use the new transport endpoint to verify, in the context of the read request, whether the cluster-state cached on the node is in sync with the cluster-manager.
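The check described above could look roughly like the following sketch. It is not OpenSearch code: TermVersion, ClusterState, read_cluster_state and the two fetch callbacks are hypothetical names used only to illustrate the decision between serving from the local cache and falling back to the cluster-manager.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class TermVersion:
    """Hypothetical lightweight response: identifies one cluster-state revision."""
    cluster_name: str
    cluster_uuid: str
    term: int
    version: int


@dataclass
class ClusterState:
    """Hypothetical stand-in for the (potentially very large) cluster-state."""
    term_version: TermVersion
    payload: dict


def read_cluster_state(
    local_state: Optional[ClusterState],
    fetch_term_version: Callable[[], TermVersion],
    fetch_full_state: Callable[[], ClusterState],
) -> Tuple[ClusterState, str]:
    """Serve from the local cache when it matches the cluster-manager.

    fetch_term_version models the proposed lightweight transport request;
    fetch_full_state models the existing, expensive full-state fetch.
    Returns the state together with a label saying where it came from.
    """
    manager_tv = fetch_term_version()
    # All four fields must match for the cached copy to count as in sync.
    if local_state is not None and local_state.term_version == manager_tv:
        return local_state, "local"
    # Out of sync (or nothing cached): fall back to the cluster-manager.
    return fetch_full_state(), "cluster-manager"
```

In this sketch the expensive callback runs only on a mismatch, so in the in-sync case the cluster-manager serializes four small fields rather than the whole state.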
Related component
Cluster Manager
Describe alternatives you've considered
No response
Additional context
API response time of /_cluster/state from local and remote:
curl http://localhost:9200/_cluster/state?local=true -> 2652 ms
curl http://localhost:9200/_cluster/state?local=false -> 3858 ms
Size of cluster-state -> 153 MB
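A comparison like the one above could be reproduced with a small timing helper along these lines. This is a sketch only: time_call_ms and fetch_cluster_state are hypothetical helper names, and the base URL is assumed to be the same localhost:9200 endpoint used in the curl commands.

```python
import time
from typing import Callable, Tuple
from urllib.request import urlopen


def time_call_ms(fetch: Callable[[], bytes]) -> Tuple[float, int]:
    """Run fetch once and return (elapsed milliseconds, response size in bytes)."""
    start = time.perf_counter()
    body = fetch()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, len(body)


def fetch_cluster_state(local: bool, base_url: str = "http://localhost:9200") -> bytes:
    """GET /_cluster/state with the local flag set, returning the raw body."""
    flag = "true" if local else "false"
    with urlopen(f"{base_url}/_cluster/state?local={flag}") as resp:
        return resp.read()
```

For example, time_call_ms(lambda: fetch_cluster_state(True)) measures the local read and time_call_ms(lambda: fetch_cluster_state(False)) the read served via the cluster-manager. A single run is noisy, so repeating each call several times gives a fairer comparison.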