[Bug] Ensure OpenSearch Dashboards stays available in large clusters #330
Labels
enhancement
New feature or request
migration
Any plans, changes, or enhancements needed for migration
v1.0.0
Milestone
boktorbb
added
enhancement
New feature or request
migration
Any plans, changes, or enhancements needed for migration
labels
May 5, 2021
mihirsoni
changed the title
[Patch] Ensure OpenSearch Dashboards stays available in large clusters
[Bug] Ensure OpenSearch Dashboards stays available in large clusters
May 6, 2021
boktorbb
pushed a commit
to boktorbb/OpenSearch-Dashboards
that referenced
this issue
Jun 8, 2021
Ensures that Dashboards checks only the local OpenSearch node when cluster_id node attribute is present and all nodes have some cluster_id value; Otherwise, it uses default behavior Closes opensearch-project#330 Signed-off-by: Bishoy Boktor <[email protected]>
5 tasks
boktorbb
added a commit
that referenced
this issue
Jun 11, 2021
* Implement optimized healthcheck for Dashboards
  Ensures that Dashboards checks only the local OpenSearch node when cluster_id node attribute is present and all nodes have some cluster_id value; Otherwise, it uses default behavior
  Closes #330
  Signed-off-by: Bishoy Boktor <[email protected]>
* Update optimizedHealthcheck setting to be configurable
  opensearch.optimizedHealthcheck is now {string|undefined} setting that corresponds to the user's node attribute created in OpenSearch. Healthcheck will now check the node attribute path ending in the value of the setting.
  Signed-off-by: Bishoy Boktor <[email protected]>
* Simplify getNodeId logic and update documentation
  Simplifies getNodeId code. Also, updates healthcheck param to healthcheckAttributeName.
  Signed-off-by: Bishoy Boktor <[email protected]>
* Update opensearch_dashboards.yml with setting example
  Signed-off-by: Bishoy Boktor <[email protected]>
* Update healthcheck setting name to optimizedHealthcheckId
  Signed-off-by: Bishoy Boktor <[email protected]>
kavilla
pushed a commit
that referenced
this issue
Jun 21, 2021
* Implement optimized healthcheck for Dashboards
  Ensures that Dashboards checks only the local OpenSearch node when cluster_id node attribute is present and all nodes have some cluster_id value; Otherwise, it uses default behavior
  Closes #330
  Signed-off-by: Bishoy Boktor <[email protected]>
* Update optimizedHealthcheck setting to be configurable
  opensearch.optimizedHealthcheck is now {string|undefined} setting that corresponds to the user's node attribute created in OpenSearch. Healthcheck will now check the node attribute path ending in the value of the setting.
  Signed-off-by: Bishoy Boktor <[email protected]>
* Simplify getNodeId logic and update documentation
  Simplifies getNodeId code. Also, updates healthcheck param to healthcheckAttributeName.
  Signed-off-by: Bishoy Boktor <[email protected]>
* Update opensearch_dashboards.yml with setting example
  Signed-off-by: Bishoy Boktor <[email protected]>
* Update healthcheck setting name to optimizedHealthcheckId
  Signed-off-by: Bishoy Boktor <[email protected]>
7 tasks
Problem Statement
In sufficiently large OpenSearch clusters, the health check that Dashboards sends out can fail, and when it does, Dashboards becomes unavailable.
Root Cause of the issue
The default Dashboards behavior is to fan out healthcheck requests across the entire cluster. In a large cluster, if any node is busy with heavy processing or ingestion and times out, the healthcheck fails and Dashboards becomes unavailable.
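A minimal sketch of why the fan-out is fragile, assuming the all-or-nothing behavior described above. The `ProbeResult` type and both function names are hypothetical and are not taken from the Dashboards code base.

```typescript
// Hypothetical record of one node's answer to a healthcheck probe.
type ProbeResult = { nodeId: string; timedOut: boolean };

// Fan-out model: the check passes only if every node answered in time,
// so a single slow node is enough to fail the whole healthcheck.
function fanOutHealthy(results: ProbeResult[]): boolean {
  return results.every((r) => !r.timedOut);
}

// Local-only model: only the node Dashboards talks to is considered.
function localHealthy(results: ProbeResult[], localNodeId: string): boolean {
  const local = results.find((r) => r.nodeId === localNodeId);
  return local !== undefined && !local.timedOut;
}
```

With one busy node among many, `fanOutHealthy` returns false while `localHealthy` still returns true for a responsive local node, which is the gap the proposal below aims to close.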
Proposed Dashboards solution
The proposal is to define a node attribute in OpenSearch and use it to drive an optimized healthcheck in Dashboards.
Dashboards Configuration: Create a setting called optimized_healthcheck in opensearch_dashboards.yml that looks for the OpenSearch node attribute cluster_id.
By default, optimized_healthcheck is null, which lets Dashboards continue fanning out healthcheck requests across all nodes. If the value is cluster_id, Dashboards switches to the logic outlined in the algorithm section below.
OpenSearch Configuration: cluster_id → a new node attribute, added by customers, that differentiates cluster instances.
cluster_id can take the form of an integer that is assigned to all OpenSearch nodes during cluster creation. It increments when new instances of the cluster are spun up.
Using this cluster_id we can follow a general algorithm:
Step 1: Aggregate the cluster_id values of all OpenSearch nodes.
Step 2: If all the nodes have the same cluster_id, retrieve nodes.info from the _local node only.
- Using _cluster/state/nodes to retrieve from each master node
Step 3: If the nodes have different cluster_id values, fan out the request to all the nodes.
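The three steps can be sketched as a pure decision function. This is a minimal sketch, not the merged implementation. The `NodesById` shape, the `chooseHealthcheckTarget` name, and the `"_all"` label for the fan-out case are assumptions made for illustration. Falling back to the fan-out when a node lacks the attribute is also an assumption, chosen because the issue does not say what should happen in that case.

```typescript
// Hypothetical shape: node id -> attributes, e.g. extracted from a
// cluster state response. Not the real client response type.
type NodeAttributes = Record<string, string | undefined>;
type NodesById = Record<string, { attributes?: NodeAttributes }>;

// "_local": query only the local node. "_all": fan out (default behavior).
type HealthcheckTarget = "_local" | "_all";

function chooseHealthcheckTarget(
  nodes: NodesById,
  attributeName: string | null
): HealthcheckTarget {
  // Setting left at its default (null): keep fanning out.
  if (attributeName === null) return "_all";

  // Step 1: aggregate the attribute value of every node.
  const values = Object.values(nodes).map(
    (node) => node.attributes?.[attributeName]
  );

  // Assumption: no nodes, or any node missing the attribute, means the
  // optimization cannot be trusted, so fall back to the default.
  if (values.length === 0 || values.some((v) => v === undefined)) {
    return "_all";
  }

  // Step 2: one distinct value -> local node only.
  // Step 3: several distinct values -> fan out to all nodes.
  return new Set(values).size === 1 ? "_local" : "_all";
}
```

Keeping the decision separate from the actual request makes the rule easy to test without a running cluster. The caller would then issue the node info request against the chosen target.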