Apply index create block when all the nodes of the cluster has breached high disk watermark #4456

RS146BIJAY · 2022-09-08T07:37:40Z

Describe the bug

OpenSearch stops allocating shards to nodes that have breached the high disk watermark. If all the nodes in the cluster have breached the high disk watermark, no new shards can be allocated on the cluster. If we then try to create a new index, it will be red, since none of its primary or replica shards can be allocated.
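To make the failure mode concrete, here is a minimal sketch of the situation described above. It is a toy model, not OpenSearch code: the `Node` class, the helper names, and the 90% threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass

HIGH_WATERMARK_PCT = 90.0  # illustrative threshold, assumed for this sketch


@dataclass
class Node:
    name: str
    disk_used_pct: float


def eligible_nodes(nodes, high_watermark_pct=HIGH_WATERMARK_PCT):
    """Return the nodes that may still receive new shards."""
    return [n for n in nodes if n.disk_used_pct < high_watermark_pct]


def new_index_health(nodes):
    """A new index is red when no node can host any of its shards."""
    return "green_or_yellow" if eligible_nodes(nodes) else "red"


# Every node is over the threshold, so a new index has nowhere to go.
cluster = [Node("node-1", 93.5), Node("node-2", 91.0), Node("node-3", 96.2)]
print(new_index_health(cluster))
```

As soon as one node drops below the threshold, `eligible_nodes` is non-empty and the new index can get its shards allocated.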

Proposal

To prevent a red cluster, we propose that whenever all the nodes in the cluster have breached the high disk watermark, we apply an INDEX CREATE BLOCK on the cluster to prevent the creation of red indices. We will modify the DiskThresholdMonitor (which monitors disk watermark thresholds on a domain and takes appropriate action) to include an extra check on whether the low disk watermark is breached on all the nodes in the cluster. If it is, it will apply an index create block on the entire cluster.

Considerations:

Once the low disk watermark is no longer breached on any node of the cluster, the applied blocks should be removed automatically.
If an index create block is already applied on the cluster, will this change cause any conflict?
Do we need to handle the case where the low disk watermark stops being breached during rerouting?
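The proposal and the first two considerations can be sketched roughly as follows. This is not the DiskThresholdMonitor implementation: `ClusterState`, `monitor_tick`, and the `block_set_by_monitor` flag are hypothetical names. The threshold is a parameter because the thread mentions both the low and the high watermark, and the removal rule (lift the block once not every node is over the threshold) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    disk_used_pct: dict                 # node name -> percent of disk used
    index_create_block: bool = False    # is index creation currently blocked?
    block_set_by_monitor: bool = False  # did this monitor apply the block?


def monitor_tick(state, watermark_pct=90.0):
    """One monitoring pass: apply or lift the index create block."""
    usage = state.disk_used_pct.values()
    all_breached = len(usage) > 0 and all(u >= watermark_pct for u in usage)
    if all_breached and not state.index_create_block:
        # Every node is over the threshold: block new index creation.
        state.index_create_block = True
        state.block_set_by_monitor = True
    elif not all_breached and state.block_set_by_monitor:
        # Only lift a block the monitor applied itself (assumed policy).
        state.index_create_block = False
        state.block_set_by_monitor = False
    return state.index_create_block
```

Tracking who applied the block is one possible answer to the conflict question: a block that was already set by someone else is left untouched when disk usage recovers.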

Gaganjuneja · 2022-09-20T02:19:22Z

@RS146BIJAY Definitely a good guardrail to protect the system. Just out of curiosity, could we evaluate (dry run) the index creation upfront to see whether it would make the cluster red, and respond with the reason? The reason could be anything: low disk, low capacity, the number of shards exceeding the node-level limit, etc.

Bukhtawar · 2022-09-27T18:54:55Z

The problem with doing a dry run is that we would usually go with an optimistic locking approach, assuming that nothing else has changed between the dry run and the actual call. However, it is quite possible that the validations pass during the pre-checks but the actual call fails because multiple concurrent requests have changed the state of the system.

Then we need to understand how shards get assigned. When the leader assigns shards to nodes for them to start, it tries to pick a node based on some algorithm. It's quite possible that disks fill up between the assignment and the actual shard initialisation, so it's actually tricky to get things right. Let me know if you have thoughts here.
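The gap between the pre-check and the actual call can be shown with a small deterministic simulation. The `Cluster` class and its methods are hypothetical and only model free disk space; the concurrent request is simulated by an ordinary call placed between the two steps.

```python
class Cluster:
    """Toy model that tracks only free disk space, in GB."""

    def __init__(self, free_gb):
        self.free_gb = free_gb

    def dry_run_create(self, needed_gb):
        """Pre-check only: would creation succeed against the current state?"""
        return self.free_gb >= needed_gb

    def create_index(self, needed_gb):
        """Actual call: validates again, then consumes the space."""
        if self.free_gb < needed_gb:
            raise RuntimeError("insufficient disk: index would be red")
        self.free_gb -= needed_gb


cluster = Cluster(free_gb=10)
ok = cluster.dry_run_create(8)   # the pre-check passes
cluster.create_index(5)          # another request lands in between
try:
    cluster.create_index(8)      # the original request now fails
    outcome = "created"
except RuntimeError as err:
    outcome = str(err)
print(ok, outcome)
```

The dry run answered truthfully for the state it saw, but that state no longer existed when the real request ran.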

Gaganjuneja · 2022-10-03T09:57:16Z

Thanks @Bukhtawar for clarifying, it does make sense. Optimistic locking is definitely a very costly operation here. We could think of something like resource allocation. Each request has some system requirements in terms of CPU, disk, memory, etc. We could try reserving these resources for the request: if that succeeds, the actual request is processed; otherwise the reserved resources are released. This would also help in managing the overall cluster resources. I would like to hear your thoughts on this.
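A rough sketch of the reserve-then-process idea, limited to disk space on a single process. `DiskReservations` and `create_index` are hypothetical names, and a real cluster would need the reservation to be coordinated across nodes, which this does not attempt.

```python
import threading


class DiskReservations:
    """Tracks free disk space and hands out reservations atomically."""

    def __init__(self, capacity_gb):
        self._free = capacity_gb
        self._lock = threading.Lock()

    def try_reserve(self, gb):
        """Reserve space if available; checking and taking happen together."""
        with self._lock:
            if gb > self._free:
                return False
            self._free -= gb
            return True

    def release(self, gb):
        """Give back a reservation that was not used."""
        with self._lock:
            self._free += gb

    @property
    def free_gb(self):
        with self._lock:
            return self._free


def create_index(reservations, needed_gb, do_create):
    """Reserve first, then run the request; release if the request fails."""
    if not reservations.try_reserve(needed_gb):
        return "rejected"
    try:
        do_create()
        return "created"  # the new index keeps the reserved space
    except Exception:
        reservations.release(needed_gb)
        return "failed"
```

Because the check and the decrement happen under one lock, two requests cannot both pass the check for the same space, which is the gap a separate dry run leaves open.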

RS146BIJAY added the bug and untriaged labels Sep 8, 2022

Bukhtawar added the enhancement label and removed the untriaged and bug labels Sep 8, 2022

dreamer-89 added the discuss, Indexing & Search, and distributed framework labels Sep 13, 2022

RS146BIJAY mentioned this issue Sep 27, 2022

Block new index creation #4603

Closed

RS146BIJAY changed the title ~~Apply index create block when all the nodes of the cluster has breached low disk watermark~~ Apply index create block when all the nodes of the cluster has breached high disk watermark Oct 26, 2022

RS146BIJAY mentioned this issue Jan 5, 2023

[META] Insufficient guardrails leading to disk going full on nodes #5712

Open

8 tasks

RS146BIJAY mentioned this issue Jan 13, 2023

Adding index create block when all nodes have breached high disk watermark #5852

Merged

6 tasks

RS146BIJAY closed this as completed Feb 17, 2023

This was referenced Feb 21, 2023

[Fix] Fixing the condition to remove a node over high disk watermark in DiskThresholdMonitor #6404

Closed

[Fix] Fix last runtime millisec reset when applying index create block #6419

Merged
