[BUG] DiskThresholdDecider should consider shard size of all relocating shards on the target node while making relocation decisions #5386
Comments
@RS146BIJAY Thanks for reporting the issue. Can you please provide details on how to reproduce this, or at least a test for this scenario?
As I understand the issue, we are not failing fast before starting the allocation when disk space would be breached. If no one has started on the issue, I can pick this one up.
@jayeshathila We are already working on this fix.
@RS146BIJAY The DiskThresholdDecider iterates over all relocating shards that are in the "INITIALIZING" state, so it does handle multiple recoveries in a single reroute operation. Line 133 in 4b4d84e
Can you expand when you say concurrent relocation? Any change in the shard allocation/ movement is executed by the active leader which processes them sequentially so there will never be concurrent relocations triggered from multiple threads. Now there could be issues with
@shwetathareja Yes, we cross-checked this and confirmed that parallel relocation may not be possible. We have put this issue on hold for now and will revisit it once we have more data points on what may have caused the problem on the affected domain.
Describe the bug
OpenSearch relocates shards away from a node that has breached the high disk watermark. When selecting the target node for a relocating shard, DiskThresholdDecider checks that allocating this shard will not bring the target node above the high watermark. Since relocations can happen concurrently, DiskThresholdDecider may select the same node as the target for multiple relocations. In that case, each individual relocation check can pass, yet relocating all the shards (across concurrent relocations) can drive free disk space on the target node to 0.
Expected behavior
While selecting the target node, DiskThresholdDecider should consider the sizes of all shards being relocated to this node, not just the shard currently being relocated.
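
To make the difference concrete, here is a minimal Java sketch of the two checks. It is illustrative only: `DiskCheckSketch`, `fitsAlone`, and `fitsWithIncoming` are hypothetical names, not OpenSearch APIs, and the high watermark is modeled as a minimum number of free bytes that must remain on the node, which is an assumption made for simplicity.

```java
import java.util.List;

// Hypothetical sketch; these names are illustrative and are not OpenSearch APIs.
final class DiskCheckSketch {

    // Per-shard check: looks only at the shard currently being placed.
    // Assumption: the watermark is expressed as the free bytes that must remain.
    static boolean fitsAlone(long freeBytes, long minFreeBytes, long shardBytes) {
        return freeBytes - shardBytes >= minFreeBytes;
    }

    // Cumulative check: also reserves space for shards already relocating to the node.
    static boolean fitsWithIncoming(long freeBytes, long minFreeBytes,
                                    List<Long> incomingShardBytes, long shardBytes) {
        long reserved = 0L;
        for (long size : incomingShardBytes) {
            reserved += size;
        }
        return freeBytes - reserved - shardBytes >= minFreeBytes;
    }
}
```

With 100 GB free, a 20 GB minimum, and three 40 GB shards, `fitsAlone` accepts each shard when evaluated in isolation, while `fitsWithIncoming` rejects the third once the first two are counted as incoming.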