Avoid overshooting watermarks during relocation #46128
Merged
Today the DiskThresholdDecider attempts to account for already-relocating
shards when deciding how to allocate or relocate a shard. Its goal is to stop
relocating shards onto a node before that node exceeds the low watermark, and
to stop relocating shards away from a node as soon as the node drops below the
high watermark.
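As a rough illustration of the accounting described above, the sketch below projects a data path's free space once in-flight relocations finish and compares the result with a watermark. It is not the actual DiskThresholdDecider code: the class name, method names, and parameters are all hypothetical, watermarks are expressed as used-space fractions, and treating incoming and outgoing relocations symmetrically is a simplification.

```java
// Hypothetical sketch; names and signatures are illustrative, not Elasticsearch APIs.
final class WatermarkSketch {

    /** Free bytes on a data path once in-flight relocations have completed. */
    static long projectedFreeBytes(long freeBytes, long incomingBytes, long outgoingBytes) {
        return freeBytes - incomingBytes + outgoingBytes;
    }

    /** True if adding one more shard keeps projected usage at or below the low watermark. */
    static boolean canAllocate(long totalBytes, long freeBytes, long incomingBytes,
                               long outgoingBytes, long shardBytes, double lowWatermark) {
        long freeAfter = projectedFreeBytes(freeBytes, incomingBytes, outgoingBytes) - shardBytes;
        double usedFraction = 1.0 - (double) freeAfter / totalBytes;
        return usedFraction <= lowWatermark;
    }

    /** True if projected usage is already below the high watermark, so no more shards need to leave. */
    static boolean canRemain(long totalBytes, long freeBytes, long incomingBytes,
                             long outgoingBytes, double highWatermark) {
        long freeAfter = projectedFreeBytes(freeBytes, incomingBytes, outgoingBytes);
        double usedFraction = 1.0 - (double) freeAfter / totalBytes;
        return usedFraction < highWatermark;
    }
}
```

If the incoming or outgoing byte counts are wrongly reported as zero, both checks keep returning the same answer after each new relocation starts, which is how a node can end up well past a watermark.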
The decider handles multiple data paths by only accounting for relocating
shards that affect the appropriate data path. However, this mechanism does not
correctly account for new relocating shards, which are unwittingly ignored.
This means that we may evict far too many shards from a node above the high
watermark, and may relocate far too many shards onto a node causing it to blow
right past the low watermark and potentially other watermarks too.
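To make the per-data-path accounting concrete, here is a minimal sketch of how relocating shards with an unknown data path can drop out of the total. The Relocation type and both methods are hypothetical, and the conservative variant is only one possible remedy, not necessarily the approach taken in this PR.

```java
import java.util.List;

// Hypothetical sketch; not the actual Elasticsearch implementation.
final class PerPathAccountingSketch {

    /** A relocating shard; dataPath is null when the path is not yet known. */
    static final class Relocation {
        final String dataPath;
        final long bytes;

        Relocation(String dataPath, long bytes) {
            this.dataPath = dataPath;
            this.bytes = bytes;
        }
    }

    /** Sums relocating bytes on one data path; shards with an unknown path are silently skipped. */
    static long bytesOnPath(List<Relocation> relocations, String dataPath) {
        long sum = 0;
        for (Relocation r : relocations) {
            if (dataPath.equals(r.dataPath)) {
                sum += r.bytes;
            }
        }
        return sum;
    }

    /** Conservative variant (an assumption): shards with an unknown path count against every path. */
    static long bytesOnPathCountingUnknown(List<Relocation> relocations, String dataPath) {
        long sum = 0;
        for (Relocation r : relocations) {
            if (r.dataPath == null || dataPath.equals(r.dataPath)) {
                sum += r.bytes;
            }
        }
        return sum;
    }
}
```

With one 10-byte relocation on a known path and one 7-byte relocation on an unknown path, the first method reports 10 bytes while the second reports 17, so the first undercounts the space about to be consumed.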
There are in fact two distinct issues that this PR fixes. New incoming shards
have an unknown data path until the ClusterInfoService refreshes its
statistics. New outgoing shards have a known data path, but we fail to account
for the change of the corresponding ShardRouting from STARTED to RELOCATING,
meaning that we fail to find the correct data path and treat the path as
unknown here too.
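The second issue can be pictured as a map lookup whose key includes the routing state. The sketch below assumes a simplified Routing class whose equality covers the state, and a map from routing to data path recorded while the shard was STARTED. All names are hypothetical stand-ins, and the workaround shown is one possible way to recover the path, not necessarily the change made in this PR.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch; Routing is a simplified stand-in, not the real ShardRouting.
final class DataPathLookupSketch {

    enum State { STARTED, RELOCATING }

    /** Routing entry whose equality includes its state. */
    static final class Routing {
        final String shardId;
        final String nodeId;
        final State state;

        Routing(String shardId, String nodeId, State state) {
            this.shardId = shardId;
            this.nodeId = nodeId;
            this.state = state;
        }

        Routing withState(State newState) {
            return new Routing(shardId, nodeId, newState);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Routing)) {
                return false;
            }
            Routing other = (Routing) o;
            return shardId.equals(other.shardId) && nodeId.equals(other.nodeId) && state == other.state;
        }

        @Override
        public int hashCode() {
            return Objects.hash(shardId, nodeId, state);
        }
    }

    /** Direct lookup: returns null once the routing's state no longer matches the recorded key. */
    static String dataPath(Map<Routing, String> paths, Routing routing) {
        return paths.get(routing);
    }

    /** Looks the shard up as it appeared when the statistics were gathered, i.e. as STARTED. */
    static String dataPathIgnoringRelocation(Map<Routing, String> paths, Routing routing) {
        Routing key = routing.state == State.RELOCATING ? routing.withState(State.STARTED) : routing;
        return paths.get(key);
    }
}
```

A path recorded under the STARTED routing is not found by the direct lookup once the shard becomes RELOCATING, whereas the second method still returns it.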
This PR also reworks the MockDiskUsagesIT test to avoid using fake data paths
for all shards. With the changes here, the data paths are handled in tests as
they are in production, except that their sizes are fake.
Fixes #45177
Backport of #46079