Rolling restarts and allocation disabling #19739
Comments
It would also be useful to list (in the same area) the exception or response returned when an indexing request targets a primary shard on that host. |
Yes, I think we should probably change this advice. In fact, increasing this timeout isn't even a requirement. If the shard starts reallocating and then a node joins with the shard intact, the reallocation should be cancelled. (Not sure what happens with unsynced shards.) @ywelsch what do you think? |
@clintongormley The cancelling only works for synced shards. If there has been write activity on the primary while the node was restarted (or 5 minutes prior to the restart with no explicit synced flush), the existing replica allocation is not cancelled. There is certainly room for future improvements in this area. If there are writes to be expected on the indices while the node is restarted and it is likely that the node will miss the default delayed timeout, increasing the timeout is a solution. The risk with temporarily increasing the value of |
Am I understanding correctly that the current best practice for rolling restarts is to fiddle with delay allocation instead of setting |
The documentation still hasn't been changed on this, and no agreement has been found. To me, it sounds like Is the suggestion per #19739 (comment) to set |
Pinging @elastic/es-distributed |
Clarify the “one minute” in the instructions to disable the shard allocation when doing maintenance to say that it is configurable. Add a note about making sure that no rebalancing occurs until the maintenance is complete. Relates elastic#19739.
I think the docs are right: it seems appropriate to use Note that there are some alterations to this area of the docs in flight (#29670, #29671) which continue to recommend using It could be problematic if rebalancing kicks in while the node is coming back into the cluster. By default rebalancing only has an effect once the cluster is green ( |
I think there's no more action to take on this issue, closing. |
Here we state to use
"cluster.routing.allocation.enable" : "none"
before restarting nodes. With the inclusion of
index.unassigned.node_left.delayed_timeout
in later versions, would it make sense to update our recommended practices to temporarily increase this setting instead of disabling allocation entirely? This was raised on this forum issue.
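
To make the two options concrete, here is a minimal sketch of the request bodies each one would send. It only builds the requests and does not contact a cluster. The helper names, the `5m` timeout value, the `/_all/_settings` target, and the choice of `persistent` as the settings scope are assumptions for illustration, not something the docs or this issue prescribe.

```python
import json

# Setting names as quoted in this issue.
ALLOCATION_ENABLE = "cluster.routing.allocation.enable"
DELAYED_TIMEOUT = "index.unassigned.node_left.delayed_timeout"


def allocation_enable_request(value, scope="persistent"):
    """Build a cluster settings update for shard allocation.

    value: "none" to disable allocation, or None to reset to the default.
    scope: "persistent" or "transient" (which one to use is an assumption here).
    """
    return ("PUT", "/_cluster/settings", {scope: {ALLOCATION_ENABLE: value}})


def delayed_timeout_request(timeout, index="_all"):
    """Build an index settings update for the delayed allocation timeout.

    timeout: a time value such as "5m" (hypothetical), or None to reset.
    index: target index expression; "_all" is an assumption for illustration.
    """
    return ("PUT", f"/{index}/_settings", {"settings": {DELAYED_TIMEOUT: timeout}})


def restart_plan(use_delayed_timeout, timeout="5m"):
    """Return (before_restart, after_restart) requests for one node restart."""
    if use_delayed_timeout:
        return delayed_timeout_request(timeout), delayed_timeout_request(None)
    return allocation_enable_request("none"), allocation_enable_request(None)


if __name__ == "__main__":
    for use_delay in (False, True):
        for method, path, body in restart_plan(use_delay):
            print(method, path, json.dumps(body))
```

Either plan pairs the change made before the restart with a request that resets the setting afterwards (a `null` value), so the temporary change does not outlive the maintenance window.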