Autoscaling during shrink #88292

Conversation

henningandersen (Contributor)

Fix autoscaling during shrink to disregard pinning to nodes when computing the total tier size. Instead, signal that we need a minimum node size large enough to hold the entire shrink operation. This avoids scaling far higher than necessary when cluster balancing does not allow a shrink to proceed. It is considered a (separate) balancing issue when a shrink cannot complete with enough space in the tier.

This changes autoscaling in general for node pinning filters (based on `_id`, `_name` or `name` filters).

Clone and split also pin to the shards they clone or split; similarly, this pinning is now ignored when calculating the total tier size.

Closes #85480

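To make "a minimum node size to hold the entire shrink operation" concrete: a shrink target shard has to recover on a node that holds a copy of every source-index primary, so the useful node-level signal is roughly the combined size of those primaries rather than an inflated tier-wide total. Below is a minimal, self-contained sketch of that arithmetic; it is not the actual ReactiveStorageDeciderService code, and the `SourceIndex` type and method name are made up for illustration.

```java
import java.util.List;

/**
 * Minimal sketch (not the actual autoscaling decider code) of the node-size signal
 * described above: report the node capacity needed to host the whole shrink operation
 * instead of adding the pinned copies to the tier-wide total.
 */
public class ShrinkNodeSizeSketch {

    /** Hypothetical holder for the source index's primary shard sizes, in bytes. */
    record SourceIndex(List<Long> primaryShardSizesInBytes) {}

    /**
     * A shrink target shard recovers on a node that holds a copy of every source
     * primary, so the minimum useful node size is (at least) the sum of those sizes.
     */
    static long minimumNodeSizeForShrink(SourceIndex source) {
        return source.primaryShardSizesInBytes().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        // Three source primaries of 10, 12 and 8 GiB: the node must fit ~30 GiB,
        // but the tier total does not need to grow by that amount.
        SourceIndex source = new SourceIndex(List.of(10L << 30, 12L << 30, 8L << 30));
        System.out.println("minimum node size for shrink: " + minimumNodeSizeForShrink(source) + " bytes");
    }
}
```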
@elasticmachine added the Team:Distributed label on Jul 5, 2022.
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine (Collaborator)

Hi @henningandersen, I've created a changelog YAML for you.

@fcofdez self-requested a review on July 6, 2022, 11:14.
@fcofdez (Contributor) left a comment

Looks mostly good, I left a few comments 👍

```java
// For resize shards only allow autoscaling if there is no other node where the shard could fit had it not been
// a resize shard. Notice that we already removed any initial_recovery filters.
diskOnly = nodesInTier(allocation.routingNodes()).map(node -> allocationDeciders.canAllocate(shard, node, allocation))
    .anyMatch(ReactiveStorageDeciderService::isResizeOnlyNoDecision) == false;
```
fcofdez (Contributor)

Can we add a test where there's enough space to hold the resize shard so we verify that we don't request more capacity in that case?

fcofdez (Contributor)

I find double negations a bit trappy 😅

henningandersen (Contributor Author)

fcofdez (Contributor)

Sorry, I missed that.
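As a side note on the double-negation remark above, the same check could in principle be written with `Stream.noneMatch`. A sketch with simplified types follows; the real code streams routing nodes and allocation decisions, and the `== false` form in the PR may simply follow the codebase's preference for that style over the `!` operator.

```java
import java.util.function.Predicate;
import java.util.stream.Stream;

class ResizeDecisionSketch {
    /**
     * anyMatch(pred) == false is logically equivalent to noneMatch(pred):
     * the result is true only when no element matches the predicate.
     */
    static <T> boolean diskOnly(Stream<T> canAllocateDecisions, Predicate<T> isResizeOnlyNoDecision) {
        // Equivalent to: canAllocateDecisions.anyMatch(isResizeOnlyNoDecision) == false
        return canAllocateDecisions.noneMatch(isResizeOnlyNoDecision);
    }
}
```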

```diff
@@ -222,7 +298,8 @@ public static class AllocationState {
         Set<DiscoveryNode> nodes,
         Set<DiscoveryNodeRole> roles
     ) {
-        this.state = state;
+        this.state = removeNodeLockFilters(state);
```
fcofdez (Contributor)

I wonder if removing the node lock filter in all cases couldn't lead to a different allocation outcome? Maybe we should only remove the initial_recovery setting here? I'm not 100% sure though.

henningandersen (Contributor Author)

The idea I followed was that we should ensure that we have a node that is large enough to hold the node-locked data, hence the addition to ensure we deliver a proper node-level size.

With that done, we can assume that it is an allocation problem if it cannot fit. Hence it seems fair to remove the node locking here. I am aware that our allocation system is not yet sophisticated enough, but I'd rather not autoscale to a too large setup (since that may be multiple steps too large) in that case. For ILM controlled shrink, it will eventually fail and retry. Manual intervention may be necessary until our allocation system can handle this.

fcofdez (Contributor)

That makes sense, thanks for clarifying.
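For illustration of the idea discussed in this thread, here is a minimal, self-contained sketch of what stripping node-pinning allocation filters from index settings could look like. It is not the PR's removeNodeLockFilters implementation: it assumes the standard `index.routing.allocation.{require,include,exclude}.*` and `index.routing.allocation.initial_recovery.*` setting keys and operates on a plain map rather than the cluster state.

```java
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

/** Sketch only: the real code works on ClusterState/IndexMetadata, not a flat settings map. */
class NodeLockFilterSketch {

    // Attributes that pin shards to concrete nodes (as opposed to e.g. tier or zone attributes).
    private static final Set<String> NODE_LOCK_ATTRIBUTES = Set.of("_id", "_name", "name");
    private static final Set<String> FILTER_TYPES = Set.of("require", "include", "exclude");

    /** Drop allocation filters that lock an index onto specific nodes. */
    static Map<String, String> removeNodeLockFilters(Map<String, String> indexSettings) {
        return indexSettings.entrySet()
            .stream()
            .filter(e -> isNodeLockFilter(e.getKey()) == false)
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    private static boolean isNodeLockFilter(String key) {
        if (key.startsWith("index.routing.allocation.initial_recovery.")) {
            return true; // e.g. set by shrink on the target index
        }
        for (String type : FILTER_TYPES) {
            String prefix = "index.routing.allocation." + type + ".";
            if (key.startsWith(prefix) && NODE_LOCK_ATTRIBUTES.contains(key.substring(prefix.length()))) {
                return true;
            }
        }
        return false;
    }
}
```

On a shrink target this would drop a filter such as `index.routing.allocation.initial_recovery._id`, while leaving unrelated allocation settings (tier preferences, custom attribute filters) untouched.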

@henningandersen (Contributor Author)

Thanks @fcofdez for the review and sorry about the left-over code from previous generations of this PR. This is ready for another round (commented on the two other outstanding comments).

@henningandersen requested a review from @fcofdez on July 7, 2022, 19:19.
@henningandersen (Contributor Author)

@elasticmachine update branch

@fcofdez (Contributor) left a comment

LGTM 👍

@henningandersen (Contributor Author)

Thanks Francisco!

@henningandersen merged commit 66e750c into elastic:master on Jul 8, 2022.
Labels
>bug, :Distributed Coordination/Autoscaling, Team:Distributed, v8.4.0
Development

Successfully merging this pull request may close these issues:

Autoscaling may scale too far during index shrink (#85480)