Improve index-allocation retry limitation for primary shards #19446
One thing to add here: the primary shard allocator forces allocation of a primary shard only if the allocation deciders for ALL nodes with shard copies say NO (*). If there is a node with a shard copy where the deciders say YES, we allocate to that node. If there is a node with a shard copy where the deciders say THROTTLE, and there is no node with YES, we don't allocate the shard in that round. What this means, however, is that we cannot simply add the override logic "always return YES if the shard in question is a primary which already exists on this node" at the decider level, because the overriding logic only comes into play once we've looked at the deciders of ALL nodes with shard copies. If we want to reflect this correctly in the cluster allocation explain API, it would also need to check the condition marked with * above.
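For illustration, here is a minimal, self-contained Java sketch of the behaviour described above. It uses simplified placeholder types rather than the real Elasticsearch classes: a YES from any node holding a copy wins, a THROTTLE defers the decision to a later round, and only when every node with a copy answers NO does the forced allocation (*) kick in.

```java
import java.util.Map;

// Simplified stand-in for the combined allocation decider outcome; not the actual classes.
enum Decision { YES, THROTTLE, NO }

class CurrentPrimaryAllocationSketch {
    /**
     * decisions maps each node that holds a copy of the primary to the combined
     * decision of all allocation deciders for that node.
     */
    static String decide(Map<String, Decision> decisions) {
        String throttledNode = null;
        for (Map.Entry<String, Decision> entry : decisions.entrySet()) {
            if (entry.getValue() == Decision.YES) {
                return "allocate to " + entry.getKey();   // any YES wins immediately
            } else if (entry.getValue() == Decision.THROTTLE) {
                throttledNode = entry.getKey();           // remember, try again next round
            }
        }
        if (throttledNode != null) {
            return "throttled, retry next round";         // THROTTLE blocks forcing
        }
        if (decisions.isEmpty()) {
            return "leave unassigned (no copies found)";
        }
        // (*) Every node with a copy said NO, so the primary is forced onto one of them
        // anyway, which is exactly what lets it bypass MaxRetryAllocationDecider today.
        return "force allocate to " + decisions.keySet().iterator().next();
    }
}
```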
We discussed possible solutions on Fix-It Friday. Here's what we came up with to deal with the current primary allocation logic. Instead of how it works today, which @ywelsch explained, we should add a new method to the allocation deciders that says whether a primary shard may be force-allocated to a given node. Using this new method, the PrimaryShardAllocator only force-allocates a primary when the deciders permit it.
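A rough sketch of that direction, again with simplified hypothetical types and method names (the interface and the name canForceAllocatePrimary here are illustrative, not the real API): the allocator first runs the normal decisions, and only when every node with a copy answers NO does it ask the deciders whether forcing is allowed.

```java
import java.util.List;

enum Decision { YES, THROTTLE, NO }

// Hypothetical decider interface: canForceAllocatePrimary stands for the new per-decider
// question "may this primary be force-allocated to this node?".
interface DeciderSketch {
    Decision canAllocate(String shardId, String nodeId);
    Decision canForceAllocatePrimary(String shardId, String nodeId);
}

class ProposedPrimaryAllocatorSketch {
    static String decide(String shardId, List<String> nodesWithCopy, List<DeciderSketch> deciders) {
        boolean throttled = false;
        for (String node : nodesWithCopy) {
            Decision combined = combine(deciders, d -> d.canAllocate(shardId, node));
            if (combined == Decision.YES) return "allocate to " + node;
            if (combined == Decision.THROTTLE) throttled = true;
        }
        if (throttled) return "throttled, retry next round";
        // All normal decisions were NO: instead of forcing blindly, ask the new question,
        // so a decider such as MaxRetryAllocationDecider can still veto the allocation.
        for (String node : nodesWithCopy) {
            if (combine(deciders, d -> d.canForceAllocatePrimary(shardId, node)) == Decision.YES) {
                return "force allocate to " + node;
            }
        }
        return "leave unassigned";
    }

    // NO wins over THROTTLE, and THROTTLE wins over YES when combining decider answers.
    private static Decision combine(List<DeciderSketch> deciders,
                                    java.util.function.Function<DeciderSketch, Decision> ask) {
        Decision result = Decision.YES;
        for (DeciderSketch d : deciders) {
            Decision dec = ask.apply(d);
            if (dec == Decision.NO) return Decision.NO;
            if (dec == Decision.THROTTLE) result = Decision.THROTTLE;
        }
        return result;
    }
}
```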
Note that we will also need to adapt the allocation explain API to reflect this.
@abeyad maybe this is something for you?
@bleskes Sure, just assigned it to myself. |
Primary shard allocation observes limits in forcing allocation

Previously, during primary shard allocation of shards with prior allocation IDs, if all nodes returned a NO decision for allocation (e.g. the settings blocked allocation on that node), we would choose one of those nodes and force the primary shard to be allocated to it. However, this meant that primary shard allocation would not adhere to the decision of the MaxRetryAllocationDecider, which would lead to attempting to allocate a shard that has already failed N times (presumably due to some configuration issue). This commit solves the issue by introducing the notion of force-allocating a primary shard to a node, where each decider implementation must decide whether this is allowed. In the case of MaxRetryAllocationDecider, it just forwards the request to canAllocate. Closes #19446
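To make the key point of the commit concrete, here is a minimal sketch (simplified types, not the real MaxRetryAllocationDecider code) of the forwarding it describes: the force-allocation question reuses the regular canAllocate check, so a shard that has already exhausted its retry limit cannot be force-allocated either.

```java
enum Decision { YES, NO }

// Simplified stand-in for MaxRetryAllocationDecider; the real class looks at the shard's
// unassigned info and the index.allocation.max_retries setting.
class MaxRetryDeciderSketch {
    private final int maxRetries;

    MaxRetryDeciderSketch(int maxRetries) {
        this.maxRetries = maxRetries;   // e.g. 5, the retry limit mentioned in this issue
    }

    Decision canAllocate(int failedAllocationAttempts) {
        return failedAllocationAttempts >= maxRetries ? Decision.NO : Decision.YES;
    }

    Decision canForceAllocatePrimary(int failedAllocationAttempts) {
        // Forwarding to canAllocate keeps forced primary allocation subject to the same
        // retry limit, which is the fix this commit describes.
        return canAllocate(failedAllocationAttempts);
    }
}
```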
#18467 was intended to limit how many times shard allocation is attempted before giving up, instead of constantly retrying. For example, if an analyzer requires a local synonyms file which is missing, we should stop trying until the situation is resolved.
Unfortunately it doesn't work for unassigned primary shards because of the way allocation works today. For instance, a primary shard might be sitting on a disk which is over the high watermark, or it may have allocation filtering which prevents it from being allocated to the node where it already exists. Today, we just go ahead and try to assign the primary regardless of the allocation deciders (which means that we also don't limit retries to 5).
Instead, we could add extra logic to the appropriate deciders to say "always return YES if the shard in question is a primary which already exists on this node". The decision returned would be YES, but the explanation provided by the reroute or cluster allocation explain could include the reason this decider was ignored.

/cc @ywelsch @dakrone @s1monw
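For reference, a tiny sketch (hypothetical, simplified types) of the decider-level override proposed here; the discussion above explains why this cannot work as a plain decider change, since forcing only happens after the deciders of all nodes with copies have been consulted.

```java
enum Decision { YES, NO }

// Hypothetical wrapper around an existing decider decision, illustrating the proposal:
// always answer YES for a primary that already has a copy on the node being evaluated.
class ExistingPrimaryOverrideSketch {
    static Decision decide(boolean isPrimary, boolean nodeHoldsCopy, Decision underlyingDecision) {
        if (isPrimary && nodeHoldsCopy) {
            // The reroute / cluster allocation explain output would still report why the
            // underlying decider was ignored.
            return Decision.YES;
        }
        return underlyingDecision;
    }
}
```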
Related to #18321