Partial network partitioning leads to cluster unavailability. #43183

101alex · 2019-06-13T02:36:50Z

Elasticsearch version: 6.8.0

Plugins installed: []

JVM version (java -version): 1.8.0_212

OS version (uname -a if on a Unix-like system): OS[Linux/4.4.0-145-generic/amd64]

Description of the problem including expected versus actual behavior:

I have a cluster of 5 nodes (node1, node2, node3, node4, node5). Their ids are in the following order (this is critical for master election, which I believe is the core of this failure): node2 < node3 < the rest of nodes.

When the cluster first starts, as expected, node2 is elected as a master, however, after a partial network partitioning occurs, isolating node2 from nodes 3,4, and 5, (while node 3 is able to communicate with all other nodes) the cluster is stuck in an unavailability status. In other words, no new master is elected and node2 steps down from being a master.

I believe the expected behavior is to have node3 become the new master as it is connected to a majority of nodes and it has the smallest id (after node2). This does not happen, however. From examining the logs it seems that node3 keeps trying to join node2 as it thinks it should be the master while it does not accept the join requests from nodes 3, 4, and 5.

Steps to reproduce:

In my setup I have 1 index with the following settings:
* Number of shards: 1
* Number of replicas: 2
* write.wait_for_active_shards: 3

I have 5 nodes, and then once a partial network partitioning occurs (as described above), the system stays unavailable (I had waited for more than 1 hour and it stayed in that status) and it does not accept index operations.

***Log files are provided for the 5 nodes.

node1.log
node2.log
node3.log
node4.log
node5.log

The text was updated successfully, but these errors were encountered:

elasticmachine · 2019-06-13T06:44:49Z

Pinging @elastic/es-distributed

DaveCTurner · 2019-06-13T06:48:56Z

Can you describe more precisely what "partial network partition" means in this context? Which nodes can communicate with each other? Is the partition symmetric or asymmetric?

I think this is fixed in 7.x by #32006, and we're unlikely to take any action to address this in the 6.x series. Can you reproduce it in a more recent version?

101alex · 2019-06-14T06:28:30Z

Can you describe more precisely what "partial network partition" means in this context? Which nodes can communicate with each other? Is the partition symmetric or asymmetric?

A partial network partitioning is a network partitioning where two groups of nodes cannot communicate with each other, while there is a third group of nodes that can communicate with both.

I just realized that I accidentally wrote node3 instead of node1 in some places in the original post, sorry.

In my case the three groups are:
g1: node2.
g2: nodes 1, 4, and 5.
g3: node3

g1 members cannot communicate with g2 members (neither can g2 members communicate with g1 members) while g3 members can communicate with all other nodes (and all other nodes can communicate with g3). Hope this clarifies what I mean by partial network partitioning.

I think this is fixed in 7.x by #32006, and we're unlikely to take any action to address this in the 6.x series. Can you reproduce it in a more recent version?

Will try to reproduce this in 7.x soon and will let you know how it goes.

Thanks.

DaveCTurner · 2019-06-14T08:52:40Z

g1 members cannot communicate with g2 members (neither can g2 members communicate with g1 members) while g3 members can communicate with all other nodes (and all other nodes can communicate with g3).

Ok, this is a situation that master elections in 6.x and earlier are known not to tolerate, but in which 7.x and later clusters will still be able to elect a master. It also doesn't seem very likely to occur in the wild. It's always possible to find a partition that results in unavailability (e.g. disconnect all pairs of nodes) so we must make choices about the kinds of partition that are reasonable to tolerate. Typically we choose to concentrate on the ones that occur in practice in a properly-configured network, which don't look like this.

Closing this as there's no further action required here, but please do report back on your experiments with 7.x.

DaveCTurner self-assigned this Jun 13, 2019

DaveCTurner added the :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. label Jun 13, 2019

DaveCTurner added the feedback_needed label Jun 13, 2019

DaveCTurner closed this as completed Jun 14, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Partial network partitioning leads to cluster unavailability. #43183

Partial network partitioning leads to cluster unavailability. #43183

101alex commented Jun 13, 2019

elasticmachine commented Jun 13, 2019

DaveCTurner commented Jun 13, 2019

101alex commented Jun 14, 2019

DaveCTurner commented Jun 14, 2019

Partial network partitioning leads to cluster unavailability. #43183

Partial network partitioning leads to cluster unavailability. #43183

Comments

101alex commented Jun 13, 2019

elasticmachine commented Jun 13, 2019

DaveCTurner commented Jun 13, 2019

101alex commented Jun 14, 2019

DaveCTurner commented Jun 14, 2019