Add nuance around stretched clusters (#77360)
Today the multi-zone-cluster design docs say to keep all the nodes in a
single datacenter. This doesn't really reflect what we do in practice:
each zone in AWS/GCP/Azure/etc is a separate datacenter with decent
connectivity to the other zones in the same region. This commit adjusts
the docs to allow for this.

Co-authored-by: James Rodewig <[email protected]>
DaveCTurner and jrodewig committed Sep 9, 2021
1 parent 67997dc commit 6f6863d
Showing 1 changed file with 37 additions and 13 deletions.
50 changes: 37 additions & 13 deletions docs/reference/high-availability/cluster-design.asciidoc
@@ -230,24 +230,48 @@ The cluster will be resilient to the loss of any node as long as:
[[high-availability-cluster-design-large-clusters]]
=== Resilience in larger clusters

-It is not unusual for nodes to share some common infrastructure, such as a power
-supply or network router. If so, you should plan for the failure of this
+It's not unusual for nodes to share common infrastructure, such as network
+interconnects or a power supply. If so, you should plan for the failure of this
infrastructure and ensure that such a failure would not affect too many of your
nodes. It is common practice to group all the nodes sharing some infrastructure
into _zones_ and to plan for the failure of any whole zone at once.

-Your cluster’s zones should all be contained within a single data centre. {es}
-expects its node-to-node connections to be reliable and have low latency and
-high bandwidth. Connections between data centres typically do not meet these
-expectations. Although {es} will behave correctly on an unreliable or slow
-network, it will not necessarily behave optimally. It may take a considerable
-length of time for a cluster to fully recover from a network partition since it
-must resynchronize any missing data and rebalance the cluster once the
-partition heals. If you want your data to be available in multiple data centres,
-deploy a separate cluster in each data centre and use
-<<modules-cross-cluster-search,{ccs}>> or <<xpack-ccr,{ccr}>> to link the
+{es} expects node-to-node connections to be reliable, have low latency, and
+have adequate bandwidth. Many {es} tasks require multiple round-trips between
+nodes. A slow or unreliable interconnect may have a significant effect on the
+performance and stability of your cluster.
+
+For example, a few milliseconds of latency added to each round-trip can quickly
+accumulate into a noticeable performance penalty. An unreliable network may
+have frequent network partitions. {es} will automatically recover from a
+network partition as quickly as it can but your cluster may be partly
+unavailable during a partition and will need to spend time and resources to
+resynchronize any missing data and rebalance itself once the partition heals.
+Recovering from a failure may involve copying a large amount of data between
+nodes so the recovery time is often determined by the available bandwidth.
+
+If you've divided your cluster into zones, the network connections within each
+zone are typically of higher quality than the connections between the zones.
+Ensure the network connections between zones are of sufficiently high quality.
+You will see the best results by locating all your zones within a single data
+center with each zone having its own independent power supply and other
+supporting infrastructure. You can also _stretch_ your cluster across nearby
+data centers as long as the network interconnection between each pair of data
+centers is good enough.
+
+[[high-availability-cluster-design-min-network-perf]]
+There is no specific minimum network performance required to run a healthy {es}
+cluster. In theory, a cluster will work correctly even if the round-trip
+latency between nodes is several hundred milliseconds. In practice, if your
+network is that slow then the cluster performance will be very poor. In
+addition, slow networks are often unreliable enough to cause network partitions
+that lead to periods of unavailability.
+
+If you want your data to be available in multiple data centers that are further
+apart or not well connected, deploy a separate cluster in each data center and
+use <<modules-cross-cluster-search,{ccs}>> or <<xpack-ccr,{ccr}>> to link the
clusters together. These features are designed to perform well even if the
-cluster-to-cluster connections are less reliable or slower than the network
+cluster-to-cluster connections are less reliable or performant than the network
within each cluster.

After losing a whole zone's worth of nodes, a properly-designed cluster may be
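
Reviewer note: the added text makes two quantitative claims, that extra latency accumulates across round-trips and that recovery time is often bounded by bandwidth. The back-of-the-envelope sketch below illustrates both. It is not part of the commit and not an {es} API; the function names and every number in it are hypothetical, chosen only to show the arithmetic. It assumes the round-trips happen one after another and that the link is fully available to the copy, so real clusters will differ.

```python
def added_latency_ms(round_trips: int, extra_rtt_ms: float) -> float:
    """Total extra delay when each of `round_trips` sequential round-trips
    takes `extra_rtt_ms` longer (hypothetical helper, assumes no overlap)."""
    return round_trips * extra_rtt_ms


def copy_time_seconds(data_bytes: float, bandwidth_bits_per_s: float) -> float:
    """Lower bound on the time to copy `data_bytes` over a link with the
    given bandwidth (hypothetical helper, ignores protocol overhead)."""
    return data_bytes * 8 / bandwidth_bits_per_s


if __name__ == "__main__":
    # Assumed workload: 20 sequential round-trips, each 5 ms slower.
    print(added_latency_ms(20, 5.0))      # 100.0 ms of extra delay

    # Assumed recovery: 500 GB of shard data over a 1 Gbit/s inter-zone link.
    print(copy_time_seconds(500e9, 1e9))  # 4000.0 s, a little over an hour
```

Under these assumed figures, a 5 ms penalty per round-trip becomes a tenth of a second per task, and the bandwidth bound alone puts recovery at over an hour, which is consistent with the doc's advice to check the links between zones before stretching a cluster.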
