Clarify remote clusters' use of transport layer #60268

Merged
8 changes: 4 additions & 4 deletions docs/reference/modules/remote-clusters.asciidoc
@@ -10,10 +10,10 @@ to a limited number of nodes in that remote cluster. There are two modes for
remote cluster connections: <<sniff-mode,sniff mode>> and
<<proxy-mode,proxy mode>>.

All the communication required between different clusters
goes through the <<modules-transport,transport layer>>. Remote cluster
connections consist of uni-directional connections from the coordinating
node to the remote remote connections.
Communication with a remote cluster uses the <<modules-transport,transport
layer>> to establish a number of <<long-lived-connections,long-lived>> TCP
connections from the coordinating nodes of the local cluster to the chosen
nodes in the remote cluster.

[discrete]
[[sniff-mode]]
39 changes: 17 additions & 22 deletions docs/reference/modules/transport.asciidoc
@@ -1,18 +1,12 @@
[[modules-transport]]
=== Transport

The transport networking layer is used for internal communication between nodes
within the cluster. Each call that goes from one node to the other uses
the transport layer (for example, when an HTTP GET request is processed
by one node, and should actually be processed by another node that holds
the data).

The transport mechanism is completely asynchronous in nature, meaning
that there is no blocking thread waiting for a response. The benefit of
using asynchronous communication is first solving the
http://en.wikipedia.org/wiki/C10k_problem[C10k problem], as well as
being the ideal solution for scatter (broadcast) / gather operations such
as search in Elasticsearch.
Clients send requests to your {es} cluster over <<modules-http,HTTP>>, but the
node that receives a client request cannot always handle it alone and must
normally pass it on to other nodes for further processing. It does this using
the transport networking layer. The transport layer is used for all internal
communication between nodes within a cluster, and also for all communication
with the nodes of a <<modules-remote-clusters,remote cluster>>.

[[transport-settings]]
==== Transport settings
@@ -137,17 +131,18 @@ configured, and defaults otherwise to `transport.tcp.reuse_address`.
[[long-lived-connections]]
===== Long-lived idle connections

Elasticsearch opens a number of long-lived TCP connections between each pair of
nodes in the cluster, and some of these connections may be idle for an extended
period of time. Nonetheless, Elasticsearch requires these connections to remain
open, and it can disrupt the operation of the cluster if any inter-node
connections are closed by an external influence such as a firewall. It is
important to configure your network to preserve long-lived idle connections
between Elasticsearch nodes, for instance by leaving `tcp.keep_alive` enabled
and ensuring that the keepalive interval is shorter than any timeout that might
A transport connection between two nodes is made up of a number of long-lived
TCP connections, some of which may be idle for an extended period of time.
Nonetheless, Elasticsearch requires these connections to remain open, and it
can disrupt the operation of your cluster if any inter-node connections are
closed by an external influence such as a firewall. It is important to
configure your network to preserve long-lived idle connections between
Elasticsearch nodes, for instance by leaving `tcp.keep_alive` enabled and
ensuring that the keepalive interval is shorter than any timeout that might
cause idle connections to be closed, or by setting `transport.ping_schedule` if
keepalives cannot be configured.

keepalives cannot be configured. Devices which drop connections when they reach
a certain age are a common source of problems for Elasticsearch clusters, and
must not be used.
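
The keepalive rule in the paragraph above can be checked with simple arithmetic. The following is a minimal sketch, not part of the Elasticsearch documentation or codebase: the function names are hypothetical, the 7200 s / 75 s / 9 values are the Linux kernel's default keepalive settings, and the 3600 s firewall timeout is an invented example.

```python
# Linux kernel defaults for TCP keepalive (seconds, seconds, probe count).
LINUX_KEEP_IDLE_S = 7200
LINUX_KEEP_INTERVAL_S = 75
LINUX_KEEP_COUNT = 9


def first_probe_beats_timeout(keep_idle_s: int, device_idle_timeout_s: int) -> bool:
    """True if the first keepalive probe is sent before an intermediate
    device (hypothetical firewall) would drop the idle connection."""
    return keep_idle_s < device_idle_timeout_s


def dead_peer_detection_s(keep_idle_s: int, keep_interval_s: int, keep_count: int) -> int:
    """Approximate time for keepalives alone to declare an idle peer dead:
    the idle period plus one interval per unanswered probe."""
    return keep_idle_s + keep_interval_s * keep_count


# Assumed example: a firewall that silently drops connections idle for 1 hour.
FIREWALL_IDLE_TIMEOUT_S = 3600

# With kernel defaults the first probe arrives too late.
print(first_probe_beats_timeout(LINUX_KEEP_IDLE_S, FIREWALL_IDLE_TIMEOUT_S))
# A 5-minute idle time keeps the connection visibly active.
print(first_probe_beats_timeout(300, FIREWALL_IDLE_TIMEOUT_S))
print(dead_peer_detection_s(LINUX_KEEP_IDLE_S, LINUX_KEEP_INTERVAL_S, LINUX_KEEP_COUNT))
```

Note that no keepalive setting helps against a device that closes connections based on their age rather than their idleness, which is why the text says such devices must not be used.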

[[request-compression]]
===== Request compression
26 changes: 18 additions & 8 deletions docs/reference/setup/sysconfig/tcpretries.asciidoc
@@ -2,8 +2,9 @@
=== TCP retransmission timeout

Each pair of nodes in a cluster communicates via a number of TCP connections
which remain open until one of the nodes shuts down or communication between
the nodes is disrupted by a failure in the underlying infrastructure.
which <<long-lived-connections,remain open>> until one of the nodes shuts down
or communication between the nodes is disrupted by a failure in the underlying
infrastructure.

TCP provides reliable communication over occasionally-unreliable networks by
hiding temporary network disruptions from the communicating applications. Your
@@ -36,14 +37,23 @@ To set this value permanently, update the `net.ipv4.tcp_retries2` setting in
`/etc/sysctl.conf`. To verify after rebooting, run
`sysctl net.ipv4.tcp_retries2`.

{es} also implements its own health checks with timeouts that are much shorter
than the default retransmission timeout on Linux. However these health checks
must allow for application-level effects such as garbage collection pauses. We
do not recommend reducing any timeouts related to these application-level
health checks.

IMPORTANT: This setting applies to all TCP connections and will affect the
reliability of communication with systems outside your cluster too. If your
cluster communicates with external systems over an unreliable network then you
may need to select a higher value for `net.ipv4.tcp_retries2`. For this reason,
{es} does not adjust this setting automatically.
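
To get a feel for what a given `net.ipv4.tcp_retries2` value means in wall-clock time, the sketch below models the retransmission backoff. It is a simplified model under stated assumptions, not the kernel's actual algorithm: it assumes the retransmission timeout starts at the Linux minimum of 200 ms, doubles on every retry, and is capped at 120 s, whereas the real starting value depends on the measured round-trip time. The function name is hypothetical.

```python
TCP_RTO_MIN_MS = 200       # assumed starting retransmission timeout
TCP_RTO_MAX_MS = 120_000   # cap applied to each individual backoff step


def hypothetical_timeout_ms(retries: int) -> int:
    """Total time before the connection is given up on, assuming
    exponential backoff from TCP_RTO_MIN_MS capped at TCP_RTO_MAX_MS."""
    total_ms = 0
    rto_ms = TCP_RTO_MIN_MS
    # One wait for the initial transmission plus one per retransmission.
    for _ in range(retries + 1):
        total_ms += min(rto_ms, TCP_RTO_MAX_MS)
        rto_ms *= 2
    return total_ms


# The Linux default of 15 retries gives the 924.6 s figure quoted in the
# kernel documentation for tcp_retries2.
print(hypothetical_timeout_ms(15) / 1000)  # 924.6
```

Because the per-step wait doubles, each increment of the setting roughly doubles the total timeout until the 120 s cap is reached, so small changes to the value have a large effect.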

==== Related configuration

{es} also implements its own internal health checks with timeouts that are much
shorter than the default retransmission timeout on Linux. Since these are
application-level health checks their timeouts must allow for application-level
effects such as garbage collection pauses. You should not reduce any timeouts
related to these application-level health checks.

You must also ensure your network infrastructure does not interfere with the
long-lived connections between nodes, <<long-lived-connections,even if those
connections appear to be idle>>. Devices which drop connections when they reach
a certain age are a common source of problems for Elasticsearch clusters, and
must not be used.