[DOCS] Merges list of discovery and cluster formation settings (#36909)

elastic · Dec 21, 2018 · 33e9cf3
1 parent c8a8391
commit 33e9cf3

Showing 10 changed files with 217 additions and 195 deletions.
diff --git a/docs/reference/modules/discovery.asciidoc b/docs/reference/modules/discovery.asciidoc
@@ -40,22 +40,15 @@ module. This module is divided into the following sections:
     Cluster state publishing is the process by which the elected master node
     updates the cluster state on all the other nodes in the cluster.
 
-<<no-master-block>>::
-
-    The no-master block is put in place when there is no known elected master,
-    and can be configured to determine which operations should be rejected when
-    it is in place.
-
-Advanced settings::
-
-    There are settings that allow advanced users to influence the
-    <<master-election-settings,master election>> and
-    <<fault-detection-settings,fault detection>> processes.
-
 <<modules-discovery-quorums>>::
 
     This section describes the detailed design behind the master election and
     auto-reconfiguration logic.
+
+<<modules-discovery-settings,Settings>>::
+
+    There are settings that enable users to influence the discovery, cluster
+    formation, master election and fault detection processes.    
 
 include::discovery/discovery.asciidoc[]
 
@@ -65,11 +58,8 @@ include::discovery/adding-removing-nodes.asciidoc[]
 
 include::discovery/publishing.asciidoc[]
 
-include::discovery/no-master-block.asciidoc[]
-
-include::discovery/master-election.asciidoc[]
+include::discovery/quorums.asciidoc[]
 
 include::discovery/fault-detection.asciidoc[]
 
-include::discovery/quorums.asciidoc[]
-
+include::discovery/discovery-settings.asciidoc[]
diff --git a/docs/reference/modules/discovery/adding-removing-nodes.asciidoc b/docs/reference/modules/discovery/adding-removing-nodes.asciidoc
@@ -14,9 +14,15 @@ desirable to add or remove some master-eligible nodes to or from a cluster.
 
 ==== Adding master-eligible nodes
 
-If you wish to add some master-eligible nodes to your cluster, simply configure
-the new nodes to find the existing cluster and start them up. Elasticsearch will
-add the new nodes to the voting configuration if it is appropriate to do so.
+If you wish to add some nodes to your cluster, simply configure the new nodes
+to find the existing cluster and start them up. Elasticsearch adds the new nodes
+to the voting configuration if it is appropriate to do so.
+
+During master election or when joining an existing formed cluster, a node
+sends a join request to the master in order to be officially added to the
+cluster. You can use the `cluster.join.timeout` setting to configure how long a
+node waits after sending a request to join a cluster. Its default value is `60s`.
+See <<modules-discovery-settings>>.
 
 ==== Removing master-eligible nodes
 
@@ -93,18 +99,13 @@ GET /_cluster/state?filter_path=metadata.cluster_coordination.voting_config_excl
 --------------------------------------------------
 // CONSOLE
 
-This list is limited in size by the following setting:
-
-`cluster.max_voting_config_exclusions`::
-
-    Sets a limits on the number of voting configuration exclusions at any one
-    time.  Defaults to `10`.
-
-Since voting configuration exclusions are persistent and limited in number, they
-must be cleaned up. Normally an exclusion is added when performing some
-maintenance on the cluster, and the exclusions should be cleaned up when the
-maintenance is complete. Clusters should have no voting configuration exclusions
-in normal operation.
+This list is limited in size by the `cluster.max_voting_config_exclusions` 
+setting, which defaults to `10`. See <<modules-discovery-settings>>. Since
+voting configuration exclusions are persistent and limited in number, they must
+be cleaned up. Normally an exclusion is added when performing some maintenance
+on the cluster, and the exclusions should be cleaned up when the maintenance is
+complete. Clusters should have no voting configuration exclusions in normal
+operation.
 
 If a node is excluded from the voting configuration because it is to be shut
 down permanently, its exclusion can be removed after it is shut down and removed

diff --git a/docs/reference/modules/discovery/bootstrapping.asciidoc b/docs/reference/modules/discovery/bootstrapping.asciidoc
@@ -7,19 +7,13 @@ more of the master-eligible nodes in the cluster. This is known as _cluster
 bootstrapping_.  This is only required the very first time the cluster starts
 up: nodes that have already joined a cluster store this information in their
 data folder and freshly-started nodes that are joining an existing cluster
-obtain this information from the cluster's elected master. This information is
-given using this setting:
+obtain this information from the cluster's elected master. 
 
-`cluster.initial_master_nodes`::
-
-    Sets a list of the <<node.name,node names>> or transport addresses of the
-    initial set of master-eligible nodes in a brand-new cluster. By default
-    this list is empty, meaning that this node expects to join a cluster that
-    has already been bootstrapped.
-
-This setting can be given on the command line or in the `elasticsearch.yml`
-configuration file when starting up a master-eligible node. Once the cluster
-has formed this setting is no longer required and is ignored. It need not be set
+The initial set of master-eligible nodes is defined in the 
+<<initial_master_nodes,`cluster.initial_master_nodes` setting>>. When you
+start a master-eligible node, you can provide this setting on the command line
+or in the `elasticsearch.yml` file. After the cluster has formed, this setting
+is no longer required and is ignored. It need not be set
 on master-ineligible nodes, nor on master-eligible nodes that are started to
 join an existing cluster. Note that master-eligible nodes should use storage
 that persists across restarts. If they do not, and

diff --git a/docs/reference/modules/discovery/discovery-settings.asciidoc b/docs/reference/modules/discovery/discovery-settings.asciidoc
@@ -0,0 +1,160 @@
+[[modules-discovery-settings]]
+=== Discovery and cluster formation settings
+
+Discovery and cluster formation are affected by the following settings:
+
+[[master-election-settings]]`cluster.election.back_off_time`::
+
+    Sets the amount to increase the upper bound on the wait before an election
+    on each election failure. Note that this is _linear_ backoff. This defaults
+    to `100ms`.
+
+`cluster.election.duration`::
+
+    Sets how long each election is allowed to take before a node considers it to
+    have failed and schedules a retry. This defaults to `500ms`.
+
+`cluster.election.initial_timeout`::
+
+    Sets the upper bound on how long a node will wait initially, or after the
+    elected master fails, before attempting its first election. This defaults
+    to `100ms`.
+
+
+`cluster.election.max_timeout`::
+
+    Sets the maximum upper bound on how long a node will wait before attempting
+    a first election, so that a network partition that lasts for a long time
+    does not result in excessively sparse elections. This defaults to `10s`.
+
+[[fault-detection-settings]]`cluster.fault_detection.follower_check.interval`::
+
+    Sets how long the elected master waits between follower checks to each
+    other node in the cluster. Defaults to `1s`.
+
+`cluster.fault_detection.follower_check.timeout`::
+
+    Sets how long the elected master waits for a response to a follower check
+    before considering it to have failed. Defaults to `30s`.
+
+`cluster.fault_detection.follower_check.retry_count`::
+
+    Sets how many consecutive follower check failures must occur to each node
+    before the elected master considers that node to be faulty and removes it
+    from the cluster. Defaults to `3`.
+
+`cluster.fault_detection.leader_check.interval`::
+
+    Sets how long each node waits between checks of the elected master.
+    Defaults to `1s`.
+
+`cluster.fault_detection.leader_check.timeout`::
+
+    Sets how long each node waits for a response to a leader check from the
+    elected master before considering it to have failed. Defaults to `30s`.
+
+`cluster.fault_detection.leader_check.retry_count`::
+
+    Sets how many consecutive leader check failures must occur before a node
+    considers the elected master to be faulty and attempts to find or elect a
+    new master. Defaults to `3`.
+
+`cluster.follower_lag.timeout`::
+
+    Sets how long the master node waits to receive acknowledgements for cluster
+    state updates from lagging nodes. The default value is `90s`. If a node does
+    not successfully apply the cluster state update within this period of time,
+    it is considered to have failed and is removed from the cluster. See
+    <<cluster-state-publishing>>.  
+
+`cluster.initial_master_nodes`::
+
+    Sets a list of the <<node.name,node names>> or transport addresses of the
+    initial set of master-eligible nodes in a brand-new cluster. By default
+    this list is empty, meaning that this node expects to join a cluster that
+    has already been bootstrapped. See <<initial_master_nodes>>.
+
+`cluster.join.timeout`::
+
+    Sets how long a node will wait after sending a request to join a cluster
+    before it considers the request to have failed and retries. Defaults to
+    `60s`.
+
+`cluster.max_voting_config_exclusions`::
+
+    Sets a limit on the number of voting configuration exclusions at any one
+    time. The default value is `10`. See
+    <<modules-discovery-adding-removing-nodes>>.
+
+`cluster.publish.timeout`:: 
+
+    Sets how long the master node waits for each cluster state update to be
+    completely published to all nodes. The default value is `30s`. If this
+    period of time elapses, the cluster state change is rejected. See
+    <<cluster-state-publishing>>.   
+
+`discovery.cluster_formation_warning_timeout`::
+
+    Sets how long a node will try to form a cluster before logging a warning
+    that the cluster did not form. Defaults to `10s`. If a cluster has not 
+    formed after `discovery.cluster_formation_warning_timeout` has elapsed then
+    the node will log a warning message that starts with the phrase
+    `master not discovered` which describes the current state of the discovery
+    process.
+
+`discovery.find_peers_interval`::
+
+    Sets how long a node will wait before attempting another discovery round.
+    Defaults to `1s`.
+
+`discovery.probe.connect_timeout`::
+
+    Sets how long to wait when attempting to connect to each address. Defaults
+    to `3s`.
+
+`discovery.probe.handshake_timeout`::
+
+    Sets how long to wait when attempting to identify the remote node via a
+    handshake. Defaults to `1s`.
+
+`discovery.request_peers_timeout`::
+    Sets how long a node will wait after asking its peers again before
+    considering the request to have failed. Defaults to `3s`.    
+
+`discovery.zen.hosts_provider`:: 
+    Specifies which type of <<built-in-hosts-providers,hosts provider>> provides
+    the list of seed nodes. By default, it is the 
+    <<settings-based-hosts-provider,settings-based hosts provider>>.
+
+[[no-master-block]]`discovery.zen.no_master_block`::
+Specifies which operations are rejected when there is no active master in a
+cluster. This setting has two valid values:
++
+--
+`all`::: All operations on the node (both read and write operations) are rejected.
+This also applies to API cluster state read or write operations, such as the get
+index settings, put mapping, and cluster state APIs.
+
+`write`::: (default) Write operations are rejected. Read operations succeed,
+based on the last known cluster configuration. This situation may result in
+partial reads of stale data as this node may be isolated from the rest of the
+cluster.
+
+[NOTE]
+===============================
+* The `discovery.zen.no_master_block` setting doesn't apply to nodes-based APIs
+(for example, cluster stats, node info, and node stats APIs). Requests to these
+APIs are not blocked and can run on any available node.
+  
+* For the cluster to be fully operational, it must have an active master.
+===============================
+--
+
+`discovery.zen.ping.unicast.hosts`::
+
+    Provides a list of master-eligible nodes in the cluster. The list contains
+    either an array of hosts or a comma-delimited string. Each value has the
+    format `host:port` or `host`, where `port` defaults to the setting
+    `transport.profiles.default.port`. Note that IPv6 hosts must be bracketed.
+    The default value is `127.0.0.1, [::1]`. See <<unicast.hosts>>.
+
+`discovery.zen.ping.unicast.hosts.resolve_timeout`::
+
+    Sets the amount of time to wait for DNS lookups on each round of discovery.
+    This is specified as a <<time-units, time unit>> and defaults to `5s`.
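
As a rough illustration, a few of the settings listed in the new `discovery-settings.asciidoc` page might appear together in `elasticsearch.yml` as sketched below. This fragment is not part of the commit. The node names and host addresses are hypothetical, and the remaining values simply restate the defaults documented above.

[source,yml]
----------------------------------------------------------------
# Hypothetical node names for bootstrapping a brand-new cluster.
cluster.initial_master_nodes:
  - master-a
  - master-b
  - master-c

# Hypothetical seed addresses. When the port is omitted it falls back to
# transport.profiles.default.port; IPv6 hosts must be bracketed.
discovery.zen.ping.unicast.hosts:
  - 10.0.0.1
  - 10.0.0.2:9300
  - "[::1]"

# Documented defaults, written out explicitly for reference.
discovery.zen.no_master_block: write
discovery.find_peers_interval: 1s
cluster.election.initial_timeout: 100ms
cluster.election.back_off_time: 100ms
cluster.election.max_timeout: 10s
----------------------------------------------------------------

Writing the defaults out is optional; leaving a setting unset should have the same effect as the values shown here.
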
diff --git a/docs/reference/modules/discovery/discovery.asciidoc b/docs/reference/modules/discovery/discovery.asciidoc
@@ -82,9 +82,10 @@ gives a convenient mechanism for an Elasticsearch instance that is run in a
 Docker container to be dynamically supplied with a list of IP addresses to
 connect to when those IP addresses may not be known at node startup.
 
-To enable file-based discovery, configure the `file` hosts provider as follows:
+To enable file-based discovery, configure the `file` hosts provider as follows
+in the `elasticsearch.yml` file:
 
-[source,txt]
+[source,yml]
 ----------------------------------------------------------------
 discovery.zen.hosts_provider: file
 ----------------------------------------------------------------
@@ -150,39 +151,3 @@ a hosts provider that uses the Azure Classic API find a list of seed nodes.
 
 The {plugins}/discovery-gce.html[GCE discovery plugin] adds a hosts provider
 that uses the GCE API to find a list of seed nodes.
-
-[float]
-==== Discovery settings
-
-The discovery process is controlled by the following settings.
-
-`discovery.find_peers_interval`::
-
-    Sets how long a node will wait before attempting another discovery round.
-    Defaults to `1s`.
-
-`discovery.request_peers_timeout`::
-
-    Sets how long a node will wait after asking its peers again before
-    considering the request to have failed. Defaults to `3s`.
-
-`discovery.probe.connect_timeout`::
-
-    Sets how long to wait when attempting to connect to each address. Defaults
-    to `3s`.
-
-`discovery.probe.handshake_timeout`::
-
-    Sets how long to wait when attempting to identify the remote node via a
-    handshake. Defaults to `1s`.
-
-`discovery.cluster_formation_warning_timeout`::
-
-    Sets how long a node will try to form a cluster before logging a warning
-    that the cluster did not form. Defaults to `10s`.
-
-If a cluster has not formed after `discovery.cluster_formation_warning_timeout`
-has elapsed then the node will log a warning message that starts with the phrase
-`master not discovered` which describes the current state of the discovery
-process.
-
diff --git a/docs/reference/modules/discovery/fault-detection.asciidoc b/docs/reference/modules/discovery/fault-detection.asciidoc
@@ -1,52 +1,19 @@
-[[fault-detection-settings]]
-=== Cluster fault detection settings
+[[cluster-fault-detection]]
+=== Cluster fault detection
 
-An elected master periodically checks each of the nodes in the cluster in order
-to ensure that they are still connected and healthy, and in turn each node in
-the cluster periodically checks the health of the elected master. These checks
+The elected master periodically checks each of the nodes in the cluster to
+ensure that they are still connected and healthy. Each node in the cluster also
+periodically checks the health of the elected master. These checks
 are known respectively as _follower checks_ and _leader checks_.
 
-Elasticsearch allows for these checks occasionally to fail or timeout without
-taking any action, and will only consider a node to be truly faulty after a
-number of consecutive checks have failed. The following settings control the
-behaviour of fault detection.
-
-`cluster.fault_detection.follower_check.interval`::
-
-    Sets how long the elected master waits between follower checks to each
-    other node in the cluster. Defaults to `1s`.
-
-`cluster.fault_detection.follower_check.timeout`::
-
-    Sets how long the elected master waits for a response to a follower check
-    before considering it to have failed. Defaults to `30s`.
-
-`cluster.fault_detection.follower_check.retry_count`::
-
-    Sets how many consecutive follower check failures must occur to each node
-    before the elected master considers that node to be faulty and removes it
-    from the cluster. Defaults to `3`.
-
-`cluster.fault_detection.leader_check.interval`::
-
-    Sets how long each node waits between checks of the elected master.
-    Defaults to `1s`.
-
-`cluster.fault_detection.leader_check.timeout`::
-
-    Sets how long each node waits for a response to a leader check from the
-    elected master before considering it to have failed. Defaults to `30s`.
-
-`cluster.fault_detection.leader_check.retry_count`::
-
-    Sets how many consecutive leader check failures must occur before a node
-    considers the elected master to be faulty and attempts to find or elect a
-    new master. Defaults to `3`.
-
-If the elected master detects that a node has disconnected then this is treated
-as an immediate failure, bypassing the timeouts and retries listed above, and
-the master attempts to remove the node from the cluster. Similarly, if a node
-detects that the elected master has disconnected then this is treated as an
-immediate failure, bypassing the timeouts and retries listed above, and the
-follower restarts its discovery phase to try and find or elect a new master.
-
+Elasticsearch allows these checks to occasionally fail or time out without
+taking any action. It considers a node to be faulty only after a number of
+consecutive checks have failed. You can control fault detection behavior with
+<<modules-discovery-settings,`cluster.fault_detection.*` settings>>.
+
+If the elected master detects that a node has disconnected, however, this
+situation is treated as an immediate failure. The master bypasses the timeout
+and retry setting values and attempts to remove the node from the cluster.
+Similarly, if a node detects that the elected master has disconnected, this
+situation is treated as an immediate failure. The node bypasses the timeout and
+retry settings and restarts its discovery phase to try to find or elect a new
+master.
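
For reference, the `cluster.fault_detection.*` settings that the rewritten section points to could be written out in `elasticsearch.yml` as in the sketch below. This fragment is illustrative only and is not part of the commit; the values restate the defaults given in the settings list.

[source,yml]
----------------------------------------------------------------
# Follower checks: the elected master checking every other node.
cluster.fault_detection.follower_check.interval: 1s
cluster.fault_detection.follower_check.timeout: 30s
cluster.fault_detection.follower_check.retry_count: 3

# Leader checks: every node checking the elected master.
cluster.fault_detection.leader_check.interval: 1s
cluster.fault_detection.leader_check.timeout: 30s
cluster.fault_detection.leader_check.retry_count: 3
----------------------------------------------------------------

With these values, a node that stays connected but stops responding is considered faulty only after three consecutive failed checks, whereas a detected disconnection bypasses the timeouts and retries entirely.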