From 6912d22433378d84bdcfa04f7ac9b2316205e685 Mon Sep 17 00:00:00 2001 From: James Rodewig Date: Tue, 12 May 2020 17:39:06 -0400 Subject: [PATCH] [DOCS] Relocate discovery module content (#56611) * Moves `Discovery and cluster formation` content from `Modules` to `Set up Elasticsearch`. * Combines `Adding and removing nodes` with `Adding nodes to your cluster`. Adds related redirect. * Removes and redirects the `Modules` page. * Rewrites parts of `Discovery and cluster formation` to remove `module` references and meta references to the section. --- docs/reference/index.asciidoc | 2 - docs/reference/modules.asciidoc | 27 ---- docs/reference/modules/discovery.asciidoc | 22 ++- .../discovery/adding-removing-nodes.asciidoc | 134 ----------------- .../discovery/discovery-settings.asciidoc | 3 +- docs/reference/redirects.asciidoc | 33 +++++ docs/reference/setup.asciidoc | 4 + docs/reference/setup/add-nodes.asciidoc | 140 +++++++++++++++++- 8 files changed, 188 insertions(+), 177 deletions(-) delete mode 100644 docs/reference/modules.asciidoc delete mode 100644 docs/reference/modules/discovery/adding-removing-nodes.asciidoc diff --git a/docs/reference/index.asciidoc b/docs/reference/index.asciidoc index 93e3607a8674d..21a42a7f634cc 100644 --- a/docs/reference/index.asciidoc +++ b/docs/reference/index.asciidoc @@ -32,8 +32,6 @@ include::mapping.asciidoc[] include::analysis.asciidoc[] -include::modules.asciidoc[] - include::index-modules.asciidoc[] include::ingest.asciidoc[] diff --git a/docs/reference/modules.asciidoc b/docs/reference/modules.asciidoc deleted file mode 100644 index 1feafcbe3d30b..0000000000000 --- a/docs/reference/modules.asciidoc +++ /dev/null @@ -1,27 +0,0 @@ -[[modules]] -= Modules - -[partintro] --- -This section contains modules responsible for various aspects of the functionality in Elasticsearch. Each module has settings which may be: - -_static_:: - -These settings must be set at the node level, either in the -`elasticsearch.yml` file, or as an environment variable or on the command line -when starting a node. They must be set on every relevant node in the cluster. - -_dynamic_:: - -These settings can be dynamically updated on a live cluster with the -<> API. - -The modules in this section are: - -<>:: - - How nodes discover each other, elect a master and form a cluster. --- - - -include::modules/discovery.asciidoc[] \ No newline at end of file diff --git a/docs/reference/modules/discovery.asciidoc b/docs/reference/modules/discovery.asciidoc index d3e0d4fe84751..3a7a86abb818f 100644 --- a/docs/reference/modules/discovery.asciidoc +++ b/docs/reference/modules/discovery.asciidoc @@ -1,11 +1,13 @@ [[modules-discovery]] == Discovery and cluster formation -The discovery and cluster formation module is responsible for discovering +The discovery and cluster formation processes are responsible for discovering nodes, electing a master, forming a cluster, and publishing the cluster state -each time it changes. It is integrated with other modules. For example, all -communication between nodes is done using the <> -module. This module is divided into the following sections: +each time it changes. All communication between nodes is done using the +<> layer. + +The following processes and settings are part of discovery and cluster +formation: <>:: @@ -15,17 +17,17 @@ module. This module is divided into the following sections: <>:: - This section describes how {es} uses a quorum-based voting mechanism to + How {es} uses a quorum-based voting mechanism to make decisions even if some nodes are unavailable. <>:: - This section describes the concept of voting configurations, which {es} - automatically updates as nodes leave and join the cluster. + How {es} automatically updates voting configurations as nodes leave and join + a cluster. <>:: - Bootstrapping a cluster is required when an Elasticsearch cluster starts up + Bootstrapping a cluster is required when an {es} cluster starts up for the very first time. In <>, with no discovery settings configured, this is automatically performed by the nodes themselves. As this auto-bootstrapping is @@ -67,10 +69,6 @@ include::discovery/voting.asciidoc[] include::discovery/bootstrapping.asciidoc[] -include::discovery/adding-removing-nodes.asciidoc[] - include::discovery/publishing.asciidoc[] include::discovery/fault-detection.asciidoc[] - -include::discovery/discovery-settings.asciidoc[] diff --git a/docs/reference/modules/discovery/adding-removing-nodes.asciidoc b/docs/reference/modules/discovery/adding-removing-nodes.asciidoc deleted file mode 100644 index 9e316294497d7..0000000000000 --- a/docs/reference/modules/discovery/adding-removing-nodes.asciidoc +++ /dev/null @@ -1,134 +0,0 @@ -[[modules-discovery-adding-removing-nodes]] -=== Adding and removing nodes - -As nodes are added or removed Elasticsearch maintains an optimal level of fault -tolerance by automatically updating the cluster's _voting configuration_, which -is the set of <> whose responses are counted -when making decisions such as electing a new master or committing a new cluster -state. - -It is recommended to have a small and fixed number of master-eligible nodes in a -cluster, and to scale the cluster up and down by adding and removing -master-ineligible nodes only. However there are situations in which it may be -desirable to add or remove some master-eligible nodes to or from a cluster. - -[[modules-discovery-adding-nodes]] -==== Adding master-eligible nodes - -If you wish to add some nodes to your cluster, simply configure the new nodes -to find the existing cluster and start them up. Elasticsearch adds the new nodes -to the voting configuration if it is appropriate to do so. - -During master election or when joining an existing formed cluster, a node -sends a join request to the master in order to be officially added to the -cluster. You can use the `cluster.join.timeout` setting to configure how long a -node waits after sending a request to join a cluster. Its default value is `30s`. -See <>. - -[[modules-discovery-removing-nodes]] -==== Removing master-eligible nodes - -When removing master-eligible nodes, it is important not to remove too many all -at the same time. For instance, if there are currently seven master-eligible -nodes and you wish to reduce this to three, it is not possible simply to stop -four of the nodes at once: to do so would leave only three nodes remaining, -which is less than half of the voting configuration, which means the cluster -cannot take any further actions. - -More precisely, if you shut down half or more of the master-eligible nodes all -at the same time then the cluster will normally become unavailable. If this -happens then you can bring the cluster back online by starting the removed -nodes again. - -As long as there are at least three master-eligible nodes in the cluster, as a -general rule it is best to remove nodes one-at-a-time, allowing enough time for -the cluster to <> the voting -configuration and adapt the fault tolerance level to the new set of nodes. - -If there are only two master-eligible nodes remaining then neither node can be -safely removed since both are required to reliably make progress. To remove one -of these nodes you must first inform {es} that it should not be part of the -voting configuration, and that the voting power should instead be given to the -other node. You can then take the excluded node offline without preventing the -other node from making progress. A node which is added to a voting -configuration exclusion list still works normally, but {es} tries to remove it -from the voting configuration so its vote is no longer required. Importantly, -{es} will never automatically move a node on the voting exclusions list back -into the voting configuration. Once an excluded node has been successfully -auto-reconfigured out of the voting configuration, it is safe to shut it down -without affecting the cluster's master-level availability. A node can be added -to the voting configuration exclusion list using the -<> API. For example: - -[source,console] --------------------------------------------------- -# Add node to voting configuration exclusions list and wait for the system -# to auto-reconfigure the node out of the voting configuration up to the -# default timeout of 30 seconds -POST /_cluster/voting_config_exclusions/node_name - -# Add node to voting configuration exclusions list and wait for -# auto-reconfiguration up to one minute -POST /_cluster/voting_config_exclusions/node_name?timeout=1m --------------------------------------------------- -// TEST[skip:this would break the test cluster if executed] - -The node that should be added to the exclusions list is specified using -<> in place of `node_name` here. If a call to the -voting configuration exclusions API fails, you can safely retry it. Only a -successful response guarantees that the node has actually been removed from the -voting configuration and will not be reinstated. If it's the active master that -was removed from the voting configuration, then it will abdicate to another -master-eligible node that's still in the voting configuration, if such a node -is available. - -Although the voting configuration exclusions API is most useful for down-scaling -a two-node to a one-node cluster, it is also possible to use it to remove -multiple master-eligible nodes all at the same time. Adding multiple nodes to -the exclusions list has the system try to auto-reconfigure all of these nodes -out of the voting configuration, allowing them to be safely shut down while -keeping the cluster available. In the example described above, shrinking a -seven-master-node cluster down to only have three master nodes, you could add -four nodes to the exclusions list, wait for confirmation, and then shut them -down simultaneously. - -NOTE: Voting exclusions are only required when removing at least half of the -master-eligible nodes from a cluster in a short time period. They are not -required when removing master-ineligible nodes, nor are they required when -removing fewer than half of the master-eligible nodes. - -Adding an exclusion for a node creates an entry for that node in the voting -configuration exclusions list, which has the system automatically try to -reconfigure the voting configuration to remove that node and prevents it from -returning to the voting configuration once it has removed. The current list of -exclusions is stored in the cluster state and can be inspected as follows: - -[source,console] --------------------------------------------------- -GET /_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions --------------------------------------------------- - -This list is limited in size by the `cluster.max_voting_config_exclusions` -setting, which defaults to `10`. See <>. Since -voting configuration exclusions are persistent and limited in number, they must -be cleaned up. Normally an exclusion is added when performing some maintenance -on the cluster, and the exclusions should be cleaned up when the maintenance is -complete. Clusters should have no voting configuration exclusions in normal -operation. - -If a node is excluded from the voting configuration because it is to be shut -down permanently, its exclusion can be removed after it is shut down and removed -from the cluster. Exclusions can also be cleared if they were created in error -or were only required temporarily: - -[source,console] --------------------------------------------------- -# Wait for all the nodes with voting configuration exclusions to be removed from -# the cluster and then remove all the exclusions, allowing any node to return to -# the voting configuration in the future. -DELETE /_cluster/voting_config_exclusions - -# Immediately remove all the voting configuration exclusions, allowing any node -# to return to the voting configuration in the future. -DELETE /_cluster/voting_config_exclusions?wait_for_removal=false --------------------------------------------------- diff --git a/docs/reference/modules/discovery/discovery-settings.asciidoc b/docs/reference/modules/discovery/discovery-settings.asciidoc index db123ccc29f37..f0ce103ebb3b6 100644 --- a/docs/reference/modules/discovery/discovery-settings.asciidoc +++ b/docs/reference/modules/discovery/discovery-settings.asciidoc @@ -1,7 +1,8 @@ [[modules-discovery-settings]] === Discovery and cluster formation settings -Discovery and cluster formation are affected by the following settings: +<> are affected by the +following settings: `discovery.seed_hosts`:: + diff --git a/docs/reference/redirects.asciidoc b/docs/reference/redirects.asciidoc index 5070903b4d8ca..13408baa83940 100644 --- a/docs/reference/redirects.asciidoc +++ b/docs/reference/redirects.asciidoc @@ -752,6 +752,39 @@ See <>. See <>. +[role="exclude",id="modules"] +=== Modules + +This page has been removed. + +See <> for settings information: + +* <> +* <> +* <> +* <> +* <> +* <> +* <> +* <> +* <> +* <> +* <> +* <> + +For other information, see: + +* <> +* <> +* <> +* <> +* <> + +[role="exclude",id="modules-discovery-adding-removing-nodes"] +=== Adding and removing nodes + +See <>. + [role="exclude",id="_timing"] === Timing diff --git a/docs/reference/setup.asciidoc b/docs/reference/setup.asciidoc index 87ed8a01c5ca4..2cdc0eeec5be1 100644 --- a/docs/reference/setup.asciidoc +++ b/docs/reference/setup.asciidoc @@ -53,6 +53,8 @@ include::modules/cluster.asciidoc[] include::settings/ccr-settings.asciidoc[] +include::modules/discovery/discovery-settings.asciidoc[] + include::modules/indices/fielddata.asciidoc[] include::modules/http.asciidoc[] @@ -109,6 +111,8 @@ include::setup/starting.asciidoc[] include::setup/stopping.asciidoc[] +include::modules/discovery.asciidoc[] + include::setup/add-nodes.asciidoc[] include::setup/restart-cluster.asciidoc[] diff --git a/docs/reference/setup/add-nodes.asciidoc b/docs/reference/setup/add-nodes.asciidoc index 6e57bd945d3fa..72a08dd8fbbf6 100644 --- a/docs/reference/setup/add-nodes.asciidoc +++ b/docs/reference/setup/add-nodes.asciidoc @@ -1,5 +1,5 @@ [[add-elasticsearch-nodes]] -== Adding nodes to your cluster +== Add and remove nodes in your cluster When you start an instance of {es}, you are starting a _node_. An {es} _cluster_ is a group of nodes that have the same `cluster.name` attribute. As nodes join @@ -41,3 +41,141 @@ the rest of its cluster. For more information about discovery and shard allocation, see <> and <>. + +[discrete] +[[add-elasticsearch-nodes-master-eligible]] +=== Master-eligible nodes + +As nodes are added or removed Elasticsearch maintains an optimal level of fault +tolerance by automatically updating the cluster's _voting configuration_, which +is the set of <> whose responses are counted +when making decisions such as electing a new master or committing a new cluster +state. + +It is recommended to have a small and fixed number of master-eligible nodes in a +cluster, and to scale the cluster up and down by adding and removing +master-ineligible nodes only. However there are situations in which it may be +desirable to add or remove some master-eligible nodes to or from a cluster. + +[discrete] +[[modules-discovery-adding-nodes]] +==== Adding master-eligible nodes + +If you wish to add some nodes to your cluster, simply configure the new nodes +to find the existing cluster and start them up. Elasticsearch adds the new nodes +to the voting configuration if it is appropriate to do so. + +During master election or when joining an existing formed cluster, a node +sends a join request to the master in order to be officially added to the +cluster. You can use the `cluster.join.timeout` setting to configure how long a +node waits after sending a request to join a cluster. Its default value is `30s`. +See <>. + +[discrete] +[[modules-discovery-removing-nodes]] +==== Removing master-eligible nodes + +When removing master-eligible nodes, it is important not to remove too many all +at the same time. For instance, if there are currently seven master-eligible +nodes and you wish to reduce this to three, it is not possible simply to stop +four of the nodes at once: to do so would leave only three nodes remaining, +which is less than half of the voting configuration, which means the cluster +cannot take any further actions. + +More precisely, if you shut down half or more of the master-eligible nodes all +at the same time then the cluster will normally become unavailable. If this +happens then you can bring the cluster back online by starting the removed +nodes again. + +As long as there are at least three master-eligible nodes in the cluster, as a +general rule it is best to remove nodes one-at-a-time, allowing enough time for +the cluster to <> the voting +configuration and adapt the fault tolerance level to the new set of nodes. + +If there are only two master-eligible nodes remaining then neither node can be +safely removed since both are required to reliably make progress. To remove one +of these nodes you must first inform {es} that it should not be part of the +voting configuration, and that the voting power should instead be given to the +other node. You can then take the excluded node offline without preventing the +other node from making progress. A node which is added to a voting +configuration exclusion list still works normally, but {es} tries to remove it +from the voting configuration so its vote is no longer required. Importantly, +{es} will never automatically move a node on the voting exclusions list back +into the voting configuration. Once an excluded node has been successfully +auto-reconfigured out of the voting configuration, it is safe to shut it down +without affecting the cluster's master-level availability. A node can be added +to the voting configuration exclusion list using the +<> API. For example: + +[source,console] +-------------------------------------------------- +# Add node to voting configuration exclusions list and wait for the system +# to auto-reconfigure the node out of the voting configuration up to the +# default timeout of 30 seconds +POST /_cluster/voting_config_exclusions/node_name + +# Add node to voting configuration exclusions list and wait for +# auto-reconfiguration up to one minute +POST /_cluster/voting_config_exclusions/node_name?timeout=1m +-------------------------------------------------- +// TEST[skip:this would break the test cluster if executed] + +The node that should be added to the exclusions list is specified using +<> in place of `node_name` here. If a call to the +voting configuration exclusions API fails, you can safely retry it. Only a +successful response guarantees that the node has actually been removed from the +voting configuration and will not be reinstated. If it's the active master that +was removed from the voting configuration, then it will abdicate to another +master-eligible node that's still in the voting configuration, if such a node +is available. + +Although the voting configuration exclusions API is most useful for down-scaling +a two-node to a one-node cluster, it is also possible to use it to remove +multiple master-eligible nodes all at the same time. Adding multiple nodes to +the exclusions list has the system try to auto-reconfigure all of these nodes +out of the voting configuration, allowing them to be safely shut down while +keeping the cluster available. In the example described above, shrinking a +seven-master-node cluster down to only have three master nodes, you could add +four nodes to the exclusions list, wait for confirmation, and then shut them +down simultaneously. + +NOTE: Voting exclusions are only required when removing at least half of the +master-eligible nodes from a cluster in a short time period. They are not +required when removing master-ineligible nodes, nor are they required when +removing fewer than half of the master-eligible nodes. + +Adding an exclusion for a node creates an entry for that node in the voting +configuration exclusions list, which has the system automatically try to +reconfigure the voting configuration to remove that node and prevents it from +returning to the voting configuration once it has removed. The current list of +exclusions is stored in the cluster state and can be inspected as follows: + +[source,console] +-------------------------------------------------- +GET /_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions +-------------------------------------------------- + +This list is limited in size by the `cluster.max_voting_config_exclusions` +setting, which defaults to `10`. See <>. Since +voting configuration exclusions are persistent and limited in number, they must +be cleaned up. Normally an exclusion is added when performing some maintenance +on the cluster, and the exclusions should be cleaned up when the maintenance is +complete. Clusters should have no voting configuration exclusions in normal +operation. + +If a node is excluded from the voting configuration because it is to be shut +down permanently, its exclusion can be removed after it is shut down and removed +from the cluster. Exclusions can also be cleared if they were created in error +or were only required temporarily: + +[source,console] +-------------------------------------------------- +# Wait for all the nodes with voting configuration exclusions to be removed from +# the cluster and then remove all the exclusions, allowing any node to return to +# the voting configuration in the future. +DELETE /_cluster/voting_config_exclusions + +# Immediately remove all the voting configuration exclusions, allowing any node +# to return to the voting configuration in the future. +DELETE /_cluster/voting_config_exclusions?wait_for_removal=false +--------------------------------------------------