[Zen2] Update documentation for Zen2 #34714

Merged

Conversation

DaveCTurner
Contributor

@DaveCTurner DaveCTurner commented Oct 22, 2018

The introduction of zen2 substantially changes how discovery and cluster
formation work. This commit updates the reference documentation, migration
documentation, and various other locations to bring them in line with today's
reality.

@DaveCTurner DaveCTurner added the >docs, v7.0.0, and :Distributed Coordination/Cluster Coordination labels Oct 22, 2018
Contributor

@ywelsch ywelsch left a comment

some initial thoughts. Great write-up 👍

docs/reference/modules/coordination.asciidoc Outdated Show resolved Hide resolved
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@DaveCTurner
Contributor Author

Pinging @vladimirdolzhenko and @andrershov since our discussions a couple of weeks back helped me write these docs.

Contributor

@andrershov andrershov left a comment

@DaveCTurner Thank you for the nice write-up, it really makes things easier to understand. I still feel that we can do a better job of explaining cluster.master_nodes_failure_tolerance, probably by giving more examples...

docs/reference/modules/coordination.asciidoc Outdated Show resolved Hide resolved
@DaveCTurner
Contributor Author

I have substantially reworked the section about auto-reconfiguration and offered a different setting as we discussed to cover the ≥2-node-redundancy case.

I think I've also addressed most of the other comments.

I haven't renamed the retirement API because I'm still looking for a good name. I think the form of this API is good; it just needs renaming.

Contributor

@ywelsch ywelsch left a comment

Great work on the changes. I think this is the right set of APIs to get started.

docs/reference/modules/coordination.asciidoc Outdated Show resolved Hide resolved
Contributor Author

@DaveCTurner DaveCTurner left a comment

Some comments addressed

has already been bootstrapped.

This setting can be given on the command line when starting up each
master-eligible node, or added to the `elasticsearch.yml` configuration file on
Contributor Author

If we only did it on one node, and then that node failed at some point during cluster formation, then there's no truly safe way to proceed. Setting it on multiple nodes avoids this risk.

Reworked in 8e34a77.

It is technically sufficient to set this on a single master-eligible node in
the cluster, and only to mention that single node in the setting, but this
provides no fault tolerance before the cluster has fully formed. It
is therefore better to bootstrap using at least three master-eligible nodes.
Contributor Author

Ok done in 8e34a77.
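
The fault-tolerance point in the quoted passage can be illustrated with a little arithmetic. This is only a sketch, assuming that decisions need a strict majority of the bootstrapping master-eligible nodes; the function names `majority` and `tolerated_failures` are hypothetical and are not part of Elasticsearch.

```python
def majority(voting_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority (assumed rule)."""
    if voting_nodes < 1:
        raise ValueError("need at least one voting node")
    return voting_nodes // 2 + 1


def tolerated_failures(voting_nodes: int) -> int:
    """How many voting nodes can fail while a majority still remains."""
    return voting_nodes - majority(voting_nodes)


if __name__ == "__main__":
    # Compare bootstrapping with one node against bootstrapping with three.
    for n in (1, 3):
        print(f"{n} node(s): majority={majority(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under that assumption a single bootstrapping node tolerates no failures, whereas three nodes tolerate the loss of one, which matches the recommendation to bootstrap with at least three master-eligible nodes.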

discovered before bootstrapping can take place. This requirement will be
relaxed in production-ready releases.

WARNING: You must put exactly the same set of initial master nodes in each
Contributor Author

Ok, redundancy reduced in 8e34a77.

The `cluster.name` allows you to create multiple clusters which are separated
from each other. Nodes verify that they agree on their cluster name when they
first connect to each other, and if two nodes have different cluster names then
they will not communicate meaningfully and will not belong to the same cluster.
Contributor Author

Really we want to say the nodes don't communicate at all, but they do: they communicate their respective cluster names. After they discover the disagreement they don't really do any further communication.

Reworked in 8e34a77 to still be technically correct without being so awkwardly worded.
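
The behaviour described in this comment, where the only thing two mismatched nodes exchange is their cluster names, might be modelled roughly as follows. This is an illustrative sketch, not Elasticsearch's implementation; the `Node` type and the `handshake` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    cluster_name: str


def handshake(local: Node, remote: Node) -> bool:
    """Compare cluster names on first contact.

    Returns True if the nodes agree and may go on to communicate,
    False if they disagree and should exchange nothing further.
    """
    return local.cluster_name == remote.cluster_name


if __name__ == "__main__":
    a = Node("node-a", "logging")
    b = Node("node-b", "logging")
    c = Node("node-c", "metrics")
    print(handshake(a, b))  # same cluster name
    print(handshake(a, c))  # different cluster names
```

The point of the sketch is that the name comparison is itself a (minimal) form of communication, which is why the documentation cannot simply say the nodes do not communicate at all.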

@@ -0,0 +1,121 @@
[[modules-discovery-adding-removing-nodes]]
=== Adding and removing nodes

Contributor Author

There just isn't really a lot to say about adding nodes in this context. I added headings to divide the page up in e466ed0.

has either discovered an elected master node or else it has discovered enough
masterless master-eligible nodes to complete an election. If neither of these
occurs quickly enough then the node will retry after
`discovery.find_peers_interval`, which defaults to `1s`.
Contributor Author

Agreed, moved in ec4e739.
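
The retry behaviour in the quoted passage could be sketched as a simple loop. This is a hedged illustration only: `probe`, `discovery_finished`, and `run_discovery` are hypothetical names, "enough nodes to complete an election" is assumed to mean a strict majority of a known set of voting nodes, and the constant merely mirrors the quoted `1s` default of `discovery.find_peers_interval`.

```python
import time
from typing import Callable, Optional, Set, Tuple

# Mirrors the quoted default of `discovery.find_peers_interval` (`1s`).
FIND_PEERS_INTERVAL_SECONDS = 1.0

# A probe reports (elected master or None, masterless master-eligible peers).
Probe = Callable[[], Tuple[Optional[str], Set[str]]]


def discovery_finished(elected_master: Optional[str],
                       masterless_peers: Set[str],
                       voting_nodes: Set[str]) -> bool:
    """True once a master is known or a majority of voters has been found."""
    if elected_master is not None:
        return True
    found = masterless_peers & voting_nodes
    return len(found) > len(voting_nodes) // 2


def run_discovery(probe: Probe,
                  voting_nodes: Set[str],
                  max_rounds: int,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Probe repeatedly, waiting one interval between unsuccessful rounds."""
    for _ in range(max_rounds):
        elected_master, masterless_peers = probe()
        if discovery_finished(elected_master, masterless_peers, voting_nodes):
            return True
        sleep(FIND_PEERS_INTERVAL_SECONDS)
    return False
```

Passing `sleep` as a parameter is a design choice for the sketch: it lets the loop be exercised without actually waiting between rounds.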

@DaveCTurner
Contributor Author

Thank you @lcawl for all the suggestions; they are greatly appreciated. In the interests of time, I've held off on addressing any comments asking for more major structural changes here. For instance, I think it makes sense to collect many of the settings into one place, although that place does not yet exist. I think I've addressed everything else, and we can move some things around in follow-ups.

@DaveCTurner DaveCTurner merged commit 1a23417 into elastic:master Dec 20, 2018
@DaveCTurner DaveCTurner deleted the 2018-10-22-cluster-coordination-docs branch December 20, 2018 13:02
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Dec 22, 2018
DaveCTurner added a commit that referenced this pull request Dec 24, 2018
Labels
:Distributed Coordination/Cluster Coordination, >docs, v7.0.0-beta1
10 participants