Log [initial_master_nodes] on formation failure #36466

DaveCTurner
Contributor

Today we log a slightly cryptic "cluster bootstrapping is disabled on this
node" message if bootstrapping hasn't been configured. Since there is today
only one way to bootstrap the cluster it seems preferable to spell out exactly
which setting is missing.

@DaveCTurner DaveCTurner added >non-issue v7.0.0 :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. labels Dec 11, 2018
@DaveCTurner DaveCTurner requested a review from bleskes December 11, 2018 08:50
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

Contributor

@bleskes bleskes left a comment

LGTM. Thanks. Left an optional suggestion

@@ -150,7 +150,7 @@ String getDescription() {

 if (INITIAL_MASTER_NODE_COUNT_SETTING.get(Settings.EMPTY).equals(INITIAL_MASTER_NODE_COUNT_SETTING.get(settings))
     && INITIAL_MASTER_NODES_SETTING.get(Settings.EMPTY).equals(INITIAL_MASTER_NODES_SETTING.get(settings))) {
-    bootstrappingDescription = "cluster bootstrapping is disabled on this node";
+    bootstrappingDescription = "[" + INITIAL_MASTER_NODES_SETTING.getKey() + "] is empty on this node";
Contributor

I think that, technically, what we check for is "is not set on this node". Shall we just use that?

Contributor Author

We check if it's set here:

return Stream.of(DISCOVERY_HOSTS_PROVIDER_SETTING, DISCOVERY_ZEN_PING_UNICAST_HOSTS_SETTING,
INITIAL_MASTER_NODE_COUNT_SETTING, INITIAL_MASTER_NODES_SETTING).anyMatch(s -> s.exists(settings));

If it's not set, and neither are any of the other settings listed here, then we auto-bootstrap after a few seconds, which leads to a different log message that starts "master not discovered or elected yet" and doesn't mention bootstrapping.

However if bootstrapping hasn't occurred by the time we write this message then this can't have happened, so the problem is actually that it's empty:

} else if (initialMasterNodeCount > 0 || initialMasterNodes.isEmpty() == false) {
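The three cases described in this comment can be sketched as a small decision function. This is an illustrative sketch only, not the Elasticsearch implementation: the class name, the `Outcome` enum, and the `classify` method are hypothetical, and the boolean parameter stands in for the `anyMatch(s -> s.exists(settings))` check quoted above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BootstrapOutcomeSketch {

    // Hypothetical labels for the three situations described in the comment.
    enum Outcome { AUTO_BOOTSTRAP, BOOTSTRAP_CONFIGURED, INITIAL_MASTER_NODES_EMPTY }

    /**
     * @param anyDiscoverySettingExists stand-in for the "does any listed setting exist" check
     * @param initialMasterNodeCount    value of the node-count setting (0 is assumed to be the default)
     * @param initialMasterNodes        value of the node-names setting (empty is the default)
     */
    static Outcome classify(boolean anyDiscoverySettingExists,
                            int initialMasterNodeCount,
                            List<String> initialMasterNodes) {
        if (anyDiscoverySettingExists == false) {
            // Nothing is configured at all: the node auto-bootstraps after a few seconds.
            return Outcome.AUTO_BOOTSTRAP;
        }
        if (initialMasterNodeCount > 0 || initialMasterNodes.isEmpty() == false) {
            // Bootstrapping has been configured explicitly.
            return Outcome.BOOTSTRAP_CONFIGURED;
        }
        // Some discovery setting exists, but the bootstrap settings hold their empty defaults.
        return Outcome.INITIAL_MASTER_NODES_EMPTY;
    }

    public static void main(String[] args) {
        System.out.println(classify(false, 0, Collections.emptyList()));
        System.out.println(classify(true, 0, Arrays.asList("node-1", "node-2")));
        System.out.println(classify(true, 0, Collections.emptyList()));
    }
}
```

Only the last outcome corresponds to the new "is empty on this node" message; the first produces the "master not discovered or elected yet" message instead.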

Contributor

My comment was just targeted at the if clause:

INITIAL_MASTER_NODES_SETTING.get(Settings.EMPTY).equals(INITIAL_MASTER_NODES_SETTING.get(settings))

Contributor Author

This also checks it's not empty, since the default for this setting is empty.

Contributor

It technically checks it's not set or set to the default (which is empty). As I said this was just a suggestion. I'm good anyway.
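The difference between "not set" and "set to the default (empty)" debated in this thread can be made concrete with a small stand-alone sketch. It does not use the Elasticsearch `Settings` or `Setting` classes: a plain `Map` stands in for the node settings, and the key name, the comma-separated value format, and the helper methods `get`, `exists`, and `equalsDefault` are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SettingCheckSketch {

    // Assumed key, used here only for illustration.
    static final String KEY = "cluster.initial_master_nodes";

    // Stand-in for reading a list setting: an absent or blank value yields the default, an empty list.
    static List<String> get(Map<String, String> settings) {
        String raw = settings.get(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(raw.trim().split("\\s*,\\s*"));
    }

    // Stand-in for an "exists" check: true whenever the key is present, whatever its value.
    static boolean exists(Map<String, String> settings) {
        return settings.containsKey(KEY);
    }

    // Mirrors the shape of the if clause in the diff: compare the value against the default.
    static boolean equalsDefault(Map<String, String> settings) {
        Map<String, String> empty = new HashMap<>();
        return get(empty).equals(get(settings));
    }

    public static void main(String[] args) {
        Map<String, String> unset = new HashMap<>();
        Map<String, String> setToEmpty = new HashMap<>();
        setToEmpty.put(KEY, "");
        Map<String, String> configured = new HashMap<>();
        configured.put(KEY, "node-1,node-2");

        System.out.println("unset:      exists=" + exists(unset) + ", equalsDefault=" + equalsDefault(unset));
        System.out.println("setToEmpty: exists=" + exists(setToEmpty) + ", equalsDefault=" + equalsDefault(setToEmpty));
        System.out.println("configured: exists=" + exists(configured) + ", equalsDefault=" + equalsDefault(configured));
    }
}
```

Under these assumptions the equals-the-default comparison cannot tell an unset key from one explicitly set to an empty value, which is the reviewer's point; the existence check is what separates the two.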

@DaveCTurner DaveCTurner force-pushed the 2018-12-11-better-cluster-formation-failure-log branch from c41fa2e to a330fac Compare December 11, 2018 10:20
@DaveCTurner
Contributor Author

Transient failure:

 [6.5.3] * What went wrong:
:x-pack:plugin:security:testJar (Thread[Execution worker for ':' Thread 7,5,main]) completed. Took 0.585 secs.
 [6.5.3] Could not resolve all files for configuration ':buildSrc:runtimeClasspath'.
 [6.5.3] > Could not download jna.jar (org.elasticsearch:jna:4.5.1)
 [6.5.3]    > Could not get resource 'https://jcenter.bintray.com/org/elasticsearch/jna/4.5.1/jna-4.5.1.jar'.
 [6.5.3]       > Could not HEAD 'https://jcenter.bintray.com/org/elasticsearch/jna/4.5.1/jna-4.5.1.jar'.
 [6.5.3]          > Connection reset

@elasticmachine please run the Gradle build tests 1

@DaveCTurner DaveCTurner merged commit c3a6d19 into elastic:master Dec 11, 2018
@DaveCTurner DaveCTurner deleted the 2018-12-11-better-cluster-formation-failure-log branch December 11, 2018 12:53
jasontedor added a commit to dnhatn/elasticsearch that referenced this pull request Dec 11, 2018
* elastic/master: (36 commits)
  Add check for minimum required Docker version (elastic#36497)
  Minor search controller changes (elastic#36479)
  Add default methods to DocValueFormat (elastic#36480)
  Fix the mixed cluster REST test explain/11_basic_with_types.
  Modify `BigArrays` to take name of circuit breaker (elastic#36461)
  Move LoggedExec to minimumRuntime source set (elastic#36453)
  added 6.5.4 version
  Add test logging for elastic#35644
  Tests- added helper methods to ESRestTestCase for checking warnings (elastic#36443)
  SQL: move requests' parameters to requests JSON body (elastic#36149)
  [Zen2] Respect the no_master_block setting (elastic#36478)
  Require soft-deletes when access changes snapshot (elastic#36446)
  Switch more tests to zen2 (elastic#36367)
  [Painless] Add extensive tests for def to primitive casts (elastic#36455)
  Add setting to bypass Rollover action (elastic#36235)
  Try running CI against Zulu (elastic#36391)
  [DOCS] Reworked the shard allocation filtering info.  (elastic#36456)
  Log [initial_master_nodes] on formation failure (elastic#36466)
  converting ForbiddenPatternsTask to .java (elastic#36194)
  fixed typo
  ...
Labels
:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >non-issue v7.0.0-beta1
4 participants