Add elasticsearch-node detach-cluster tool #37979
@@ -0,0 +1,79 @@
/*
 * Licensed to Elasticsearch under one or more contributor
 * license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright
 * ownership. Elasticsearch licenses this file to you under
 * the Apache License, Version 2.0 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
package org.elasticsearch.cluster.coordination;

import joptsimple.OptionSet;
import org.elasticsearch.cli.Terminal;
import org.elasticsearch.cluster.metadata.Manifest;
import org.elasticsearch.cluster.metadata.MetaData;
import org.elasticsearch.common.collect.Tuple;
import org.elasticsearch.env.Environment;

import java.io.IOException;
import java.nio.file.Path;

public class DetachClusterCommand extends ElasticsearchNodeCommand {

    static final String NODE_DETACHED_MSG = "Node was successfully detached from the cluster";
    static final String CONFIRMATION_MSG =
        "--------------------------------------------------------------------------\n" +
        "\n" +
        "You should run this tool only if you have permanently lost all\n" +
        "your master-eligible nodes, and you cannot restore the cluster\n" +
Review comment: We also recommend using it when a majority of master-eligible nodes has been lost, no?
Reply: See below, "or you have already run ..."
Reply: Looks OK to me.
        "from a snapshot, or you have already run `elasticsearch-node unsafe-bootstrap`\n" +
        "on master-eligible node that formed cluster with this node.\n" +
Review comment: Suggested change
        "This tool can result in arbitrary data loss and should be\n" +
Review comment: Perhaps: "Usage of this tool can result in data loss and should be a means of last resort."
Reply: This sentence is just copied from the unsafe-bootstrap command.
Reply: Sure, that still doesn't make it great :)
Reply: I think the original is OK, but suggest this as an alternative:
        "the last resort.\n" +
        "Do you want to proceed?\n";

    public DetachClusterCommand() {
        super("Detaches this node from the cluster with old UUID, allowing it to join new cluster");
Review comment: Suggested change
    }

    @Override
    protected void execute(Terminal terminal, OptionSet options, Environment env) throws Exception {
        super.execute(terminal, options, env);

        processNodePathsWithLock(terminal, options, env);

        terminal.println(NODE_DETACHED_MSG);
    }

    @Override
    protected void processNodePaths(Terminal terminal, Path[] dataPaths) throws IOException {
        final Tuple<Manifest, MetaData> manifestMetaDataTuple = loadMetaData(terminal, dataPaths);
        final Manifest manifest = manifestMetaDataTuple.v1();
        final MetaData metaData = manifestMetaDataTuple.v2();

        confirm(terminal, CONFIRMATION_MSG);

        final CoordinationMetaData coordinationMetaData = CoordinationMetaData.builder()
            .lastAcceptedConfiguration(CoordinationMetaData.VotingConfiguration.MUST_JOIN_ELECTED_MASTER)
            .lastCommittedConfiguration(CoordinationMetaData.VotingConfiguration.MUST_JOIN_ELECTED_MASTER)
            .build();
        final MetaData newMetaData = MetaData.builder(metaData)
            .version(0)
Review comment: Why set this to 0? Is this necessary?
            .coordinationMetaData(coordinationMetaData)
            .clusterUUID(MetaData.UNKNOWN_CLUSTER_UUID)
Review comment: We can keep the cluster UUID, and just set
            .clusterUUIDCommitted(false)
            .build();

        writeNewMetaData(terminal, manifest, 0, 0, metaData, newMetaData, dataPaths);
Review comment: We can keep the cluster state version, and just set the term to 0.
    }
}
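The review comments above propose a narrower reset than the one in this diff: keep the cluster UUID and the cluster state version, and reset only the term. A minimal sketch of that variant follows. It uses a simplified stand-in type, not the Elasticsearch `MetaData` or `Manifest` classes, and the names `DetachSketch`, `NodeState` and `detachKeepingIdentity` are hypothetical. The truncated comment does not say which field to set alongside the kept UUID; the sketch assumes it is the "committed" flag, because that is the next line in the diff.

```java
// Simplified model of the state that detach-cluster rewrites. These types are
// illustrative stand-ins (an assumption), not Elasticsearch classes.
final class DetachSketch {
    // Placeholder node id, taken from the failure message quoted later in this review.
    static final String MUST_JOIN_ELECTED_MASTER_ID = "_must_join_elected_master_";

    static final class NodeState {
        final String clusterUUID;
        final boolean clusterUUIDCommitted;
        final long term;
        final long version;
        final String votingConfigNodeId;

        NodeState(String clusterUUID, boolean clusterUUIDCommitted, long term, long version,
                  String votingConfigNodeId) {
            this.clusterUUID = clusterUUID;
            this.clusterUUIDCommitted = clusterUUIDCommitted;
            this.term = term;
            this.version = version;
            this.votingConfigNodeId = votingConfigNodeId;
        }
    }

    // Variant discussed in review: keep the UUID and the version, reset the term to 0,
    // mark the UUID as uncommitted (assumed), and require joining an elected master.
    static NodeState detachKeepingIdentity(NodeState old) {
        return new NodeState(old.clusterUUID, false, 0L, old.version, MUST_JOIN_ELECTED_MASTER_ID);
    }
}
```

Whether keeping the UUID and version is safe depends on how the coordination layer treats an uncommitted UUID, which this sketch does not model.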
@@ -0,0 +1,151 @@
/*
 * Licensed to Elasticsearch under one or more contributor
 * license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright
 * ownership. Elasticsearch licenses this file to you under
 * the Apache License, Version 2.0 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
package org.elasticsearch.cluster.coordination;

import joptsimple.OptionParser;
import joptsimple.OptionSet;
import joptsimple.OptionSpec;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.lucene.store.LockObtainFailedException;
import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.cli.EnvironmentAwareCommand;
import org.elasticsearch.cli.Terminal;
import org.elasticsearch.cluster.ClusterModule;
import org.elasticsearch.cluster.metadata.Manifest;
import org.elasticsearch.cluster.metadata.MetaData;
import org.elasticsearch.common.collect.Tuple;
import org.elasticsearch.common.xcontent.NamedXContentRegistry;
import org.elasticsearch.env.Environment;
import org.elasticsearch.env.NodeEnvironment;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Objects;

public abstract class ElasticsearchNodeCommand extends EnvironmentAwareCommand {
    private static final Logger logger = LogManager.getLogger(ElasticsearchNodeCommand.class);
    protected final NamedXContentRegistry namedXContentRegistry;
    static final String STOP_WARNING_MSG =
        "--------------------------------------------------------------------------\n" +
        "\n" +
        " WARNING: Elasticsearch MUST be stopped before running this tool." +
        "\n";
    static final String FAILED_TO_OBTAIN_NODE_LOCK_MSG = "failed to lock node's directory, is Elasticsearch still running?";
    static final String NO_NODE_FOLDER_FOUND_MSG = "no node folder is found in data folder(s), node has not been started yet?";
    static final String NO_MANIFEST_FILE_FOUND_MSG = "no manifest file is found, do you run pre 7.0 Elasticsearch?";
    static final String GLOBAL_GENERATION_MISSING_MSG = "no metadata is referenced from the manifest file, cluster has never been " +
        "bootstrapped?";
    static final String NO_GLOBAL_METADATA_MSG = "failed to find global metadata, metadata corrupted?";
    static final String WRITE_METADATA_EXCEPTION_MSG = "exception occurred when writing new metadata to disk";
    static final String ABORTED_BY_USER_MSG = "aborted by user";
    final OptionSpec<Integer> nodeOrdinalOption;

    public ElasticsearchNodeCommand(String description) {
        super(description);
        nodeOrdinalOption = parser.accepts("ordinal", "Optional node ordinal, 0 if not specified")
            .withRequiredArg().ofType(Integer.class);
        namedXContentRegistry = new NamedXContentRegistry(ClusterModule.getNamedXWriteables());
    }

    protected void processNodePathsWithLock(Terminal terminal, OptionSet options, Environment env) throws IOException {
        terminal.println(Terminal.Verbosity.VERBOSE, "Obtaining lock for node");
        Integer nodeOrdinal = nodeOrdinalOption.value(options);
        if (nodeOrdinal == null) {
            nodeOrdinal = 0;
        }
        try (NodeEnvironment.NodeLock lock = new NodeEnvironment.NodeLock(nodeOrdinal, logger, env, Files::exists)) {
            final Path[] dataPaths =
                Arrays.stream(lock.getNodePaths()).filter(Objects::nonNull).map(p -> p.path).toArray(Path[]::new);
            if (dataPaths.length == 0) {
                throw new ElasticsearchException(NO_NODE_FOLDER_FOUND_MSG);
            }
            processNodePaths(terminal, dataPaths);
        } catch (LockObtainFailedException ex) {
            throw new ElasticsearchException(
                FAILED_TO_OBTAIN_NODE_LOCK_MSG + " [" + ex.getMessage() + "]");
        }
    }

    protected Tuple<Manifest, MetaData> loadMetaData(Terminal terminal, Path[] dataPaths) throws IOException {
        terminal.println(Terminal.Verbosity.VERBOSE, "Loading manifest file");
        final Manifest manifest = Manifest.FORMAT.loadLatestState(logger, namedXContentRegistry, dataPaths);

        if (manifest == null) {
            throw new ElasticsearchException(NO_MANIFEST_FILE_FOUND_MSG);
        }
        if (manifest.isGlobalGenerationMissing()) {
            throw new ElasticsearchException(GLOBAL_GENERATION_MISSING_MSG);
        }
        terminal.println(Terminal.Verbosity.VERBOSE, "Loading global metadata file");
        final MetaData metaData = MetaData.FORMAT.loadGeneration(logger, namedXContentRegistry, manifest.getGlobalGeneration(),
            dataPaths);
        if (metaData == null) {
            throw new ElasticsearchException(NO_GLOBAL_METADATA_MSG + " [generation = " + manifest.getGlobalGeneration() + "]");
        }

        return Tuple.tuple(manifest, metaData);
    }

    protected void confirm(Terminal terminal, String msg) {
        terminal.println(msg);
        String text = terminal.readText("Confirm [y/N] ");
        if (text.equalsIgnoreCase("y") == false) {
            throw new ElasticsearchException(ABORTED_BY_USER_MSG);
        }
    }

    @Override
    protected void execute(Terminal terminal, OptionSet options, Environment env) throws Exception {
        terminal.println(STOP_WARNING_MSG);
    }

    protected abstract void processNodePaths(Terminal terminal, Path[] dataPaths) throws IOException;

    protected void writeNewMetaData(Terminal terminal, Manifest oldManifest, long newCurrentTerm, long newVersion,
                                    MetaData oldMetaData, MetaData newMetaData, Path[] dataPaths) {
        try {
            terminal.println(Terminal.Verbosity.VERBOSE,
                "[clusterUUID = " + oldMetaData.clusterUUID() + ", committed = " + oldMetaData.clusterUUIDCommitted() + "] => " +
                "[clusterUUID = " + newMetaData.clusterUUID() + ", committed = " + newMetaData.clusterUUIDCommitted() + "]");
            terminal.println(Terminal.Verbosity.VERBOSE, "New coordination metadata is " + newMetaData.coordinationMetaData());
            terminal.println(Terminal.Verbosity.VERBOSE, "Writing new global metadata to disk");
            long newGeneration = MetaData.FORMAT.write(newMetaData, dataPaths);
            Manifest newManifest = new Manifest(newCurrentTerm, newVersion, newGeneration,
                oldManifest.getIndexGenerations());
            terminal.println(Terminal.Verbosity.VERBOSE, "New manifest is " + newManifest);
            terminal.println(Terminal.Verbosity.VERBOSE, "Writing new manifest file to disk");
            Manifest.FORMAT.writeAndCleanup(newManifest, dataPaths);
            terminal.println(Terminal.Verbosity.VERBOSE, "Cleaning up old metadata");
            MetaData.FORMAT.cleanupOldFiles(newGeneration, dataPaths);
        } catch (Exception e) {
            terminal.println(Terminal.Verbosity.VERBOSE, "Cleaning up new metadata");
            MetaData.FORMAT.cleanupOldFiles(oldManifest.getGlobalGeneration(), dataPaths);
            throw new ElasticsearchException(WRITE_METADATA_EXCEPTION_MSG, e);
        }
    }

    //package-private for testing
    OptionParser getParser() {
        return parser;
    }
}
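The `confirm` step in the class above is the only gate between the user and a destructive metadata rewrite, so its default-to-no behaviour is worth isolating. The sketch below reproduces that logic in a self-contained form. The `Function<String, String>` parameter is a hypothetical stand-in for `Terminal.readText`, `IllegalStateException` stands in for `ElasticsearchException`, and the null check is an added assumption about how a closed input stream might be handled, which the diff itself does not do.

```java
import java.util.function.Function;

// Stand-alone sketch of the y/N confirmation gate; ConfirmSketch is a hypothetical name.
final class ConfirmSketch {
    static final String ABORTED_BY_USER_MSG = "aborted by user";

    // readText stands in for Terminal.readText(prompt) and returns the user's answer.
    static void confirm(Function<String, String> readText, String msg) {
        System.out.println(msg);
        String text = readText.apply("Confirm [y/N] ");
        // Anything other than "y" or "Y" aborts, including empty input.
        // The null guard is an assumption added here, not part of the diff.
        if (text == null || text.equalsIgnoreCase("y") == false) {
            throw new IllegalStateException(ABORTED_BY_USER_MSG);
        }
    }
}
```

Passing the reader as a function keeps the gate testable without a real terminal, which is roughly what a mock terminal would provide in the project's own tests.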
Review comment: I think this deserves a special case in ClusterFormationFailureHelper (and its tests), as it will yield a somewhat strange message as it is:
master not discovered or elected yet, an election requires a node with id [_must_join_elected_master_], have discovered [] which is not a quorum; discovery will continue using [] from hosts providers and [...] from last-known cluster state; node term 0, last-accepted version 0 in term 0
I suggest:
master not discovered yet and this node was detached from its previous cluster, have discovered []; discovery will continue using [] from hosts providers and [...] from last-known cluster state; node term 0, last-accepted version 0 in term 0
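The special case suggested above could be sketched as a branch on the placeholder voting configuration. This is an illustration only: `FormationFailureSketch` and `describe` are hypothetical names, the real `ClusterFormationFailureHelper` works on cluster state objects rather than plain lists, and only the leading part of each message is reproduced.

```java
import java.util.List;

// Hypothetical sketch of a detached-node special case for the formation failure message.
final class FormationFailureSketch {
    // Placeholder id as it appears in the message quoted in the review comment.
    static final String MUST_JOIN_ELECTED_MASTER_ID = "_must_join_elected_master_";

    static String describe(List<String> votingNodeIds, List<String> discovered) {
        // A voting configuration holding only the placeholder id means the node was detached.
        if (votingNodeIds.size() == 1 && votingNodeIds.contains(MUST_JOIN_ELECTED_MASTER_ID)) {
            return "master not discovered yet and this node was detached from its previous cluster, "
                + "have discovered " + discovered;
        }
        return "master not discovered or elected yet, an election requires a node with id "
            + votingNodeIds + ", have discovered " + discovered + " which is not a quorum";
    }
}
```

The branch avoids telling the user that an election needs a node that cannot exist, which is the confusing part of the current message.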