Clarify allocation explain if random shard chosen #75670

Merged
2 changes: 2 additions & 0 deletions docs/reference/cluster/allocation-explain.asciidoc
@@ -25,6 +25,8 @@ GET _cluster/allocation/explain

`GET _cluster/allocation/explain`

`POST _cluster/allocation/explain`

Good catch. I assume POST should already have been documented and we simply missed it, but please correct me if that is not the case. If so, should we also include examples for POST, similar to the ones we have for GET, or add a note that the two are equivalent?

Contributor Author

@DaveCTurner, Aug 2, 2021

The convention is that every GET API that takes a body should work the same way with POST; see e.g. https://www.elastic.co/guide/en/elasticsearch/reference/current/api-conventions.html#api-conventions

I've added this here to emphasise that you can send a request with a body. Some tooling makes it hard to execute a GET with a body, so you have to switch it into POST mode.


Got it, @DaveCTurner thanks for clarifying. Good to know there is general guidance in the documentation already.
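
As a minimal sketch of the point made in this thread, the following builds (without sending) a POST request that carries an explain body, using the JDK's `java.net.http` client. The class name `ExplainRequestSketch`, the helper `buildExplainRequest`, and the base URL are illustrative assumptions and not part of this PR; the body fields `index`, `shard`, and `primary` follow the body description in the API spec touched by this PR.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ExplainRequestSketch {
    // Hypothetical helper: builds, but does not send, a POST to the allocation explain endpoint.
    public static HttpRequest buildExplainRequest(String baseUrl, String index, int shard, boolean primary) {
        // Naive JSON formatting for illustration only; it does not escape the index name.
        String body = String.format("{\"index\":\"%s\",\"shard\":%d,\"primary\":%b}", index, shard, primary);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_cluster/allocation/explain"))
            .header("Content-Type", "application/json")
            // POST is used because some tooling cannot attach a body to a GET.
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        // The base URL is an assumption; adjust it to your cluster.
        HttpRequest request = buildExplainRequest("http://localhost:9200", "idx", 0, true);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the same body with GET should be equivalent according to the convention quoted above; the sketch only shows the POST form.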


[[cluster-allocation-explain-api-prereqs]]
==== {api-prereq-title}

@@ -32,7 +32,7 @@
}
},
"body":{
-"description":"The index, shard, and primary flag to explain. Empty means 'explain the first unassigned shard'"
+"description":"The index, shard, and primary flag to explain. Empty means 'explain a randomly-chosen unassigned shard'"
}
}
}
@@ -8,6 +8,7 @@

package org.elasticsearch.action.admin.cluster.allocation;

import org.elasticsearch.Version;
import org.elasticsearch.cluster.ClusterInfo;
import org.elasticsearch.cluster.node.DiscoveryNode;
import org.elasticsearch.cluster.routing.ShardRouting;
@@ -36,15 +37,27 @@
*/
public final class ClusterAllocationExplanation implements ToXContentObject, Writeable {

static final String NO_SHARD_SPECIFIED_MESSAGE = "No shard was specified in the explain API request, so this response " +
"explains a randomly chosen unassigned shard. There may be other unassigned shards in this cluster which cannot be assigned for " +
"different reasons. It may not be possible to assign this shard until one of the other shards is assigned correctly. To explain " +
"the allocation of other shards (whether assigned or unassigned) you must specify the target shard in the request to this API.";

private final boolean specificShard;
private final ShardRouting shardRouting;
private final DiscoveryNode currentNode;
private final DiscoveryNode relocationTargetNode;
private final ClusterInfo clusterInfo;
private final ShardAllocationDecision shardAllocationDecision;

-public ClusterAllocationExplanation(ShardRouting shardRouting, @Nullable DiscoveryNode currentNode,
-@Nullable DiscoveryNode relocationTargetNode, @Nullable ClusterInfo clusterInfo,
-ShardAllocationDecision shardAllocationDecision) {
+public ClusterAllocationExplanation(
+boolean specificShard,
+ShardRouting shardRouting,
+@Nullable DiscoveryNode currentNode,
+@Nullable DiscoveryNode relocationTargetNode,
+@Nullable ClusterInfo clusterInfo,
+ShardAllocationDecision shardAllocationDecision) {

this.specificShard = specificShard;
this.shardRouting = shardRouting;
this.currentNode = currentNode;
this.relocationTargetNode = relocationTargetNode;
@@ -53,6 +66,11 @@ public ClusterAllocationExplanation(ShardRouting shardRouting, @Nullable Discove
}

public ClusterAllocationExplanation(StreamInput in) throws IOException {
if (in.getVersion().onOrAfter(Version.V_8_0_0)) {

I assume BwC stands for backwards compatibility, is that correct? (I've seen it several times already, but I figured I should finally confirm).

I assume that adding an extra field in the API response is not a breaking change, is that correct?

I am also assuming we are being extra cautious and not introducing it in 7.x, but we could have technically done so, had we wanted to, is that correct?

Contributor Author

That's correct, BwC is short for backwards compatibility.

Adding an extra field to the response is indeed not considered a breaking change: we expect clients to skip fields they don't recognise.

The labels on the PR indicate how this will be backported, so yes, this will go into 7.x. However, we have to start out with the change being 8.0-only because it needs to pass all the mixed-version tests in CI first. We'll then backport it and (simultaneously) adjust the version on this line to match. It's quite a complex dance.


Got it, @DaveCTurner, thanks a lot for the context.
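
The version gating discussed in this thread can be illustrated outside Elasticsearch with plain `java.io` streams. This is a simplified sketch, not the PR's code: the integer `peerVersion`, the constant `FIELD_ADDED_IN`, and the class and method names are hypothetical stand-ins for `Version`, `StreamInput`, and `StreamOutput`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class VersionGatedFieldSketch {
    // Hypothetical wire version in which the new boolean first appears.
    static final int FIELD_ADDED_IN = 8;

    // Writes the new flag only when the peer is new enough to expect it.
    public static byte[] write(int peerVersion, boolean specificShard, String payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (peerVersion >= FIELD_ADDED_IN) {
                out.writeBoolean(specificShard);
            }
            out.writeUTF(payload); // stands in for the rest of the message
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the flag if the peer sent it; otherwise defaults to true so that
    // no "random shard" note is shown for messages from older peers.
    public static boolean readSpecificShard(int peerVersion, byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            boolean specificShard = peerVersion >= FIELD_ADDED_IN ? in.readBoolean() : true;
            in.readUTF(); // consume the rest of the message
            return specificShard;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both sides must agree on the gating version, which is why the backport described above has to adjust the version constant at the same time as the change lands on the older branch.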

this.specificShard = in.readBoolean();
} else {
this.specificShard = true; // suppress "this is a random shard" warning in BwC situations
}
this.shardRouting = new ShardRouting(in);
this.currentNode = in.readOptionalWriteable(DiscoveryNode::new);
this.relocationTargetNode = in.readOptionalWriteable(DiscoveryNode::new);
@@ -62,13 +80,20 @@ public ClusterAllocationExplanation(StreamInput in) throws IOException {

@Override
public void writeTo(StreamOutput out) throws IOException {
if (out.getVersion().onOrAfter(Version.V_8_0_0)) {
out.writeBoolean(specificShard);
} // else suppress "this is a random shard" warning in BwC situations
shardRouting.writeTo(out);
out.writeOptionalWriteable(currentNode);
out.writeOptionalWriteable(relocationTargetNode);
out.writeOptionalWriteable(clusterInfo);
shardAllocationDecision.writeTo(out);
}

public boolean isSpecificShard() {
return specificShard;
}

/**
* Returns the shard that the explanation is about.
*/
@@ -131,6 +156,9 @@ public ShardAllocationDecision getShardAllocationDecision() {

public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
builder.startObject(); {
if (isSpecificShard() == false) {
builder.field("note", NO_SHARD_SPECIFIED_MESSAGE);
}
builder.field("index", shardRouting.getIndexName());
builder.field("shard", shardRouting.getId());
builder.field("primary", shardRouting.primary());
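
To illustrate the effect of the conditional `note` field in this hunk, here is a simplified sketch that uses a `LinkedHashMap` in place of `XContentBuilder`. The class `NoteFieldSketch`, its `render` method, and the shortened note text are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NoteFieldSketch {
    // Abbreviated placeholder; the full text lives in NO_SHARD_SPECIFIED_MESSAGE in the PR.
    static final String NOTE = "No shard was specified in the explain API request (abbreviated)";

    // Emits "note" first, and only when the caller did not name a specific shard.
    public static Map<String, Object> render(boolean specificShard, String index, int shard, boolean primary) {
        Map<String, Object> fields = new LinkedHashMap<>(); // preserves insertion order
        if (specificShard == false) {
            fields.put("note", NOTE);
        }
        fields.put("index", index);
        fields.put("shard", shard);
        fields.put("primary", primary);
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(render(false, "idx", 0, true).keySet()); // [note, index, shard, primary]
        System.out.println(render(true, "idx", 0, true).keySet());  // [index, shard, primary]
    }
}
```

Because the field is purely additive, clients that skip unknown fields should be unaffected by it.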
@@ -81,15 +81,25 @@ protected void masterOperation(Task task, final ClusterAllocationExplainRequest
ShardRouting shardRouting = findShardToExplain(request, allocation);
logger.debug("explaining the allocation for [{}], found shard [{}]", request, shardRouting);

-ClusterAllocationExplanation cae = explainShard(shardRouting, allocation,
-request.includeDiskInfo() ? clusterInfo : null, request.includeYesDecisions(), allocationService);
+ClusterAllocationExplanation cae = explainShard(
+shardRouting,
+allocation,
+request.includeDiskInfo() ? clusterInfo : null,
+request.includeYesDecisions(),
+request.useAnyUnassignedShard() == false,
+allocationService);
listener.onResponse(new ClusterAllocationExplainResponse(cae));
}

// public for testing
-public static ClusterAllocationExplanation explainShard(ShardRouting shardRouting, RoutingAllocation allocation,
-ClusterInfo clusterInfo, boolean includeYesDecisions,
-AllocationService allocationService) {
+public static ClusterAllocationExplanation explainShard(
+ShardRouting shardRouting,
+RoutingAllocation allocation,
+ClusterInfo clusterInfo,
+boolean includeYesDecisions,
+boolean isSpecificShard,
+AllocationService allocationService) {

allocation.setDebugMode(includeYesDecisions ? DebugMode.ON : DebugMode.EXCLUDE_YES_DECISIONS);

ShardAllocationDecision shardDecision;
@@ -99,10 +109,13 @@ public static ClusterAllocationExplanation explainShard(ShardRouting shardRoutin
shardDecision = allocationService.explainShardAllocation(shardRouting, allocation);
}

-return new ClusterAllocationExplanation(shardRouting,
+return new ClusterAllocationExplanation(
+isSpecificShard,
+shardRouting,
 shardRouting.currentNodeId() != null ? allocation.nodes().get(shardRouting.currentNodeId()) : null,
 shardRouting.relocatingNodeId() != null ? allocation.nodes().get(shardRouting.relocatingNodeId()) : null,
-clusterInfo, shardDecision);
+clusterInfo,
+shardDecision);
}

// public for testing
@@ -115,7 +128,9 @@ public static ShardRouting findShardToExplain(ClusterAllocationExplainRequest re
foundShard = ui.next();
}
if (foundShard == null) {
-throw new IllegalArgumentException("unable to find any unassigned shards to explain [" + request + "]");
+throw new IllegalArgumentException("No shard was specified in the request which means the response should explain a " +
+"randomly-chosen unassigned shard, but there are no unassigned shards in this cluster. To explain the allocation of " +
+"an assigned shard you must specify the target shard in the request.");
}
} else {
String index = request.getIndex();
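
The no-shard-specified branch in this hunk can be pictured roughly as follows. This is a sketch under assumptions: shards are modelled as plain strings, `pickUnassigned` is a hypothetical helper, and the error text is abbreviated. As in the hunk, it takes whatever the iterator yields first, which callers should treat as an arbitrary choice.

```java
import java.util.Iterator;
import java.util.List;

public class PickUnassignedSketch {
    // Returns the first unassigned shard the iterator yields, or fails with a
    // message telling the caller to name a shard explicitly.
    public static String pickUnassigned(List<String> unassignedShards) {
        Iterator<String> ui = unassignedShards.iterator();
        String foundShard = ui.hasNext() ? ui.next() : null;
        if (foundShard == null) {
            throw new IllegalArgumentException(
                "No shard was specified in the request, but there are no unassigned shards; "
                    + "specify the target shard in the request (abbreviated)");
        }
        return foundShard;
    }

    public static void main(String[] args) {
        System.out.println(pickUnassigned(List.of("idx[0]", "idx[1]")));
        try {
            pickUnassigned(List.of());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```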
@@ -32,6 +32,8 @@
import java.util.Locale;

import static org.elasticsearch.action.admin.cluster.allocation.TransportClusterAllocationExplainAction.findShardToExplain;
import static org.hamcrest.Matchers.allOf;
import static org.hamcrest.Matchers.containsString;

/**
* Tests for the {@link TransportClusterAllocationExplainAction} class.
@@ -46,7 +48,12 @@ public void testInitializingOrRelocatingShardExplanation() throws Exception {
ShardRouting shard = clusterState.getRoutingTable().index("idx").shard(0).primaryShard();
RoutingAllocation allocation = new RoutingAllocation(new AllocationDeciders(Collections.emptyList()),
clusterState.getRoutingNodes(), clusterState, null, null, System.nanoTime());
-ClusterAllocationExplanation cae = TransportClusterAllocationExplainAction.explainShard(shard, allocation, null, randomBoolean(),
+ClusterAllocationExplanation cae = TransportClusterAllocationExplainAction.explainShard(
+shard,
+allocation,
+null,
+randomBoolean(),
+true,
new AllocationService(null, new TestGatewayAllocator(), new ShardsAllocator() {
@Override
public void allocate(RoutingAllocation allocation) {
@@ -64,6 +71,7 @@ public ShardAllocationDecision decideShardAllocation(ShardRouting shard, Routing
}, null, null));

assertEquals(shard.currentNodeId(), cae.getCurrentNode().getId());
assertTrue(cae.isSpecificShard());
assertFalse(cae.getShardAllocationDecision().isDecisionTaken());
assertFalse(cae.getShardAllocationDecision().getAllocateDecision().isDecisionTaken());
assertFalse(cae.getShardAllocationDecision().getMoveDecision().isDecisionTaken());
@@ -110,8 +118,13 @@ public void testFindAnyUnassignedShardToExplain() {
final ClusterState allStartedClusterState = ClusterStateCreationUtils.state("idx", randomBoolean(),
ShardRoutingState.STARTED, ShardRoutingState.STARTED);
final ClusterAllocationExplainRequest anyUnassignedShardsRequest = new ClusterAllocationExplainRequest();
-expectThrows(IllegalArgumentException.class, () ->
-findShardToExplain(anyUnassignedShardsRequest, routingAllocation(allStartedClusterState)));
+assertThat(expectThrows(
+IllegalArgumentException.class,
+() -> findShardToExplain(anyUnassignedShardsRequest, routingAllocation(allStartedClusterState))).getMessage(),
+allOf(
+// no point in asserting the precise wording of the message into this test, but we care that it contains these bits:
+containsString("No shard was specified in the request"),
+containsString("specify the target shard in the request")));
}

public void testFindPrimaryShardToExplain() {
@@ -31,6 +31,9 @@

import static java.util.Collections.emptyMap;
import static java.util.Collections.emptySet;
import static org.hamcrest.Matchers.allOf;
import static org.hamcrest.Matchers.containsString;
import static org.hamcrest.Matchers.equalTo;

/**
* Tests for the cluster allocation explanation
@@ -50,11 +53,12 @@ public void testDecisionEquality() {
}

public void testExplanationSerialization() throws Exception {
-ClusterAllocationExplanation cae = randomClusterAllocationExplanation(randomBoolean());
+ClusterAllocationExplanation cae = randomClusterAllocationExplanation(randomBoolean(), randomBoolean());
BytesStreamOutput out = new BytesStreamOutput();
cae.writeTo(out);
StreamInput in = out.bytes().streamInput();
ClusterAllocationExplanation cae2 = new ClusterAllocationExplanation(in);
assertEquals(cae.isSpecificShard(), cae2.isSpecificShard());
assertEquals(cae.getShard(), cae2.getShard());
assertEquals(cae.isPrimary(), cae2.isPrimary());
assertTrue(cae2.isPrimary());
@@ -73,7 +77,7 @@ public void testExplanationSerialization() throws Exception {
}

public void testExplanationToXContent() throws Exception {
-ClusterAllocationExplanation cae = randomClusterAllocationExplanation(true);
+ClusterAllocationExplanation cae = randomClusterAllocationExplanation(true, true);
XContentBuilder builder = XContentFactory.jsonBuilder();
cae.toXContent(builder, ToXContent.EMPTY_PARAMS);
assertEquals("{\"index\":\"idx\",\"shard\":0,\"primary\":true,\"current_state\":\"started\",\"current_node\":" +
@@ -83,7 +87,25 @@ public void testExplanationToXContent() throws Exception {
"that can both allocate this shard and improve the cluster balance\"}", Strings.toString(builder));
}

-private static ClusterAllocationExplanation randomClusterAllocationExplanation(boolean assignedShard) {
+public void testRandomShardExplanationToXContent() throws Exception {
+ClusterAllocationExplanation cae = randomClusterAllocationExplanation(true, false);
+XContentBuilder builder = XContentFactory.jsonBuilder();
+cae.toXContent(builder, ToXContent.EMPTY_PARAMS);
+final String actual = Strings.toString(builder);
+assertThat(actual, allOf(
+equalTo("{\"note\":\"" + ClusterAllocationExplanation.NO_SHARD_SPECIFIED_MESSAGE +
+"\",\"index\":\"idx\",\"shard\":0,\"primary\":true,\"current_state\":\"started\",\"current_node\":" +
+"{\"id\":\"node-0\",\"name\":\"\",\"transport_address\":\"" + cae.getCurrentNode().getAddress() +
+"\",\"weight_ranking\":3},\"can_remain_on_current_node\":\"yes\",\"can_rebalance_cluster\":\"yes\"," +
+"\"can_rebalance_to_other_node\":\"no\",\"rebalance_explanation\":\"cannot rebalance as no target node exists " +
+"that can both allocate this shard and improve the cluster balance\"}"),
+// no point in asserting the precise wording of the message into this test, but we care that the note contains these bits:
+containsString("No shard was specified in the explain API request"),
+containsString("specify the target shard in the request")
+));
+}
+
+private static ClusterAllocationExplanation randomClusterAllocationExplanation(boolean assignedShard, boolean specificShard) {
ShardRouting shardRouting = TestShardRouting.newShardRouting(new ShardId(new Index("idx", "123"), 0),
assignedShard ? "node-0" : null, true, assignedShard ? ShardRoutingState.STARTED : ShardRoutingState.UNASSIGNED);
DiscoveryNode node = assignedShard ? new DiscoveryNode("node-0", buildNewFakeTransportAddress(), emptyMap(), emptySet(),
@@ -97,6 +119,6 @@ private static ClusterAllocationExplanation randomClusterAllocationExplanation(b
AllocateUnassignedDecision allocateDecision = AllocateUnassignedDecision.no(UnassignedInfo.AllocationStatus.DECIDERS_NO, null);
shardAllocationDecision = new ShardAllocationDecision(allocateDecision, MoveDecision.NOT_TAKEN);
}
-return new ClusterAllocationExplanation(shardRouting, node, null, null, shardAllocationDecision);
+return new ClusterAllocationExplanation(specificShard, shardRouting, node, null, null, shardAllocationDecision);
}
}