Fail allocation of new primaries in empty cluster #43284

Merged: 10 commits merged into elastic:master on Sep 30, 2019

Conversation

@Gaurav614 (Contributor) commented Jun 17, 2019:

Today if you create an index in a cluster without any data nodes then it will
report yellow health because it never attempts to assign any shards if there
are no data nodes, so the new shards remain at `AllocationStatus.NO_ATTEMPT`.
This commit moves the new primaries to `AllocationStatus.DECIDERS_NO` in this
situation, causing the cluster health to move to red.

Fixes #41073

Adds a test case covering the scenario where the cluster has no data nodes
and a user tries to create an index. Changing the status of the unassigned
primary shards to `AllocationStatus.DECIDERS_NO` when there are no data nodes
resolves this issue.
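
As a concrete illustration of the behaviour described above, the sketch below checks cluster health after creating an index on a cluster that has only master-eligible nodes. It is a minimal sketch using the 7.x high-level REST client; the endpoint, index name, and node topology are assumptions for illustration and are not part of this PR.

```java
// Minimal sketch, not part of this PR: assumes a 7.x cluster reachable on localhost:9200
// whose only nodes are master-eligible (no data nodes), and the high-level REST client.
import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.cluster.health.ClusterHealthRequest;
import org.elasticsearch.action.support.ActiveShardCount;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.cluster.health.ClusterHealthStatus;

public class EmptyClusterHealthDemo {
    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            // Create an index whose primaries can never be assigned; don't wait for shards.
            client.indices().create(
                new CreateIndexRequest("test").waitForActiveShards(ActiveShardCount.NONE),
                RequestOptions.DEFAULT);
            ClusterHealthStatus status = client.cluster()
                .health(new ClusterHealthRequest("test"), RequestOptions.DEFAULT)
                .getStatus();
            // Before this change: YELLOW (shards left at NO_ATTEMPT).
            // After this change: RED (shards marked DECIDERS_NO).
            System.out.println("health of [test]: " + status);
        }
    }
}
```
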
@danielmitterdorfer added the :Distributed Coordination/Allocation, >bug, and v8.0.0 labels on Jun 18, 2019
@elasticmachine (Collaborator) commented:

Pinging @elastic/es-distributed

@Gaurav614 (Contributor, Author) commented:

Hey! Any comments on the PR?

@ywelsch requested a review from DaveCTurner on July 1, 2019 07:16
@DaveCTurner (Contributor) commented Jul 4, 2019:

@elasticmachine update branch

@DaveCTurner (Contributor) left a review comment:

Thanks @Gaurav614 for taking this on. There have been a few changes in master that mean your branch no longer compiles, but it will be simple enough to fix it up. I have left some small suggestions, including an idea for a slightly different test that covers a few more cases in a simpler way.

@DaveCTurner changed the title from "Bugfix of issue #41073" to "Fail allocation of new primaries in empty cluster" on Jul 4, 2019
@DaveCTurner (Contributor) commented:

[image omitted]

Please don't force-push to PR branches. There was no harm done in this case, but force-pushing loses history and review comments.

@Gaurav614 (Contributor, Author) commented:

@DaveCTurner Thanks for reviewing the pull request. I am working on addressing the review comments.

@DaveCTurner (Contributor) commented:

Hi @Gaurav614, just checking if you're still working on this and if you need any help?

@Gaurav614 (Contributor, Author) commented:

> Hi @Gaurav614, just checking if you're still working on this and if you need any help?

@DaveCTurner Thanks for offering to help, and apologies for the delay; I was out due to a personal emergency. I am still working on it.
I will push a new revision soon and will ping you in case I need any help.

@colings86 added the v7.5.0 label and removed the v7.4.0 label on Aug 30, 2019
@DaveCTurner (Contributor) commented:

Hey @Gaurav614, just checking in again. Are you still working on this PR?

@Gaurav614 (Contributor, Author) commented:

> Hey @Gaurav614, just checking in again. Are you still working on this PR?

Hey @DaveCTurner, I pushed the new commits on 25th July and requested your review. Kindly take a look.

@DaveCTurner (Contributor) left a review comment:

Thanks @Gaurav614; the last message I received on this PR indicated there'd be more changes so I was waiting for them to land before taking a look. I left some more suggestions. Could you merge the latest master too?

@@ -141,6 +143,24 @@ public ShardAllocationDecision decideShardAllocation(final ShardRouting shard, f
return new ShardAllocationDecision(allocateUnassignedDecision, moveDecision);
}

private void failedAllocationOfNewPrimaries(RoutingAllocation allocation){

Minor change:

Suggested change:
- private void failedAllocationOfNewPrimaries(RoutingAllocation allocation){
+ private void failAllocationOfNewPrimaries(RoutingAllocation allocation) {

(change name to the imperative mood, and fix whitespace)

RoutingNodes routingNodes = allocation.routingNodes();
assert routingNodes.size() == 0 : routingNodes;
RoutingNodes.UnassignedShards unassignedShards = routingNodes.unassigned();
RoutingNodes.UnassignedShards.UnassignedIterator unassignedIterator = unassignedShards.iterator();

Suggested change:
- RoutingNodes.UnassignedShards.UnassignedIterator unassignedIterator = unassignedShards.iterator();
+ final RoutingNodes.UnassignedShards.UnassignedIterator unassignedIterator = routingNodes.unassigned().iterator();

private void failedAllocationOfNewPrimaries(RoutingAllocation allocation){
RoutingNodes routingNodes = allocation.routingNodes();
assert routingNodes.size() == 0 : routingNodes;
RoutingNodes.UnassignedShards unassignedShards = routingNodes.unassigned();

Suggested change (remove this line):
- RoutingNodes.UnassignedShards unassignedShards = routingNodes.unassigned();

RoutingNodes.UnassignedShards unassignedShards = routingNodes.unassigned();
RoutingNodes.UnassignedShards.UnassignedIterator unassignedIterator = unassignedShards.iterator();
while (unassignedIterator.hasNext()) {
ShardRouting shardRouting = unassignedIterator.next();

Suggested change:
- ShardRouting shardRouting = unassignedIterator.next();
+ final ShardRouting shardRouting = unassignedIterator.next();

RoutingNodes.UnassignedShards.UnassignedIterator unassignedIterator = unassignedShards.iterator();
while (unassignedIterator.hasNext()) {
ShardRouting shardRouting = unassignedIterator.next();
UnassignedInfo unassignedInfo = shardRouting.unassignedInfo();

Suggested change:
- UnassignedInfo unassignedInfo = shardRouting.unassignedInfo();
+ final UnassignedInfo unassignedInfo = shardRouting.unassignedInfo();
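
Putting the excerpts and the suggestions in this review together, the method under discussion takes roughly the shape sketched below. The fragments above do not show the loop body that rewrites the unassigned info, so the `if` condition and the `updateUnassigned` call here are assumptions based on the PR description rather than the merged code.

```java
// Sketch assembled from the review excerpts plus the suggested rename and final modifiers.
// The condition and the UnassignedInfo rewrite are assumptions; only the surrounding
// structure appears in the fragments above.
private void failAllocationOfNewPrimaries(RoutingAllocation allocation) {
    RoutingNodes routingNodes = allocation.routingNodes();
    assert routingNodes.size() == 0 : routingNodes;
    final RoutingNodes.UnassignedShards.UnassignedIterator unassignedIterator = routingNodes.unassigned().iterator();
    while (unassignedIterator.hasNext()) {
        final ShardRouting shardRouting = unassignedIterator.next();
        final UnassignedInfo unassignedInfo = shardRouting.unassignedInfo();
        if (shardRouting.primary()
                && unassignedInfo.getLastAllocationStatus() == UnassignedInfo.AllocationStatus.NO_ATTEMPT) {
            // Rebuild the UnassignedInfo with DECIDERS_NO so cluster health reports red
            // instead of leaving the new primary at NO_ATTEMPT (which reports yellow).
            unassignedIterator.updateUnassigned(
                new UnassignedInfo(unassignedInfo.getReason(), unassignedInfo.getMessage(),
                    unassignedInfo.getFailure(), unassignedInfo.getNumFailedAllocations(),
                    unassignedInfo.getUnassignedTimeInNanos(), unassignedInfo.getUnassignedTimeInMillis(),
                    unassignedInfo.isDelayed(), UnassignedInfo.AllocationStatus.DECIDERS_NO,
                    unassignedInfo.getFailedNodeIds()),
                shardRouting.recoverySource(), allocation.changes());
        }
    }
}
```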

routingTable = RoutingTable.builder(routingTable).remove("test").build();
metaData = MetaData.builder(clusterState.metaData()).remove("test").build();
clusterState = ClusterState.builder(clusterState).routingTable(routingTable).metaData(metaData).build();
assertTrue(clusterState.nodes().getDataNodes().size() == 0);

As above, prefer assertEquals here.

Suggested change:
- assertTrue(clusterState.nodes().getDataNodes().size() == 0);
+ assertEquals(0, clusterState.nodes().getDataNodes().size());

.nodes(DiscoveryNodes.builder(clusterState.getNodes())
.remove(nodeName).build())
.build();
clusterState = createAllocationService().deassociateDeadNodes(clusterState, true, "reroute");

I think we should pass the allocationService into this method rather than creating a new one here. Also there's no need to mutate clusterState:

Suggested change:
- clusterState = createAllocationService().deassociateDeadNodes(clusterState, true, "reroute");
+ return allocationService.deassociateDeadNodes(clusterState, true, "reroute");


DiscoveryNodes.Builder nodeBuilder = DiscoveryNodes.builder(clusterState.getNodes());
if (isMaster) {
nodeBuilder = nodeBuilder.add(newNode(nodeName, Collections.singleton(DiscoveryNode.Role.MASTER)));
@DaveCTurner (Contributor) commented Sep 2, 2019:

No need to mutate nodeBuilder here:

Suggested change:
- nodeBuilder = nodeBuilder.add(newNode(nodeName, Collections.singleton(DiscoveryNode.Role.MASTER)));
+ nodeBuilder.add(newNode(nodeName, Collections.singleton(DiscoveryNode.Role.MASTER)));

if (isMaster) {
nodeBuilder = nodeBuilder.add(newNode(nodeName, Collections.singleton(DiscoveryNode.Role.MASTER)));
} else {
nodeBuilder = nodeBuilder.add(newNode(nodeName, Collections.singleton(DiscoveryNode.Role.DATA)));
@DaveCTurner (Contributor) commented Sep 2, 2019:

No need to mutate nodeBuilder here:

Suggested change:
- nodeBuilder = nodeBuilder.add(newNode(nodeName, Collections.singleton(DiscoveryNode.Role.DATA)));
+ nodeBuilder.add(newNode(nodeName, Collections.singleton(DiscoveryNode.Role.DATA)));

} else {
nodeBuilder = nodeBuilder.add(newNode(nodeName, Collections.singleton(DiscoveryNode.Role.DATA)));
}
clusterState = ClusterState.builder(clusterState)

No need to mutate clusterState here:

Suggested change:
- clusterState = ClusterState.builder(clusterState)
+ return ClusterState.builder(clusterState).nodes(nodeBuilder).build();
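
The test-related suggestions above all push in the same direction: small helpers that build and return a new ClusterState instead of mutating local variables, with the AllocationService passed in rather than recreated. A sketch of such helpers is shown below; the helper names and the newNode(...) utility (assumed to come from an ESAllocationTestCase-style test class) are assumptions, and only the bodies follow the suggested changes.

```java
// Sketch of the non-mutating test helpers the suggestions above point towards.
// Helper names are assumptions; newNode(...) is assumed to come from the surrounding
// ESAllocationTestCase-style test class.
private ClusterState addNode(ClusterState clusterState, String nodeName, boolean isMaster) {
    final DiscoveryNodes.Builder nodeBuilder = DiscoveryNodes.builder(clusterState.getNodes());
    if (isMaster) {
        nodeBuilder.add(newNode(nodeName, Collections.singleton(DiscoveryNode.Role.MASTER)));
    } else {
        nodeBuilder.add(newNode(nodeName, Collections.singleton(DiscoveryNode.Role.DATA)));
    }
    return ClusterState.builder(clusterState).nodes(nodeBuilder).build();
}

private ClusterState removeNode(ClusterState clusterState, String nodeName, AllocationService allocationService) {
    // Drop the node from the discovery nodes and let the allocation service clean up
    // any shards that were assigned to it.
    return allocationService.deassociateDeadNodes(
        ClusterState.builder(clusterState)
            .nodes(DiscoveryNodes.builder(clusterState.getNodes()).remove(nodeName).build())
            .build(),
        true, "reroute");
}
```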

@DaveCTurner (Contributor) commented:

@elasticmachine test this please

@DaveCTurner (Contributor) left a review comment:

LGTM. Nice work @Gaurav614, thanks for the extra iterations to get this so polished. I merged master, fixed a couple of whitespace issues, and triggered a CI run.

@DaveCTurner (Contributor) commented:

CI failures were apparently expected, see 3c57592.

@elasticmachine test this please

@DaveCTurner merged commit d220d53 into elastic:master on Sep 30, 2019
DaveCTurner pushed a commit that referenced this pull request Sep 30, 2019
Today if you create an index in a cluster without any data nodes then it will
report yellow health because it never attempts to assign any shards if there
are no data nodes, so the new shards remain at `AllocationStatus.NO_ATTEMPT`.
This commit moves the new primaries to `AllocationStatus.DECIDERS_NO` in this
situation, causing the cluster health to move to red.

Fixes #41073
@Gaurav614 (Contributor, Author) commented:

@DaveCTurner Thanks a lot for merging the changes!

Labels
>bug, :Distributed Coordination/Allocation, v7.5.0, v8.0.0-alpha1
Development

Successfully merging this pull request may close these issues.

A cluster with no data nodes or indices reports YELLOW health when an index is created.
6 participants