Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating a transport action for the CoordinationDiagnosticsService #87984

Conversation

masseyke
Copy link
Member

@masseyke masseyke commented Jun 23, 2022

This exposes the CoordinationDiagnosticsService (#87672) through a transport action so that it can be called remotely as
part of the health API in the event that: (1) there has been no master recently, (2) there are master-eligible nodes in the cluster, (3) none are elected, and (4) the current node is not master eligible. In the diagram below, this action will be used in the 1.2.2.3 branch (it is the long arrow back to branch 1). The logic that actually uses it will be within the CoordinationDiagnosticsService, and will be in a follow-up PR. Requests for this action will be sent directly to the remote node, and the assumption is that the cluster that likely does not have a master node.
2022-04-18 Coordination health

@elasticsearchmachine
Copy link
Collaborator

Hi @masseyke, I've created a changelog YAML for you.

masseyke added 3 commits June 23, 2022 12:59
…f github.com:masseyke/elasticsearch into feature/health-api-master-stability-transport-action
@masseyke masseyke marked this pull request as ready for review June 23, 2022 18:29
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Jun 23, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@masseyke masseyke requested a review from andreidan June 23, 2022 18:29
@masseyke
Copy link
Member Author

@elasticmachine update branch

Copy link
Contributor

@andreidan andreidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this Keith.
Left a few suggestions

protected void doExecute(Task task, CoordinationDiagnosticsAction.Request request, ActionListener<Response> listener) {
listener.onResponse(
new Response(
new CoordinationDiagnosticsService(clusterService, coordinator, masterHistoryService).diagnoseMasterStability(
Copy link
Contributor

@andreidan andreidan Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we inject the service or create it once in the constructor? are there thread-safety concerns if we do so? (I'd think there shouldn't be as it's only reading things here)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think you're right -- I don't think there would be any thread-safety problems.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And on second look, I'm not sure why I wasn't injecting that in the first place.

import static org.elasticsearch.cluster.coordination.CoordinationDiagnosticsService.CoordinationDiagnosticsResult;

/**
* This action exposes CoordinationDiagnosticsService.diagnoseMasterStability() so that a node can get a remote node's view of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: {@link CoordinationDiagnosticsService#diagnoseMasterStability}

}

/**
* This transport action calls the CoordinationDiagnosticsService on a remote node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This doc is rather misleading to me. The transport action exposes the functionality via the transport service (as opposed to calling a service on a remote node)

YELLOW((byte) 2),
RED((byte) 3);

private final byte value;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this value field is needed (I think it adds a bit of confusion).

Enums have ordinals already.

Let's remove the value field and use in.readEnum and out.writeEnum to put it on the wire (it'll only write the enum ordinal).

What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. Somehow I didn't realize that those methods existed. Switching to use them now.

@masseyke masseyke requested a review from andreidan June 28, 2022 14:50
@masseyke
Copy link
Member Author

@elasticmachine update branch

Copy link
Contributor

@andreidan andreidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for iterating on this Keith

Left a minor suggestion

public TransportAction(
TransportService transportService,
ActionFilters actionFilters,
CoordinationDiagnosticsService coordinationDiagnosticsService
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍


@Override
protected void doExecute(Task task, CoordinationDiagnosticsAction.Request request, ActionListener<Response> listener) {
listener.onResponse(new Response(coordinationDiagnosticsService.diagnoseMasterStability(request.explain)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we use ActionRunnable.wrap here to make sure that a potential exception from diagnoseMasterStability gets indeed propagated to listener.onFailure?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's already handled for us in the parent class here right? https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/action/support/TransportAction.java#L79. I'll confirm that that is where it is called from.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you're right. TIL. Thanks for looking into this!

@masseyke masseyke merged commit cd0d5c8 into elastic:master Jun 29, 2022
@masseyke masseyke deleted the feature/health-api-master-stability-transport-action branch June 29, 2022 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants