-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Creating a transport action for the CoordinationDiagnosticsService #87984
Creating a transport action for the CoordinationDiagnosticsService #87984
Conversation
Hi @masseyke, I've created a changelog YAML for you. |
…f github.com:masseyke/elasticsearch into feature/health-api-master-stability-transport-action
Pinging @elastic/es-data-management (Team:Data Management) |
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this Keith.
Left a few suggestions
protected void doExecute(Task task, CoordinationDiagnosticsAction.Request request, ActionListener<Response> listener) { | ||
listener.onResponse( | ||
new Response( | ||
new CoordinationDiagnosticsService(clusterService, coordinator, masterHistoryService).diagnoseMasterStability( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we inject the service or create it once in the constructor? are there thread-safety concerns if we do so? (I'd think there shouldn't be as it's only reading things here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think you're right -- I don't think there would be any thread-safety problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And on second look, I'm not sure why I wasn't injecting that in the first place.
import static org.elasticsearch.cluster.coordination.CoordinationDiagnosticsService.CoordinationDiagnosticsResult; | ||
|
||
/** | ||
* This action exposes CoordinationDiagnosticsService.diagnoseMasterStability() so that a node can get a remote node's view of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: {@link CoordinationDiagnosticsService#diagnoseMasterStability}
} | ||
|
||
/** | ||
* This transport action calls the CoordinationDiagnosticsService on a remote node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This doc is rather misleading to me. The transport action exposes the functionality via the transport service (as opposed to calling a service on a remote node)
YELLOW((byte) 2), | ||
RED((byte) 3); | ||
|
||
private final byte value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this value
field is needed (I think it adds a bit of confusion).
Enums have ordinals already.
Let's remove the value
field and use in.readEnum
and out.writeEnum
to put it on the wire (it'll only write the enum ordinal).
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh. Somehow I didn't realize that those methods existed. Switching to use them now.
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for iterating on this Keith
Left a minor suggestion
public TransportAction( | ||
TransportService transportService, | ||
ActionFilters actionFilters, | ||
CoordinationDiagnosticsService coordinationDiagnosticsService |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👍
|
||
@Override | ||
protected void doExecute(Task task, CoordinationDiagnosticsAction.Request request, ActionListener<Response> listener) { | ||
listener.onResponse(new Response(coordinationDiagnosticsService.diagnoseMasterStability(request.explain))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we use ActionRunnable.wrap
here to make sure that a potential exception from diagnoseMasterStability
gets indeed propagated to listener.onFailure
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's already handled for us in the parent class here right? https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/action/support/TransportAction.java#L79. I'll confirm that that is where it is called from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah you're right. TIL. Thanks for looking into this!
This exposes the CoordinationDiagnosticsService (#87672) through a transport action so that it can be called remotely as
part of the health API in the event that: (1) there has been no master recently, (2) there are master-eligible nodes in the cluster, (3) none are elected, and (4) the current node is not master eligible. In the diagram below, this action will be used in the 1.2.2.3 branch (it is the long arrow back to branch 1). The logic that actually uses it will be within the CoordinationDiagnosticsService, and will be in a follow-up PR. Requests for this action will be sent directly to the remote node, and the assumption is that the cluster that likely does not have a master node.