Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cluster state from API should always have a master #42454

Conversation

DaveCTurner
Copy link
Contributor

Today the TransportClusterStateAction ignores the state passed by the
TransportMasterNodeAction and obtains its state from the cluster applier.
This might be inconsistent, showing a different node as the master or maybe
even having no master.

This change adjusts the action to use the passed-in state directly, and adds
tests showing that the state returned is consistent with our expectations even
if there is a concurrent master failover.

Fixes #38331
Relates #38432

Today the `TransportClusterStateAction` ignores the state passed by the
`TransportMasterNodeAction` and obtains its state from the cluster applier.
This might be inconsistent, showing a different node as the master or maybe
even having no master.

This change adjusts the action to use the passed-in state directly, and adds
tests showing that the state returned is consistent with our expectations even
if there is a concurrent master failover.

Fixes elastic#38331
Relates elastic#38432
@DaveCTurner DaveCTurner added >bug :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v8.0.0 v7.3.0 labels May 23, 2019
@DaveCTurner DaveCTurner requested review from martijnvg and ywelsch May 23, 2019 16:43
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Override
public void onClusterServiceClose() {
listener.onFailure(new NodeClosedException(clusterService.localNode()));
if (acceptableClusterStatePredicate.test(state)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@DaveCTurner DaveCTurner merged commit c1de8c2 into elastic:master May 24, 2019
@DaveCTurner DaveCTurner deleted the 2019-05-23-get-cluster-state-always-returns-master branch May 24, 2019 07:43
DaveCTurner added a commit that referenced this pull request May 24, 2019
Today the `TransportClusterStateAction` ignores the state passed by the
`TransportMasterNodeAction` and obtains its state from the cluster applier.
This might be inconsistent, showing a different node as the master or maybe
even having no master.

This change adjusts the action to use the passed-in state directly, and adds
tests showing that the state returned is consistent with our expectations even
if there is a concurrent master failover.

Fixes #38331
Relates #38432
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
Today the `TransportClusterStateAction` ignores the state passed by the
`TransportMasterNodeAction` and obtains its state from the cluster applier.
This might be inconsistent, showing a different node as the master or maybe
even having no master.

This change adjusts the action to use the passed-in state directly, and adds
tests showing that the state returned is consistent with our expectations even
if there is a concurrent master failover.

Fixes elastic#38331
Relates elastic#38432
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Jul 16, 2019
The underlying issue is closed and the fix in elastic#42454 seems to have been
backported to 7.x and 7.3 so we can reactivate the test.
cbuescher pushed a commit that referenced this pull request Jul 16, 2019
The underlying issue is closed and the fix in #42454 seems to have been
backported to 7.x and 7.3 so we can reactivate the test.
cbuescher pushed a commit that referenced this pull request Jul 16, 2019
The underlying issue is closed and the fix in #42454 seems to have been
backported to 7.x and 7.3 so we can reactivate the test.
DaveCTurner added a commit that referenced this pull request Jul 17, 2019
Today the `TransportClusterStateAction` ignores the state passed by the
`TransportMasterNodeAction` and obtains its state from the cluster applier.
This might be inconsistent, showing a different node as the master or maybe
even having no master.

This change adjusts the action to use the passed-in state directly, and adds
tests showing that the state returned is consistent with our expectations even
if there is a concurrent master failover.

Fixes #38331
Relates #38432
@DaveCTurner
Copy link
Contributor Author

Backported to 7.2 as 52e8a64 (and unmuted the associated test)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v7.2.2 v7.3.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] SpecificMasterNodesIT#testElectOnlyBetweenMasterNodes fails
4 participants