
Fix Overly Aggressive Request DeDuplication #51270

Merged 3 commits on Jan 22, 2020
@@ -358,6 +358,9 @@ private void syncShardStatsOnNewMaster(ClusterChangedEvent event) {
             return;
         }
 
+        // Clear request deduplicator since we need to send all requests that were potentially not handled by the previous
+        // master again
+        remoteFailedRequestDeduplicator.clear();
Contributor:
I don't understand why this is needed. UpdateSnapshotStatusAction is a TransportMasterNodeAction and should resend the request in case of master failover.

original-brownbear (Member, Author), Jan 22, 2020:
I think currently we're only retrying on FailedToCommitClusterStateException and NotMasterException, but not on things like NodeClosedException. That's why we have the manual retry/re-send of these messages in SnapshotShardsService in the first place, don't we?

Contributor:
I'm confused by this stack trace. It shows that the node that's sending the request is closed, no? Why would we want to retry sending a request from a node that's shut down?

original-brownbear (Member, Author):
Sorry, I pasted the wrong trace:

  1> [2020-01-21T11:49:53,151][WARN ][o.e.s.SnapshotShardsService] [node_td3] [test-repo:test-snap/v_2yOVZsQHu6gk_aFsmkKg] [ShardSnapshotStatus[state=SUCCESS, nodeId=8MquPimwRTCtgnobCaxJlw, reason=null, generation=T-25oH4kSg62xXEtINmviQ]] failed to update snapshot state
  1> org.elasticsearch.transport.RemoteTransportException: [node_tm0][127.0.0.1:41909][internal:cluster/snapshot/update_snapshot_status]
  1> Caused by: org.elasticsearch.node.NodeClosedException: node closed {node_tm0}{VGisb-JmTiOkIc_vJP8FhA}{Ucpv57LoTEy1J_cuDT1L8A}{127.0.0.1}{127.0.0.1:41909}{im}
  1> 	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onClusterServiceClose(TransportMasterNodeAction.java:213) ~[main/:?]
  1> 	at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onClusterServiceClose(ClusterStateObserver.java:318) ~[main/:?]
  1> 	at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onClose(ClusterStateObserver.java:237) ~[main/:?]
  1> 	at org.elasticsearch.cluster.service.ClusterApplierService.doStop(ClusterApplierService.java:186) ~[main/:?]
  1> 	at org.elasticsearch.common.component.AbstractLifecycleComponent.stop(AbstractLifecycleComponent.java:79) ~[main/:?]
  1> 	at org.elasticsearch.cluster.service.ClusterService.doStop(ClusterService.java:96) ~[main/:?]
  1> 	at org.elasticsearch.common.component.AbstractLifecycleComponent.stop(AbstractLifecycleComponent.java:79) ~[main/:?]
  1> 	at org.elasticsearch.node.Node.stop(Node.java:805) ~[main/:?]
  1> 	at org.elasticsearch.node.Node.close(Node.java:829) ~[main/:?]
  1> 	at org.elasticsearch.test.InternalTestCluster$NodeAndClient.close(InternalTestCluster.java:954) ~[framework-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.test.InternalTestCluster.stopNodesAndClients(InternalTestCluster.java:1573) ~[framework-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.test.InternalTestCluster.stopNodesAndClient(InternalTestCluster.java:1563) ~[framework-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.test.InternalTestCluster.stopCurrentMasterNode(InternalTestCluster.java:1493) ~[framework-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.snapshots.DedicatedClusterSnapshotRestoreIT.testMasterShutdownDuringSnapshot(DedicatedClusterSnapshotRestoreIT.java:849) ~[test/:?]
  1> 	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
  1> 	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
  1> 	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
  1> 	at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
  1> 	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750) ~[randomizedtesting-runner-2.7.4.jar:?]
  1> 	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938) ~[randomizedtesting-runner-2.7.4.jar:?]
  1> 	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974) ~[randomizedtesting-runner-2.7.4.jar:?]
  1> 	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988) ~[randomizedtesting-runner-2.7.4.jar:?]

We have the same issue on the remote end as well. I'm not 100% sure we can get that exception outside of a test (because of the shutdown order), but as you can see, at least in the test, requests are clearly not retried when the master node (node_tm0) restarts and just go to the listener's onFailure. We could improve the master action to retry this case as well, but as long as we already have the fail-safe retry on master failover in the SnapshotShardsService, we should make that actually work?

ywelsch (Contributor), Jan 22, 2020:
This looks like a bug in TransportMasterNodeAction, then. When a node is shut down, we first shut down the cluster service and only then the transport service. This means that a node that has sent a MasterNodeRequest to a master node that is in the process of shutting down can receive a TransportException whose cause is a NodeClosedException, for which we don't trigger a retry (we only do that for a ConnectTransportException). Can we fix that instead of this hack here? That would also fix this situation for any other master-level request. Long term, the logic in syncShardStatsOnNewMaster can go away. I think it's mainly still here because the ShardSnapshotStatus action was not a TransportMasterNodeAction in some older versions.
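
For reference, a minimal sketch of the retry predicate under discussion. This is hypothetical, not the actual TransportMasterNodeAction code: the exception classes and packages are taken from the stack trace and comments above, and the MasterRetryPredicate helper is made up for illustration.

    import org.elasticsearch.cluster.NotMasterException;
    import org.elasticsearch.cluster.coordination.FailedToCommitClusterStateException;
    import org.elasticsearch.node.NodeClosedException;
    import org.elasticsearch.transport.ConnectTransportException;
    import org.elasticsearch.transport.RemoteTransportException;

    // Hypothetical helper, not the actual TransportMasterNodeAction logic.
    final class MasterRetryPredicate {

        // Returns true if a failed master-node request should be retried once a
        // new cluster state (i.e. a new master) is observed.
        static boolean shouldRetry(Throwable e) {
            // Unwrap failures that happened on the remote (master) side.
            final Throwable cause = e instanceof RemoteTransportException ? e.getCause() : e;
            return cause instanceof ConnectTransportException            // retried today
                || cause instanceof NotMasterException                   // retried today
                || cause instanceof FailedToCommitClusterStateException  // retried today
                || cause instanceof NodeClosedException;                 // proposed addition: master shut down mid-request
        }
    }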

original-brownbear (Member, Author), Jan 22, 2020:
@ywelsch sounds good, I'll look into fixing TransportMasterNodeAction and removing the retry in the snapshot shards service shortly :)
Still, as long as we have syncShardStatsOnNewMaster, I feel like we should have this hack in place, since without it that method doesn't really make sense?

        for (SnapshotsInProgress.Entry snapshot : snapshotsInProgress.entries()) {
            if (snapshot.state() == State.STARTED || snapshot.state() == State.ABORTED) {
                Map<ShardId, IndexShardSnapshotStatus> localShards = currentSnapshotShards(snapshot.snapshot());
@@ -53,6 +53,14 @@ public void executeOnce(T request, ActionListener<Void> listener, BiConsumer<T,
         }
     }
 
+    /**
+     * Remove all tracked requests from this instance so that the first time {@link #executeOnce} is invoked with any request it triggers
+     * an actual request execution. Use this e.g. for requests to master that need to be sent again on master failover.
+     */
+    public void clear() {
+        requests.clear();
+    }
+
     public int size() {
         return requests.size();
     }

original-brownbear (Member, Author):
I know there's just one use case for this now, but just dropping all deduplication seems to be the right approach to master failovers in general, if we ever want to reuse this for e.g. put mapping calls or so.
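
To make the interplay between the two hunks concrete, here is a simplified, hypothetical sketch of the deduplication pattern; the real class's executeOnce takes an ActionListener and notifies every waiting caller, while the RequestDeduplicator name and the Runnable completion hook below are illustrative only.

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.BiConsumer;

    // Simplified, hypothetical sketch of the deduplicator in this PR; the real
    // class tracks ActionListeners and fans the result out to every caller.
    final class RequestDeduplicator<T> {

        private final ConcurrentHashMap<T, Boolean> requests = new ConcurrentHashMap<>();

        // Sends the request only if no equal request is already tracked.
        void executeOnce(T request, BiConsumer<T, Runnable> send) {
            final Runnable onDone = () -> requests.remove(request); // hypothetical completion hook
            if (requests.putIfAbsent(request, Boolean.TRUE) == null) {
                send.accept(request, onDone); // first occurrence: actually send it
            }
            // otherwise: an equal request is already in flight, drop the duplicate
        }

        // Forgets all tracked requests so the next executeOnce sends again,
        // e.g. after a master failover where in-flight requests may have been lost.
        void clear() {
            requests.clear();
        }

        int size() {
            return requests.size();
        }
    }

With that shape, the clear() call added in syncShardStatsOnNewMaster above ensures the fail-safe re-send on master failover actually reaches the new master instead of being swallowed as a duplicate of a request that died with the old one.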