Inline TransportReplAct#createReplicatedOperation #41197
`TransportReplicationAction.AsyncPrimaryAction#createReplicatedOperation` exists so it can be overridden in tests. This commit re-works these tests to use a real `ReplicationOperation` and inlines the now-unnecessary method. Relates elastic#40706.
Pinging @elastic/es-distributed
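To illustrate the refactoring described above, here is a minimal sketch of a factory method kept only as a test seam, next to the inlined form. All class and method names (`Operation`, `ActionWithSeam`, `ActionInlined`, `createOperation`) are hypothetical stand-ins, not the real Elasticsearch types.

```java
public class InlineFactorySketch {

    /** Hypothetical collaborator standing in for ReplicationOperation. */
    static class Operation {
        private final String request;

        Operation(String request) {
            this.request = request;
        }

        String execute() {
            return "replicated " + request;
        }
    }

    /** Before: a factory method that exists only so tests can override it. */
    static class ActionWithSeam {
        Operation createOperation(String request) {
            return new Operation(request);
        }

        String run(String request) {
            return createOperation(request).execute();
        }
    }

    /** After: construction is inlined, so tests exercise the real Operation. */
    static class ActionInlined {
        String run(String request) {
            return new Operation(request).execute();
        }
    }

    public static void main(String[] args) {
        // Old-style test: override the seam and substitute a stubbed operation.
        ActionWithSeam stubbed = new ActionWithSeam() {
            @Override
            Operation createOperation(String request) {
                return new Operation(request) {
                    @Override
                    String execute() {
                        return "stubbed";
                    }
                };
            }
        };
        System.out.println(stubbed.run("r1"));

        // New-style test: no seam to override, so the real operation runs.
        System.out.println(new ActionInlined().run("r1"));
    }
}
```

With the seam removed, a test can no longer inject behaviour that the production code path never produces, which is the trade-off this change accepts.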
@@ -821,57 +809,50 @@ public void testCounterOnPrimary() throws Exception {
Request request = new Request(shardId);
PlainActionFuture<TestResponse> listener = new PlainActionFuture<>();
ReplicationTask task = maybeTask();
int i = randomInt(3);
final boolean throwExceptionOnCreation = i == 1;
This case doesn't seem to be possible in production, so I removed it.
LGTM.
Thanks @DaveCTurner, I left 3 comments to consider.
@Override
public void onFailure(Exception e) {
handleException(primaryShardReference, e);
final ActionListener<Response> referenceClosingListener = ActionListener.wrap(response -> {
I find the separation into two listeners artificial and a bit confusing. I suggest something like the following instead:
final ActionListener<Response> globalCheckpointSyncingListener = ActionListener.wrap(response -> {
    if (syncGlobalCheckpointAfterOperation) {
        final IndexShard shard = primaryShardReference.indexShard;
        try {
            shard.maybeSyncGlobalCheckpoint("post-operation");
        } catch (final Exception e) {
            // only log non-closed exceptions
            if (ExceptionsHelper.unwrap(
                e, AlreadyClosedException.class, IndexShardClosedException.class) == null) {
                // intentionally swallow, a missed global checkpoint sync should not fail this operation
                logger.info(
                    new ParameterizedMessage(
                        "{} failed to execute post-operation global checkpoint sync", shard.shardId()), e);
            }
        }
    }
    primaryShardReference.close(); // release shard operation lock before responding to caller
    setPhase(replicationTask, "finished");
    onCompletionListener.onResponse(response);
}, e -> handleException(primaryShardReference, e));
new ReplicationOperation<>(primaryRequest.getRequest(), primaryShardReference,
    ActionListener.wrap(result -> result.respond(globalCheckpointSyncingListener),
        globalCheckpointSyncingListener::onFailure),
    newReplicasProxy(), logger, actionName, primaryRequest.getPrimaryTerm()).execute();
In isolation I agree, but this separation will be important in a followup so I hope it's ok to leave it like it is. The global checkpoint syncing is the responsibility of the primary, whereas the cleanup of the replication task and the `primaryShardReference` is the responsibility of the reroute/delegation phase.
👍
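The division of responsibilities discussed in this thread can be sketched as two chained listeners. This is only an illustration under stated assumptions: `Listener` is a hypothetical, simplified stand-in for Elasticsearch's `ActionListener` (it does not route exceptions thrown in the response handler to the failure handler), and the recorded event strings stand in for the real sync, close, and respond calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class LayeredListenerSketch {

    /** Hypothetical, simplified stand-in for ActionListener; not the real interface. */
    interface Listener<T> {
        void onResponse(T response);

        void onFailure(Exception e);

        static <T> Listener<T> wrap(Consumer<T> responseHandler, Consumer<Exception> failureHandler) {
            return new Listener<T>() {
                @Override
                public void onResponse(T response) {
                    responseHandler.accept(response);
                }

                @Override
                public void onFailure(Exception e) {
                    failureHandler.accept(e);
                }
            };
        }
    }

    /** Runs both layers once and returns the steps in the order they happened. */
    static List<String> run(boolean syncFails) {
        List<String> events = new ArrayList<>();

        // Outer layer (reroute/delegation phase): release the reference, then answer the caller.
        Listener<String> referenceClosingListener = Listener.wrap(
            response -> {
                events.add("close reference");
                events.add("respond: " + response);
            },
            e -> {
                events.add("close reference");
                events.add("fail: " + e.getMessage());
            });

        // Inner layer (primary): attempt the post-operation sync, then delegate outward.
        Listener<String> globalCheckpointSyncingListener = Listener.wrap(
            response -> {
                try {
                    if (syncFails) {
                        throw new IllegalStateException("sync failed");
                    }
                    events.add("sync global checkpoint");
                } catch (RuntimeException e) {
                    // A missed sync is swallowed so that it does not fail the operation.
                    events.add("swallowed: " + e.getMessage());
                }
                referenceClosingListener.onResponse(response);
            },
            referenceClosingListener::onFailure);

        globalCheckpointSyncingListener.onResponse("ok");
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

Keeping the layers separate means the inner listener can later move to wherever the primary's work happens without taking the cleanup logic with it.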
}
}.run();
}.new AsyncPrimaryAction(primaryRequest, ActionListener.wrap(listener::onResponse, throwable -> {
Could we, instead of using `ActionListener.wrap`, just assert that `listener.isDone()` and do `listener.get()`, like in the test above?
Yes, it seems we can. I pushed 1350c0f.
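The pattern suggested here (assert the future is already complete, then call `get()` and inspect the outcome) can be sketched with the JDK's `CompletableFuture` as a stand-in for `PlainActionFuture`. The substitution is an assumption made to keep the example self-contained, and `runAction` is a hypothetical action that completes its listener synchronously.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class CompletedFutureCheckSketch {

    /** Hypothetical action under test: completes the future on the calling thread. */
    static void runAction(boolean respondWithError, CompletableFuture<String> listener) {
        if (respondWithError) {
            listener.completeExceptionally(new IllegalStateException("simulated failure"));
        } else {
            listener.complete("ok");
        }
    }

    /** Asserts completion first, then reads the result or the failure cause. */
    static String check(boolean respondWithError) {
        CompletableFuture<String> listener = new CompletableFuture<>();
        runAction(respondWithError, listener);

        // The action is synchronous, so no waiting or wrapping listener is needed.
        if (!listener.isDone()) {
            throw new AssertionError("listener should already be completed");
        }
        try {
            return "response: " + listener.get();
        } catch (ExecutionException e) {
            // get() wraps the original failure; the cause is what the action reported.
            return "failure: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(check(false));
        System.out.println(check(true));
    }
}
```

Handling both outcomes inside one try-catch around `get()` keeps the success and failure assertions next to each other, so no early `return` is needed.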
} else {
throw e;
}
}

if (throwExceptionOnRun || respondWithError) {
nit: I think it is more logical to put this inside the try-catch (after listener.get()) and remove the return above.
Sure, I pushed fb1f7eb.
@@ -376,7 +377,8 @@ public void handleException(TransportException exp) {
// intentionally swallow, a missed global checkpoint sync should not fail this operation
logger.info(
new ParameterizedMessage(
"{} failed to execute post-operation global checkpoint sync", shard.shardId()), e);
"{} failed to execute post-operation global checkpoint sync",
primaryShardReference.routingEntry().shardId()), e);
I'm not sure I follow this change; I cannot see how it makes a difference. I think just using `shard.shardId()` is simpler, unless there is a reason for this?
More foreshadowing of changes to come, but I can defer this until later.
This reverts commit a26a986.