Allow asynchronous block operations to be delayed in IndexShardOperationPermits #35999
IndexShardOperationPermits provides two methods to execute an operation while holding all the index shard permits: blockOperations() and asyncBlockOperations(). When they are called concurrently, all synchronous and asynchronous operations compete for the acquisition of all permits. The order of execution is defined by the fairness of the internal semaphore used to acquire/release permits, but since asynchronous block operations are executed on another thread of the generic thread pool (which can already be busy with other unrelated operations), the order of execution is actually undefined.
This behavior was OK until the merge of #35540, in which we allowed the transport replication write action to acquire all permits during its execution. This change caused test failures like #35850, where the execution of the action competes with a primary term bump for the acquisition of all permits (see #35862 (comment) for a more complete explanation of the issue). Depending on which one takes precedence, the test fails or succeeds.
This pull request changes IndexShardOperationPermits so that asynchronous block operations are delayed like regular operations if another block operation has already requested that operations be delayed. As soon as a blocking operation terminates, IndexShardOperationPermits checks whether one or more operations were delayed and should be released, and releases async block operations first, one after the other. This way IndexShardOperationPermits can guarantee that async block operations are executed in the order they were requested, while regular operations are still delayed until all sync/async block operations have terminated.
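The release order described above can be illustrated with a minimal, single-threaded sketch. This is not the code of this pull request: the class DelayedBlockSketch and its methods beginBlock(), asyncBlock(), acquire() and demo() are hypothetical names invented for the illustration, and the real IndexShardOperationPermits relies on a semaphore and a thread pool rather than on a plain boolean flag.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch, NOT the real IndexShardOperationPermits.
final class DelayedBlockSketch {
    // True while some block operation holds "all permits".
    private boolean blocked = false;
    // Async block operations delayed by an ongoing block, in request order.
    private final Queue<Runnable> delayedBlockOperations = new ArrayDeque<>();
    // Regular operations delayed by an ongoing block.
    private final List<Runnable> delayedOperations = new ArrayList<>();

    // Starts a block and returns a handle that ends it (stands in for a sync block).
    synchronized Runnable beginBlock() {
        if (blocked) {
            throw new IllegalStateException("sketch: a block is already in progress");
        }
        blocked = true;
        return this::endBlock;
    }

    // Runs the operation under a block now, or delays it if a block is ongoing.
    void asyncBlock(Runnable onAllPermits) {
        synchronized (this) {
            if (blocked) {
                delayedBlockOperations.add(onAllPermits);
                return;
            }
            blocked = true;
        }
        runBlock(onAllPermits);
    }

    // Runs a regular operation now, or delays it if a block is ongoing.
    void acquire(Runnable operation) {
        synchronized (this) {
            if (blocked) {
                delayedOperations.add(operation);
                return;
            }
        }
        operation.run();
    }

    private void runBlock(Runnable onAllPermits) {
        try {
            onAllPermits.run();
        } finally {
            endBlock();
        }
    }

    // Delayed async blocks go first, one at a time; regular operations are
    // released only once no block operation is left in the queue.
    private void endBlock() {
        Runnable nextBlock;
        List<Runnable> toRun = List.of();
        synchronized (this) {
            nextBlock = delayedBlockOperations.poll();
            if (nextBlock == null) {
                blocked = false;
                toRun = new ArrayList<>(delayedOperations);
                delayedOperations.clear();
            }
        }
        if (nextBlock != null) {
            runBlock(nextBlock); // the block stays in place for the next async block
        } else {
            toRun.forEach(Runnable::run);
        }
    }

    // Submits operations while a block is ongoing and records the execution order.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        DelayedBlockSketch permits = new DelayedBlockSketch();
        Runnable endSyncBlock = permits.beginBlock();
        permits.asyncBlock(() -> log.add("async block A"));
        permits.acquire(() -> log.add("regular op"));
        permits.asyncBlock(() -> log.add("async block B"));
        endSyncBlock.run();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the regular operation is requested before async block B, yet it runs last, because regular operations are only released once the queue of delayed block operations is empty.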
Closes #35850