[CI] CreateIndexIT testCreateAndDeleteIndexConcurrently failing #96578
Comments
Pinging @elastic/es-search (Team:Search)
I also noticed this assertion failure in a recent test failure:
Btw, I don't think this is a failure for the search team? This test creates and deletes the target index for writes concurrently while indexing.
Pinging @elastic/es-data-management (Team:Data Management)
I also just saw an error similar to the one mentioned above, here for a different test (
I think it's probably the same cause, which is why I mention it on this ticket.
Another one today with timeouts on 8.9: https://gradle-enterprise.elastic.co/s/63qen2ntob2u4 Looking at the thread dump, these seem to be the interesting parts to me, but I might be wrong:
This failed again today with the same symptoms: stopping the node leaves a recovery in a state that will never complete, which prevents the node from closing. Relabelling this as a recovery bug.
Pinging @elastic/es-distributed (Team:Distributed)
Possibly related to #100589.
Ugh yeah we're missing a
diff --git a/server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java b/server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
index fc5df1a4aa2..81bc226102f 100644
--- a/server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
+++ b/server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
@@ -33,7 +33,6 @@ import org.elasticsearch.common.unit.ByteSizeValue;
import org.elasticsearch.common.util.CancellableThreads;
import org.elasticsearch.common.util.concurrent.CountDown;
import org.elasticsearch.common.util.set.Sets;
-import org.elasticsearch.core.CheckedRunnable;
import org.elasticsearch.core.IOUtils;
import org.elasticsearch.core.Nullable;
import org.elasticsearch.core.Releasable;
@@ -426,7 +425,7 @@ public class RecoverySourceHandler {
}
static void runUnderPrimaryPermit(
- CheckedRunnable<Exception> action,
+ Runnable action,
IndexShard primary,
CancellableThreads cancellableThreads,
ActionListener<Void> listener
@@ -1260,7 +1259,7 @@ public class RecoverySourceHandler {
*/
final SubscribableListener<Void> markInSyncStep = new SubscribableListener<>();
runUnderPrimaryPermit(
- () -> shard.markAllocationIdAsInSync(request.targetAllocationId(), targetLocalCheckpoint),
+ () -> cancellableThreads.execute(() -> shard.markAllocationIdAsInSync(request.targetAllocationId(), targetLocalCheckpoint)),
shard,
cancellableThreads,
markInSyncStep
`IndexShard#markAllocationIdAsInSync` is interruptible because it may block the thread on a monitor waiting for the local checkpoint to advance, but we lost the ability to interrupt it on a recovery cancellation in elastic#95270. Closes elastic#96578 Closes elastic#100589
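To illustrate the failure mode described above, here is a minimal, self-contained sketch of a thread blocked on a monitor that only becomes cancellable once it is registered with something that can interrupt it. Everything in it is hypothetical: `CheckpointGate` stands in for the checkpoint wait, and `SimpleCancellableThreads` is a simplified stand-in written for this example, not the Elasticsearch `CancellableThreads` class or its API.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class InterruptibleWaitSketch {

    /** Hypothetical stand-in for the checkpoint wait: callers block on this object's monitor. */
    static final class CheckpointGate {
        private long localCheckpoint = 0;

        synchronized void awaitCheckpoint(long target) throws InterruptedException {
            while (localCheckpoint < target) {
                wait(); // throws InterruptedException if the thread is interrupted
            }
        }

        synchronized void advanceTo(long checkpoint) {
            localCheckpoint = Math.max(localCheckpoint, checkpoint);
            notifyAll();
        }
    }

    /** A blocking action that may be interrupted. */
    interface Interruptible {
        void run() throws InterruptedException;
    }

    /** Simplified registry of running threads; NOT the Elasticsearch CancellableThreads class. */
    static final class SimpleCancellableThreads {
        private final Set<Thread> running = new HashSet<>();
        private boolean cancelled = false;

        void execute(Interruptible action) {
            synchronized (this) {
                if (cancelled) {
                    throw new IllegalStateException("already cancelled");
                }
                running.add(Thread.currentThread());
            }
            try {
                action.run();
            } catch (InterruptedException e) {
                throw new IllegalStateException("cancelled while blocked", e);
            } finally {
                synchronized (this) {
                    running.remove(Thread.currentThread());
                }
                Thread.interrupted(); // clear any late interrupt delivered by cancel()
            }
        }

        synchronized void cancel() {
            cancelled = true;
            for (Thread t : running) {
                t.interrupt(); // only registered threads can be woken up
            }
        }
    }

    /** Starts a blocked wait, cancels, and reports whether the waiter was released. */
    static String runDemo(boolean wrapInCancellable) throws InterruptedException {
        CheckpointGate gate = new CheckpointGate();
        SimpleCancellableThreads cancellable = new SimpleCancellableThreads();
        CountDownLatch started = new CountDownLatch(1);
        AtomicReference<String> outcome = new AtomicReference<>("still blocked");

        Interruptible blockingWait = () -> {
            started.countDown();
            gate.awaitCheckpoint(1); // blocks until the checkpoint advances or an interrupt arrives
        };
        Thread worker = new Thread(() -> {
            try {
                if (wrapInCancellable) {
                    cancellable.execute(blockingWait);
                } else {
                    blockingWait.run(); // unregistered: cancel() cannot reach this thread
                }
                outcome.set("completed");
            } catch (IllegalStateException | InterruptedException e) {
                outcome.set("cancelled");
            }
        });
        worker.setDaemon(true);
        worker.start();
        started.await();

        cancellable.cancel();
        worker.join(wrapInCancellable ? 5000 : 200);
        String result = worker.isAlive() ? "still blocked" : outcome.get();

        gate.advanceTo(1); // cleanup: release the waiter if it is still parked
        worker.join(5000);
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("without wrapper: " + runDemo(false));
        System.out.println("with wrapper:    " + runDemo(true));
    }
}
```

In this sketch the unwrapped wait is still parked after `cancel()` returns, which mirrors the symptom reported in this issue (a recovery that never completes and blocks node shutdown), while the wrapped wait is interrupted and exits.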
Build scan:
https://gradle-enterprise.elastic.co/s/dxnvahurwubcu/tests/:server:internalClusterTest/org.elasticsearch.action.admin.indices.create.CreateIndexIT/testCreateAndDeleteIndexConcurrently
Reproduction line:
Applicable branches:
main, 8.8
Reproduces locally?:
No
Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.action.admin.indices.create.CreateIndexIT&tests.test=testCreateAndDeleteIndexConcurrently
Failure excerpt: