[CI] CancellableRateLimitedFluxIteratorTests testCancellation failing #103054

Closed
thecoop opened this issue Dec 6, 2023 · 3 comments · Fixed by #104259
Labels
:Distributed Coordination/Snapshot/Restore - Anything directly related to the `_snapshot/*` APIs
medium-risk - An open issue or test failure that is a medium risk to future releases
Team:Distributed - Meta label for distributed team (obsolete)
>test-failure - Triaged test failures from CI

Comments

@thecoop
Member

thecoop commented Dec 6, 2023

Similar previous error: #87112

Build scan:
https://gradle-enterprise.elastic.co/s/rzfy4vf6ajniq/tests/:plugins:repository-azure:test/org.elasticsearch.repositories.azure.CancellableRateLimitedFluxIteratorTests/testCancellation

Reproduction line:

./gradlew ':plugins:repository-azure:test' --tests "org.elasticsearch.repositories.azure.CancellableRateLimitedFluxIteratorTests.testCancellation" -Dtests.seed=23C67FEEB19CBD6A -Dtests.locale=ko -Dtests.timezone=Pacific/Rarotonga -Druntime.java=18

Applicable branches:
main 7.17

Reproduces locally?:
Didn't try

Failure history:
Failure dashboard for org.elasticsearch.repositories.azure.CancellableRateLimitedFluxIteratorTests#testCancellation

Failure excerpt:

java.lang.AssertionError: 
Expected: <[3, 4]>
     but: was <[4]>

  at __randomizedtesting.SeedInfo.seed([23C67FEEB19CBD6A:F5F120982F238AEC]:0)
  at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
  at org.junit.Assert.assertThat(Assert.java:956)
  at org.junit.Assert.assertThat(Assert.java:923)
  at org.elasticsearch.repositories.azure.CancellableRateLimitedFluxIteratorTests.lambda$testCancellation$7(CancellableRateLimitedFluxIteratorTests.java:197)
  at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1143)
  at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1116)
  at org.elasticsearch.repositories.azure.CancellableRateLimitedFluxIteratorTests.testCancellation(CancellableRateLimitedFluxIteratorTests.java:197)
  at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
  at java.lang.reflect.Method.invoke(Method.java:577)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
  at java.lang.Thread.run(Thread.java:833)

@thecoop thecoop added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >test-failure Triaged test failures from CI labels Dec 6, 2023
@elasticsearchmachine elasticsearchmachine added blocker Team:Distributed Meta label for distributed team (obsolete) labels Dec 6, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@pxsalehi
Member

pxsalehi commented Dec 7, 2023

This failure is not on main; it's on 7.17. I'm removing the blocker label since, based on the previous failure and its fix, this doesn't seem like a blocker. Maybe @fcofdez can confirm?

@pxsalehi pxsalehi added medium-risk An open issue or test failure that is a medium risk to future releases and removed blocker labels Dec 7, 2023
@idegtiarenko
Contributor

Discussed with @fcofdez. This can be reproduced with the following patch:

Subject: [PATCH] format
---
Index: modules/repository-azure/src/main/java/org/elasticsearch/repositories/azure/CancellableRateLimitedFluxIterator.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/modules/repository-azure/src/main/java/org/elasticsearch/repositories/azure/CancellableRateLimitedFluxIterator.java b/modules/repository-azure/src/main/java/org/elasticsearch/repositories/azure/CancellableRateLimitedFluxIterator.java
--- a/modules/repository-azure/src/main/java/org/elasticsearch/repositories/azure/CancellableRateLimitedFluxIterator.java	(revision 943b2eae70e7395ddd6b4a8c4ef36972082ab018)
+++ b/modules/repository-azure/src/main/java/org/elasticsearch/repositories/azure/CancellableRateLimitedFluxIterator.java	(date 1704803388473)
@@ -154,6 +154,11 @@
             return;
         }
 
+        try {
+            Thread.sleep(1000);
+        } catch (InterruptedException e) {
+            throw new RuntimeException(e);
+        }
         if (queue.offer(element) == false) {
             // If the source doesn't respect backpressure, we might lose elements,
             // in that case we cancel the subscription and mark this consumer as failed
@@ -167,6 +172,11 @@
     public void cancel() {
         cancelSubscription();
         clearQueue();
+        try {
+            Thread.sleep(100);
+        } catch (InterruptedException e) {
+            throw new RuntimeException(e);
+        }
         done = true;
         // cancel should be called from the consumer
         // thread, but to avoid potential deadlocks

The problem affects both 7.17 and main.
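
To make the timing window that the patch widens easier to see, here is a minimal single-threaded sketch that replays one possible interleaving step by step. It is not the Elasticsearch class: `CancelRaceSketch`, `passesDoneCheck`, `markDone`, `replay` and the `recheckDone` flag are all hypothetical names made up for illustration, and the interleaving shown is not necessarily the exact one behind the `<[4]>` failure above. The `recheckDone` branch is only one conceivable mitigation and is not claimed to be what #104259 does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical, simplified stand-in for the iterator; NOT the Elasticsearch class.
final class CancelRaceSketch {
    private final Queue<Integer> queue = new ArrayBlockingQueue<>(4);
    private final List<Integer> cleaned = new ArrayList<>();
    private volatile boolean done;

    // Producer side, step 1: the check made before offering an element.
    boolean passesDoneCheck() {
        return done == false;
    }

    // Producer side, step 2: the offer, which may land long after the check.
    void offer(int element, boolean recheckDone) {
        queue.offer(element);
        // Assumed mitigation (illustrative only): re-check after offering.
        if (recheckDone && done) {
            clearQueue();
        }
    }

    // Consumer side: drain the queue and record every element released.
    void clearQueue() {
        Integer element;
        while ((element = queue.poll()) != null) {
            cleaned.add(element);
        }
    }

    void markDone() {
        done = true;
    }

    // Replays: offer(3), check for 4, clearQueue, done = true, late offer(4).
    static List<Integer> replay(boolean recheckDone) {
        CancelRaceSketch it = new CancelRaceSketch();
        it.offer(3, recheckDone);                    // 3 is queued before cancel
        boolean checkPassed = it.passesDoneCheck();  // producer of 4 sees done == false
        it.clearQueue();                             // cancel(): releases 3
        it.markDone();                               // cancel(): done = true
        if (checkPassed) {
            it.offer(4, recheckDone);                // delayed offer lands after the clear
        }
        return it.cleaned;
    }

    public static void main(String[] args) {
        System.out.println("without re-check: " + replay(false));
        System.out.println("with re-check:    " + replay(true));
    }
}
```

In this replay, the run without the re-check cleans only `[3]` and leaves 4 stranded in the queue, while the run with the re-check cleans `[3, 4]`. The two `Thread.sleep` calls in the patch stretch the gaps between the check, the clear, and `done = true`, which makes interleavings of this kind far more likely.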
