Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] FrozenSearchableSnapshotsIntegTests testCreateAndRestorePartialSearchableSnapshot failing #81949

Closed
droberts195 opened this issue Dec 20, 2021 · 7 comments · Fixed by #82249
Assignees
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. >test-failure Triaged test failures from CI

Comments

@droberts195
Copy link
Contributor

Happened in a Windows build - not sure if this is just slowness on the Windows worker or something more serious.

Build scan:
https://gradle-enterprise.elastic.co/s/xe3ov3qvjhque/tests/:x-pack:plugin:searchable-snapshots:internalClusterTest/org.elasticsearch.xpack.searchablesnapshots.FrozenSearchableSnapshotsIntegTests/testCreateAndRestorePartialSearchableSnapshot

Reproduction line:
gradlew ':x-pack:plugin:searchable-snapshots:internalClusterTest' --tests "org.elasticsearch.xpack.searchablesnapshots.FrozenSearchableSnapshotsIntegTests.testCreateAndRestorePartialSearchableSnapshot" -Dtests.seed=83DAAD1A771DAD8D -Dtests.locale=ja-JP-u-ca-japanese-x-lvariant-JP -Dtests.timezone=Canada/Yukon -Druntime.java=8

Applicable branches:
7.17

Reproduces locally?:
No

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.xpack.searchablesnapshots.FrozenSearchableSnapshotsIntegTests&tests.test=testCreateAndRestorePartialSearchableSnapshot

Failure excerpt:

java.lang.AssertionError: Shard [syxjwhutbk][5] is still locked after 5 sec waiting

  at __randomizedtesting.SeedInfo.seed([83DAAD1A771DAD8D:F75D6CCB4A7A836F]:0)
  at org.junit.Assert.fail(Assert.java:88)
  at org.elasticsearch.test.InternalTestCluster.assertAfterTest(InternalTestCluster.java:2751)
  at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:622)
  at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2396)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-2)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  at java.lang.Thread.run(Thread.java:748)

@droberts195 droberts195 added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >test-failure Triaged test failures from CI labels Dec 20, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Dec 20, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@andreidan
Copy link
Contributor

@ywangd
Copy link
Member

ywangd commented Dec 22, 2021

It's failing on a 7.17 backport PR #81873

build scan: https://gradle-enterprise.elastic.co/s/i63dq5lmmmdmq/tests/:x-pack:plugin:searchable-snapshots:internalClusterTest/org.elasticsearch.xpack.searchablesnapshots.FrozenSearchableSnapshotsIntegTests/testCreateAndRestorePartialSearchableSnapshot?top-execution=1

The failure message is a bit different but looks genuine:

java.lang.AssertionError: |  
-- | --
  | Expected: <0> |  
  | but: was <6> |  

@idegtiarenko
Copy link
Contributor

idegtiarenko commented Dec 23, 2021

I was able to reliably reproduce it locally with:
./gradlew ':x-pack:plugin:searchable-snapshots:internalClusterTest' --tests "org.elasticsearch.xpack.searchablesnapshots.FrozenSearchableSnapshotsIntegTests" -Dtests.seed=533D96987A4EF90A -Druntime.java=8
and
./gradlew ':x-pack:plugin:searchable-snapshots:internalClusterTest' --tests "org.elasticsearch.xpack.searchablesnapshots.FrozenSearchableSnapshotsIntegTests" -Dtests.seed=533D96987A4EF90A

I did a git bisect with above and it has pointed to 12fa869 (Upgrade to Lucene 8.11.1)

Could somebody please help to investigate it further?

@droberts195
Copy link
Contributor Author

The commit mentioned in #81949 (comment) is the upgrade to Lucene 8.11.1.

If that’s truly the cause then this needs to be investigated before 7.17 feature freeze in case a Lucene fix is required.

Pinging @elastic/es-search

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. >test-failure Triaged test failures from CI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants