[CI] org.elasticsearch.xpack.ccr.FollowerFailOverIT#testFailOverOnFollower #58534

Closed
matriv opened this issue Jun 25, 2020 · 3 comments · Fixed by #59375
Assignees: dnhatn
Labels: :Distributed Indexing/CCR, Team:Distributed (Obsolete), >test-failure

Comments

@matriv
Contributor

matriv commented Jun 25, 2020

Failing on master

Build scan: https://gradle-enterprise.elastic.co/s/jm5pkdiktr2zq

Repro line:

./gradlew ':x-pack:plugin:ccr:internalClusterTest' --tests "org.elasticsearch.xpack.ccr.FollowerFailOverIT.testFailOverOnFollower" \
  -Dtests.seed=A1E8FAE62B8B6C9E \
  -Dtests.security.manager=true \
  -Dtests.locale=cs-CZ \
  -Dtests.timezone=America/Argentina/Rio_Gallegos \
  -Druntime.java=11

Reproduces locally?: No

Applicable branches: master

Failure history:
https://build-stats.elastic.co/app/kibana#/discover?_g=(refreshInterval:(pause:!f,value:7200000),time:(from:now-30d,mode:quick,to:now))&_a=(columns:!(_source),index:e58bf320-7efd-11e8-bf69-63c8ef516157,interval:auto,query:(language:lucene,query:'class:%20%22org.elasticsearch.xpack.ccr.FollowerFailOverIT%22%20test:%22%20testFailOverOnFollower%22%20stacktrace:%20%22timed%20out%20waiting%20for%20green%20state%22'),sort:!(time,desc))

Failure excerpt:

07:36:11 org.elasticsearch.xpack.ccr.FollowerFailOverIT > testFailOverOnFollower FAILED
07:36:11     java.lang.AssertionError: timed out waiting for green state
07:36:11         at __randomizedtesting.SeedInfo.seed([A1E8FAE62B8B6C9E:7EBB537ABFEBCFA4]:0)
07:36:11         at org.junit.Assert.fail(Assert.java:88)
07:36:11         at org.elasticsearch.xpack.CcrIntegTestCase.ensureColor(CcrIntegTestCase.java:337)
07:36:11         at org.elasticsearch.xpack.CcrIntegTestCase.ensureFollowerGreen(CcrIntegTestCase.java:311)
07:36:11         at org.elasticsearch.xpack.CcrIntegTestCase.ensureFollowerGreen(CcrIntegTestCase.java:306)
07:36:11         at org.elasticsearch.xpack.ccr.FollowerFailOverIT.testFailOverOnFollower(FollowerFailOverIT.java:101)
07:36:11 REPRODUCE WITH: ./gradlew ':x-pack:plugin:ccr:internalClusterTest' --tests "org.elasticsearch.xpack.ccr.FollowerFailOverIT.testFailOverOnFollower" -Dtests.seed=A1E8FAE62B8B6C9E -Dtests.security.manager=true -Dtests.locale=cs-CZ -Dtests.timezone=America/Argentina/Rio_Gallegos -Druntime.java=11
07:36:11 
07:36:11 org.elasticsearch.xpack.ccr.FollowerFailOverIT > classMethod FAILED
07:36:11     com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE scope at org.elasticsearch.xpack.ccr.FollowerFailOverIT: 
07:36:11        1) Thread[id=712, name=Thread-23, state=TIMED_WAITING, group=TGRP-FollowerFailOverIT]
07:36:11             at java.base/jdk.internal.misc.Unsafe.park(Native Method)
07:36:11             at java.base/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
07:36:11             at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1079)
07:36:11             at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1369)
07:36:11             at java.base/java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:415)
07:36:11             at app//org.elasticsearch.xpack.ccr.FollowerFailOverIT.lambda$testFailOverOnFollower$0(FollowerFailOverIT.java:70)
07:36:11             at app//org.elasticsearch.xpack.ccr.FollowerFailOverIT$$Lambda$4075/0x0000000100d3dc40.run(Unknown Source)
07:36:11             at java.base/java.lang.Thread.run(Thread.java:834)
07:36:11        2) Thread[id=711, name=Thread-22, state=TIMED_WAITING, group=TGRP-FollowerFailOverIT]
07:36:11             at java.base/jdk.internal.misc.Unsafe.park(Native Method)
07:36:11             at java.base/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
07:36:11             at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1079)
07:36:11             at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1369)
07:36:11             at java.base/java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:415)
07:36:11             at app//org.elasticsearch.xpack.ccr.FollowerFailOverIT.lambda$testFailOverOnFollower$0(FollowerFailOverIT.java:70)
07:36:11             at app//org.elasticsearch.xpack.ccr.FollowerFailOverIT$$Lambda$4075/0x0000000100d3dc40.run(Unknown Source)
07:36:11             at java.base/java.lang.Thread.run(Thread.java:834)
07:36:11         at __randomizedtesting.SeedInfo.seed([A1E8FAE62B8B6C9E]:0)
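
The leak report above shows two test threads parked in a timed `Semaphore.tryAcquire` inside the test's lambda, which suggests the test body failed before those workers were stopped and joined. A minimal sketch of one way to avoid that, assuming hypothetical names (`LeakFreeWorkers`, `runWorkers`, `liveWorkers`) that are not taken from `FollowerFailOverIT`, is to stop and join the workers in a `finally` block so that an assertion failure cannot leave them behind:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; class and method names are illustrative only.
public class LeakFreeWorkers {
    static final String WORKER_PREFIX = "sketch-worker-";

    /**
     * Starts worker threads that poll a semaphore with a timeout, runs the
     * given body, and always stops and joins the workers afterwards, even
     * when the body throws.
     */
    public static void runWorkers(int workers, int permits, Runnable body) {
        Semaphore available = new Semaphore(permits);
        AtomicBoolean stopped = new AtomicBoolean(false);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                while (stopped.get() == false) {
                    try {
                        // Timed wait: the thread is TIMED_WAITING while parked here.
                        // A real worker would do one unit of work per permit.
                        available.tryAcquire(100, TimeUnit.MILLISECONDS);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }, WORKER_PREFIX + i);
            threads.add(t);
            t.start();
        }
        try {
            body.run(); // may throw, for example an AssertionError
        } finally {
            stopped.set(true);
            for (Thread t : threads) {
                try {
                    t.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    /** Counts worker threads from this sketch that are still alive. */
    public static long liveWorkers() {
        return Thread.getAllStackTraces().keySet().stream()
            .filter(t -> t.isAlive() && t.getName().startsWith(WORKER_PREFIX))
            .count();
    }

    public static void main(String[] args) {
        try {
            runWorkers(2, 0, () -> {
                throw new AssertionError("timed out waiting for green state");
            });
        } catch (AssertionError e) {
            System.out.println("body failed: " + e.getMessage());
        }
        System.out.println("live workers after failure: " + liveWorkers());
    }
}
```

This only addresses the secondary `ThreadLeakError`; the primary failure is the follower index not reaching green.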

matriv added the >test-failure and :Distributed Indexing/CCR labels Jun 25, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/CCR)

elasticmachine added the Team:Distributed (Obsolete) label Jun 25, 2020
dnhatn self-assigned this Jun 25, 2020
@dimitris-athanasiou
Contributor

Another failure in: https://gradle-enterprise.elastic.co/s/t5ncm7jvhfo3k

@jrodewig
Contributor

danielmitterdorfer added a commit that referenced this issue Jul 9, 2020
dnhatn added a commit that referenced this issue Jul 14, 2020
)

The primary shards of follower indices during the bootstrap need to be
on nodes with the remote cluster client role as those nodes reach out to
the corresponding leader shards on the remote cluster to copy Lucene
segment files and renew the retention leases. This commit introduces a
new allocation decider that ensures bootstrapping follower primaries are
allocated to nodes with the remote cluster client role.

Relates #54146
Relates #53924
Closes #58534

Co-authored-by: Jason Tedor <[email protected]>
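
The rule described in the commit message can be sketched as a small decision function. This is a simplified model under stated assumptions, not the decider added in #59375: the types `Role`, `Decision`, and `Shard`, and the method `canAllocate`, are hypothetical stand-ins and are not Elasticsearch's allocation API.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the rule in the commit message; these types are
// illustrative and are not Elasticsearch classes.
public class FollowerPrimaryAllocationSketch {

    public enum Role { MASTER, DATA, REMOTE_CLUSTER_CLIENT }

    public enum Decision { YES, NO }

    /** Minimal description of a shard that is waiting for a node. */
    public static final class Shard {
        final boolean followerIndex; // belongs to a CCR follower index
        final boolean primary;       // primary rather than replica
        final boolean bootstrapping; // still has to copy segment files from the leader

        public Shard(boolean followerIndex, boolean primary, boolean bootstrapping) {
            this.followerIndex = followerIndex;
            this.primary = primary;
            this.bootstrapping = bootstrapping;
        }
    }

    /**
     * Returns NO only for a bootstrapping follower primary on a node that
     * lacks the remote cluster client role. In every other case this rule
     * has no objection and returns YES.
     */
    public static Decision canAllocate(Shard shard, Set<Role> nodeRoles) {
        if (shard.followerIndex == false) {
            return Decision.YES; // the rule only concerns follower indices
        }
        if (shard.primary == false || shard.bootstrapping == false) {
            return Decision.YES; // only bootstrapping primaries reach out to the leader
        }
        return nodeRoles.contains(Role.REMOTE_CLUSTER_CLIENT) ? Decision.YES : Decision.NO;
    }

    public static void main(String[] args) {
        Shard bootstrappingPrimary = new Shard(true, true, true);
        Set<Role> dataOnly = EnumSet.of(Role.DATA);
        Set<Role> dataAndRemote = EnumSet.of(Role.DATA, Role.REMOTE_CLUSTER_CLIENT);
        System.out.println(canAllocate(bootstrappingPrimary, dataOnly));
        System.out.println(canAllocate(bootstrappingPrimary, dataAndRemote));
    }
}
```

In this model, a bootstrapping follower primary placed on a data-only node is rejected and stays unassigned until a node with the remote cluster client role is available. That would be consistent with the test timing out while waiting for green, though the issue itself does not spell out the exact sequence.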
dnhatn added a commit to dnhatn/elasticsearch that referenced this issue Jul 14, 2020
…stic#59375)
5 participants