[CI] :qa:full-cluster-restart:v7.0.1#upgradedClusterTest failing #91470

Closed
pxsalehi opened this issue Nov 9, 2022 · 3 comments · Fixed by #91765
Labels
:Distributed Coordination/Allocation: All issues relating to the decision making around placing a shard (both master logic & on the nodes)
Team:Distributed: Meta label for distributed team (obsolete)
>test-failure: Triaged test failures from CI

Comments

@pxsalehi
Member

pxsalehi commented Nov 9, 2022

CI Link

https://gradle-enterprise.elastic.co/s/ptwgurvvtzx5s

Repro line

Probably with ./gradlew :qa:full-cluster-restart:v7.0.1#upgradedClusterTest

Does it reproduce?

Didn't try

Applicable branches

main

Failure history

No response

Failure excerpt

See https://gradle-enterprise.elastic.co/s/ptwgurvvtzx5s/console-log/raw?task=:qa:full-cluster-restart:v7.0.1%23upgradedClusterTest

It also fails for other BWC versions (7.0.0 to 7.1.1): https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+main+periodic+bwc/

» [2022-11-09T13:11:28,587][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [v7.0.1-1] fatal error in thread [elasticsearch[v7.0.1-1][masterService#updateTask][T#1]], exiting java.lang.AssertionError: {[testclosedindices][0]=2}
»  	at [email protected]/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceReconciler.allocateUnassignedInvariant(DesiredBalanceReconciler.java:118)
»  	at [email protected]/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceReconciler.run(DesiredBalanceReconciler.java:81)
»  	at [email protected]/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator.recordTime(DesiredBalanceShardsAllocator.java:299)
»  	at [email protected]/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator.reconcile(DesiredBalanceShardsAllocator.java:213)
»  	at [email protected]/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$ReconcileDesiredBalanceExecutor.lambda$applyBalance$1(DesiredBalanceShardsAllocator.java:277)
»  	at [email protected]/org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:518)
»  	at [email protected]/org.elasticsearch.cluster.routing.allocation.AllocationService.executeWithRoutingAllocation(AllocationService.java:444)
»  	at [email protected]/org.elasticsearch.cluster.ClusterModule.reconcile(ClusterModule.java:145)
»  	at [email protected]/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$ReconcileDesiredBalanceExecutor.applyBalance(DesiredBalanceShardsAllocator.java:275)
»  	at [email protected]/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$ReconcileDesiredBalanceExecutor.execute(DesiredBalanceShardsAllocator.java:261)
»  	at [email protected]/org.elasticsearch.cluster.service.MasterService.innerExecuteTasks(MasterService.java:1052)
»  	at [email protected]/org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:1017)
»  	at [email protected]/org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:278)
»  	at [email protected]/org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:170)
»  	at [email protected]/org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:110)
»  	at [email protected]/org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:148)
»  	at [email protected]/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:850)
»  	at [email protected]/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:257)
»  	at [email protected]/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:223)
»  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
»  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
»  	at java.base/java.lang.Thread.run(Thread.java:833)
pxsalehi added the >test-failure and :Distributed Coordination/Allocation labels on Nov 9, 2022
elasticsearchmachine added the Team:Distributed label on Nov 9, 2022
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@mark-vieira
Contributor

mark-vieira commented Nov 10, 2022

@DaveCTurner Given when this started failing and the stack traces, could #91343 be the culprit here? It looks like this is limited to upgrades from nodes earlier than 7.2.0.

@DaveCTurner
Contributor

Yeah, or at least #91343 introduced the assertion that's tripping here. It's not yet clear why it's tripping, but closed indices from pre-7.2 nodes are a special and interesting corner case (#33888). If I don't spot an obvious problem soon, I'll try to work out a way to mute these tests.

DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Nov 21, 2022
This assertion fails in the presence of pre-7.2.0 closed indices because
such indices don't even have routing table entries.

Relates elastic#33888
Closes elastic#91470
elasticsearchmachine pushed a commit that referenced this issue Nov 21, 2022
This assertion fails in the presence of pre-7.2.0 closed indices because
such indices don't even have routing table entries.

Relates #33888 Closes #91470
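
For readers following along, the explanation in the commit message can be illustrated with a minimal sketch. This is a toy model and not the Elasticsearch code: `RoutingInvariantSketch`, `IndexMeta`, `unaccountedCopies`, `shardKey`, and the `skipLegacyClosed` flag are all hypothetical names invented for illustration, and the real check in `DesiredBalanceReconciler` may be structured quite differently. The idea is that an invariant which expects every shard copy listed in the index metadata to appear in the routing table will report leftovers for a closed index that has no routing table entries, unless such indices are skipped.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model. None of these types are Elasticsearch classes.
final class RoutingInvariantSketch {

    // Toy index metadata: name, shard count, copies per shard (primary plus replicas), and two flags.
    static final class IndexMeta {
        final String name;
        final int shards;
        final int copiesPerShard;
        final boolean closed;
        final boolean createdBefore72; // assumption: stands in for "closed on a pre-7.2.0 node"

        IndexMeta(String name, int shards, int copiesPerShard, boolean closed, boolean createdBefore72) {
            this.name = name;
            this.shards = shards;
            this.copiesPerShard = copiesPerShard;
            this.closed = closed;
            this.createdBefore72 = createdBefore72;
        }
    }

    static String shardKey(String index, int shard) {
        return "[" + index + "][" + shard + "]";
    }

    // Returns the shard copies that the metadata expects but the routing table does not track.
    // An empty result means the (toy) invariant holds.
    static Map<String, Integer> unaccountedCopies(
            List<IndexMeta> indices, Map<String, Integer> routedCopies, boolean skipLegacyClosed) {
        Map<String, Integer> remaining = new TreeMap<>();
        for (IndexMeta index : indices) {
            if (skipLegacyClosed && index.closed && index.createdBefore72) {
                continue; // assumed to have no routing table entries at all, so nothing to check
            }
            for (int shard = 0; shard < index.shards; shard++) {
                String key = shardKey(index.name, shard);
                int missing = index.copiesPerShard - routedCopies.getOrDefault(key, 0);
                if (missing != 0) {
                    remaining.put(key, missing);
                }
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<IndexMeta> indices = List.of(
            new IndexMeta("testclosedindices", 1, 2, true, true),
            new IndexMeta("openindex", 1, 2, false, false));
        // Only the open index has routing entries in this toy cluster state.
        Map<String, Integer> routed = Map.of(shardKey("openindex", 0), 2);

        System.out.println(unaccountedCopies(indices, routed, false)); // {[testclosedindices][0]=2}
        System.out.println(unaccountedCopies(indices, routed, true));  // {}
    }
}
```

In this toy model, a closed index with one primary and one replica and no routing entries leaves a count of 2 for shard 0, which resembles the assertion message in the failure excerpt. Whether the actual fix skips such indices or handles them some other way is not stated in this thread.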