Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] QueueResizingEsThreadPoolExecutorTests.testAutoQueueSizingWithMax + thread leakage #30740

Closed
vladimirdolzhenko opened this issue May 19, 2018 · 12 comments
Assignees
Labels
:Core/Infra/Core Core issues without another label Team:Core/Infra Meta label for core/infra team >test-failure Triaged test failures from CI v6.2.5

Comments

@vladimirdolzhenko
Copy link
Contributor

Log:

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.2+periodic/704/console

REPRODUCE WITH: ./gradlew :server:test -Dtests.seed=FA63CC5EE34DF6A8 -Dtests.class=org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests -Dtests.method="testAutoQueueSizingWithMax" -Dtests.security.manager=true -Dtests.locale=ru-RU -Dtests.timezone=SystemV/EST5

1> [2018-05-18T15:02:47,981][INFO ][o.e.c.u.c.QueueResizingEsThreadPoolExecutorTests] --> executor: QueueResizingEsThreadPoolExecutor[name = test-threadpool, queue capacity = 5000, min queue capacity = 10, max queue capacity = 5024, frame size = 23, targeted response rate = 1ms, task execution EWMA = 0s, adjustment amount = 50, org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutor@af1c9b0[Running, pool size = 3, active threads = 0, queued tasks = 0, completed tasks = 0]]
  1> [2018-05-18T15:02:47,982][INFO ][o.e.c.u.c.QueueResizingEsThreadPoolExecutorTests] --> executing a task [69] times
  1> [2018-05-18T15:02:57,986][INFO ][o.e.c.u.c.QueueResizingEsThreadPoolExecutorTests] [testAutoQueueSizingWithMax]: after test
FAILURE 10.0s J1 | QueueResizingEsThreadPoolExecutorTests.testAutoQueueSizingWithMax <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: 
   > Expected: <5024>
   >      but: was <4974>
   > 	at __randomizedtesting.SeedInfo.seed([FA63CC5EE34DF6A8:66709390C577231B]:0)
   > 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   > 	at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests.lambda$testAutoQueueSizingWithMax$7(QueueResizingEsThreadPoolExecutorTests.java:180)
   > 	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:731)
   > 	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:705)
   > 	at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests.testAutoQueueSizingWithMax(QueueResizingEsThreadPoolExecutorTests.java:179)
   > 	at java.lang.Thread.run(Thread.java:748)

+ thred leakage there

2> SEVERE: 3 threads leaked from SUITE scope at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests: 
  2>    1) Thread[id=1873, name=queuetest[T#2], state=WAITING, group=TGRP-QueueResizingEsThreadPoolExecutorTests]
  2>         at sun.misc.Unsafe.park(Native Method)
  2>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  2>         at java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
  2>         at java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
  2>         at java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
  2>         at org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:165)
  2>         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
  2>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
  2>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  2>         at java.lang.Thread.run(Thread.java:748)
  2>    2) Thread[id=1872, name=queuetest[T#1], state=WAITING, group=TGRP-QueueResizingEsThreadPoolExecutorTests]
  2>         at sun.misc.Unsafe.park(Native Method)
  2>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  2>         at java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
  2>         at java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
  2>         at java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
  2>         at org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:165)
  2>         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
  2>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
  2>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  2>         at java.lang.Thread.run(Thread.java:748)
ERROR   0.00s J1 | QueueResizingEsThreadPoolExecutorTests (suite) <<< FAILURES!
  2>    3) Thread[id=1874, name=queuetest[T#3], state=WAITING, group=TGRP-QueueResizingEsThreadPoolExecutorTests]
   > Throwable #1: com.carrotsearch.randomizedtesting.ThreadLeakError: 3 threads leaked from SUITE scope at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests: 
  2>         at sun.misc.Unsafe.park(Native Method)
@vladimirdolzhenko vladimirdolzhenko added :Core/Infra/Core Core issues without another label >test-failure Triaged test failures from CI v6.2.5 labels May 19, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@jasontedor
Copy link
Member

@dakrone Would you take a look?

@dakrone
Copy link
Member

dakrone commented May 21, 2018

Yep I'll take a look

dakrone added a commit that referenced this issue May 21, 2018
Enables debug logging on
QueueResizingEsThreadPoolExecutorTests#testAutoQueueSizingWithMax

Relates to #30740
dakrone added a commit that referenced this issue May 21, 2018
Enables debug logging on
QueueResizingEsThreadPoolExecutorTests#testAutoQueueSizingWithMax

Relates to #30740
@dakrone
Copy link
Member

dakrone commented May 21, 2018

I pushed a commit to enable debug logging for this test, unfortunately I wasn't able to reproduce it and there's not much info from the failure currently. Hopefully if it fails again we'll have more information.

@bizybot
Copy link
Contributor

bizybot commented Sep 17, 2018

Hi @dakrone, I see this failure on 6.x build.
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+matrix-java-periodic/ES_BUILD_JAVA=java10,ES_RUNTIME_JAVA=java8fips,nodes=virtual&&linux/7/consoleText
I think this will have your debug logs, could you please take a look? Thank you.

danielmitterdorfer pushed a commit that referenced this issue Sep 20, 2018
Enables debug logging on
QueueResizingEsThreadPoolExecutorTests#testAutoQueueSizingWithMax

Relates to #30740
@danielmitterdorfer
Copy link
Member

@dakrone unfortunately the commit to enable debug logging was missing on 6.x so I took the liberty of cherry-picking it for you in a204f93.

@bizybot, @atorok As a suggestion: I think it would make sense in such cases to paste the relevant log snippet here on the ticket and not just refer to the CI links because the logs are wiped after some time (IIRC the retention period is two weeks). While it is possible to get them after that period as well it can be a bit of a hassle.

For completeness, here is the snippet containing the relevant output from recent test failure referred to by Alpar above (which is actually the same link as the one from Yogesh):

  1> [2018-09-17T19:17:26,986][INFO ][o.e.c.u.c.QueueResizingEsThreadPoolExecutorTests] [testAutoQueueSizingWithMax] before test
  1> [2018-09-17T19:17:26,986][INFO ][o.e.c.u.c.QueueResizingEsThreadPoolExecutorTests] [testAutoQueueSizingWithMax] --> auto-queue with a measurement window of 29 tasks
  1> [2018-09-17T19:17:26,987][INFO ][o.e.c.u.c.QueueResizingEsThreadPoolExecutorTests] [testAutoQueueSizingWithMax] --> executor: QueueResizingEsThreadPoolExecutor[name = test-threadpool, queue capacity = 5000, min queue capacity = 10, max queue capacity = 5021, frame size = 29, targeted response rate = 1ms, task execution EWMA = 0s, adjustment amount = 50, org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutor@13a240ca[Running, pool size = 5, active threads = 1, queued tasks = 0, completed tasks = 0]]
  1> [2018-09-17T19:17:26,988][INFO ][o.e.c.u.c.QueueResizingEsThreadPoolExecutorTests] [testAutoQueueSizingWithMax] --> executing a task [87] times
  1> [2018-09-17T19:17:36,995][INFO ][o.e.c.u.c.QueueResizingEsThreadPoolExecutorTests] [testAutoQueueSizingWithMax] after test

As you see the output only contains INFO messages but no DEBUG logs.

@jakelandis
Copy link
Contributor

Looks like 6.x missed the @TestLogging import, so I added it : 6c44fbb

dakrone added a commit to dakrone/elasticsearch that referenced this issue Jan 31, 2019
It was already added to `testAutoQueueSizingWithMax`, however, that has not
recently failed. There was a failure for `testAutoQueueSizingWithMin` in mid
December without any extra logs.

Relates to elastic#30740
@dakrone
Copy link
Member

dakrone commented Jan 31, 2019

Closing this as it has not failed on 6.x and up for a while

@dakrone dakrone closed this as completed Jan 31, 2019
@jkakavas jkakavas reopened this Mar 18, 2020
@jkakavas
Copy link
Member

I'm re-opening this instead of a new issue as the new failure ( in 7.x this time ) is identical.

https://gradle-enterprise.elastic.co/s/dc6jwya7fgzoc/console-log?task=:server:test

org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests > testAutoQueueSizingWithMin FAILED
    java.lang.AssertionError: 
    Expected: <4997>
         but: was <5047>
        at __randomizedtesting.SeedInfo.seed([B7B36B46920A59FE:F403FA14F0E86240]:0)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.junit.Assert.assertThat(Assert.java:956)
        at org.junit.Assert.assertThat(Assert.java:923)
        at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests.lambda$testAutoQueueSizingWithMin$6(QueueResizingEsThreadPoolExecutorTests.java:147)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:906)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:879)
        at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests.testAutoQueueSizingWithMin(QueueResizingEsThreadPoolExecutorTests.java:146)
REPRODUCE WITH: ./gradlew ':server:test' --tests "org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests.testAutoQueueSizingWithMin" -Dtests.seed=B7B36B46920A59FE -Dtests.security.manager=true -Dtests.locale=he -Dtests.timezone=Asia/Vladivostok -Dcompiler.java=13

@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@rjernst
Copy link
Member

rjernst commented Aug 6, 2020

This test failed once more in June, but the feature has also been removed from master (#52280). Given this rarely fails, and it can only do so on 7.x branches, I'm going to close this.

@rjernst rjernst closed this as completed Aug 6, 2020
@benwtrent
Copy link
Member

Failed on 7.x again: https://gradle-enterprise.elastic.co/s/ntpidbhmvfftc

org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests > testAutoQueueSizingWithMax FAILED |  
-- | --
  | java.lang.AssertionError: |  
  | Expected: <5018> |  
  | but: was <4968> |  
  | at __randomizedtesting.SeedInfo.seed([D494DCC0A5B4A3EC:4887830E838E765F]:0) |  
  | at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18) |  
  | at org.junit.Assert.assertThat(Assert.java:956) |  
  | at org.junit.Assert.assertThat(Assert.java:923) |  
  | at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests.lambda$testAutoQueueSizingWithMax$7(QueueResizingEsThreadPoolExecutorTests.java:165) |  
  | at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1013) |  
  | at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:986) |  
  | at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests.testAutoQueueSizingWithMax(QueueResizingEsThreadPoolExecutorTests.java:164)
./gradlew ':server:test' --tests "org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests.testAutoQueueSizingWithMax" -Dtests.seed=D494DCC0A5B4A3EC -Dtests.security.manager=true -Dtests.locale=et-EE -Dtests.timezone=Asia/Barnaul -Druntime.java=8
com.carrotsearch.randomizedtesting.ThreadLeakError: 4 threads leaked from SUITE scope at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutorTests: |  
-- | --
  | 1) Thread[id=418, name=queuetest[T#3], state=WAITING, group=TGRP-QueueResizingEsThreadPoolExecutorTests] |  
  | at sun.misc.Unsafe.park(Native Method) |  
  | at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) |  
  | at java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737) |  
  | at java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647) |  
  | at java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269) |  
  | at org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:154) |  
  | at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) |  
  | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) |  
  | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) |  
  | at java.lang.Thread.run(Thread.java:748) |  
  | 2) Thread[id=419, name=queuetest[T#4], state=WAITING, group=TGRP-QueueResizingEsThreadPoolExecutorTests] |  
  | at sun.misc.Unsafe.park(Native Method) |  
  | at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) |  
  | at java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737) |  
  | at java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647) |  
  | at java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269) |  
  | at org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:154) |  
  | at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) |  
  | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) |  
  | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) |  
  | at java.lang.Thread.run(Thread.java:748) |  
  | 3) Thread[id=416, name=queuetest[T#1], state=WAITING, group=TGRP-QueueResizingEsThreadPoolExecutorTests] |  
  | at sun.misc.Unsafe.park(Native Method) |  
  | at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) |  
  | at java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737) |  
  | at java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647) |  
  | at java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269) |  
  | at org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:154) |  
  | at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) |  
  | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) |  
  | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) |  
  | at java.lang.Thread.run(Thread.java:748) |  
  | 4) Thread[id=417, name=queuetest[T#2], state=WAITING, group=TGRP-QueueResizingEsThreadPoolExecutorTests] |  
  | at sun.misc.Unsafe.park(Native Method) |  
  | at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) |  
  | at java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737) |  
  | at java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647) |  
  | at java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269) |  
  | at org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:154) |  
  | at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) |  
  | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) |  
  | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) |  
  | at java.lang.Thread.run(Thread.java:748) |  
  | at __randomizedtesting.SeedInfo.seed([D494DCC0A5B4A3EC]:0)

It has failed 4 times since December.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Core Core issues without another label Team:Core/Infra Meta label for core/infra team >test-failure Triaged test failures from CI v6.2.5
Projects
None yet
Development

No branches or pull requests