[CI] HeapAttackIT failing #102784

Closed
breskeby opened this issue Nov 29, 2023 · 5 comments · Fixed by #102831
Labels
:Analytics/ES|QL AKA ESQL medium-risk An open issue or test failure that is a medium risk to future releases Team:QL (Deprecated) Meta label for query languages team >test-failure Triaged test failures from CI

Comments

@breskeby
Contributor

The HeapAttackIT test has been failing daily since November 22.

Build scan:
https://gradle-enterprise.elastic.co/s/h2qxzbqn5vq2m/tests/:x-pack:plugin:esql:qa:server:single-node:javaRestTest/org.elasticsearch.xpack.esql.qa.single_node.HeapAttackIT/testSortByManyLongsTooMuchMemory

Reproduction line:

./gradlew ':x-pack:plugin:esql:qa:server:single-node:javaRestTest' --tests "org.elasticsearch.xpack.esql.qa.single_node.HeapAttackIT.testSortByManyLongsTooMuchMemory" -Dtests.seed=E5F40F3AD7D8520 -Dtests.configure_test_clusters_with_one_processor=true -Dtests.locale=hr -Dtests.timezone=Australia/NSW -Druntime.java=21

Applicable branches:
main

Reproduces locally?:
Didn't try

Failure history:
https://es-delivery-stats.elastic.dev/app/dashboards#/view/dcec9e60-72ac-11ee-8f39-55975ded9e63?_g=(refreshInterval:(pause:!t,value:60000),time:(from:now-7d%2Fd,to:now))&_a=(controlGroupInput:(chainingSystem:HIERARCHICAL,controlStyle:twoLine,ignoreParentSettings:(ignoreFilters:!f,ignoreQuery:!f,ignoreTimerange:!f,ignoreValidations:!t),panels:('0c0c9cb8-ccd2-45c6-9b13-96bac4abc542':(explicitInput:(dataViewId:fbbdc689-be23-4b3d-8057-aa402e9ed0c5,enhancements:(),fieldName:task.keyword,grow:!t,id:'0c0c9cb8-ccd2-45c6-9b13-96bac4abc542',searchTechnique:wildcard,selectedOptions:!(),singleSelect:!t,title:'Gradle%20Task',width:medium),grow:!t,order:0,type:optionsListControl,width:small),'144933da-5c1b-4257-a969-7f43455a7901':(explicitInput:(dataViewId:fbbdc689-be23-4b3d-8057-aa402e9ed0c5,enhancements:(),fieldName:name.keyword,grow:!t,id:'144933da-5c1b-4257-a969-7f43455a7901',searchTechnique:wildcard,selectedOptions:!('testSortByManyLongsTooMuchMemory'),title:Test,width:medium),grow:!t,order:2,type:optionsListControl,width:medium),'4e6ad9d6-6fdc-4fcc-bf1a-aa6ca79e0850':(explicitInput:(dataViewId:fbbdc689-be23-4b3d-8057-aa402e9ed0c5,enhancements:(),fieldName:className.keyword,grow:!t,id:'4e6ad9d6-6fdc-4fcc-bf1a-aa6ca79e0850',searchTechnique:wildcard,selectedOptions:!('org.elasticsearch.xpack.esql.qa.single_node.HeapAttackIT'),title:Suite,width:medium),grow:!t,order:1,type:optionsListControl,width:medium))))

Failure excerpt:

junit.framework.AssertionFailedError: Unexpected exception type, expected ResponseException but got java.io.IOException: Connection reset

  at org.apache.lucene.tests.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2869)
  at org.apache.lucene.tests.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2850)
  at org.elasticsearch.xpack.esql.qa.single_node.HeapAttackIT.assertCircuitBreaks(HeapAttackIT.java:77)
  at org.elasticsearch.xpack.esql.qa.single_node.HeapAttackIT.testSortByManyLongsTooMuchMemory(HeapAttackIT.java:73)
  at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
  at java.lang.reflect.Method.invoke(Method.java:580)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
  at java.lang.Thread.run(Thread.java:1583)

  Caused by: java.io.IOException: Connection reset

    at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:939)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:304)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:307)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:292)
    at org.elasticsearch.xpack.esql.qa.single_node.HeapAttackIT.query(HeapAttackIT.java:285)
    at org.elasticsearch.xpack.esql.qa.single_node.HeapAttackIT.sortByManyLongs(HeapAttackIT.java:94)
    at org.elasticsearch.xpack.esql.qa.single_node.HeapAttackIT.lambda$testSortByManyLongsTooMuchMemory$0(HeapAttackIT.java:73)
    at org.apache.lucene.tests.util.LuceneTestCase._expectThrows(LuceneTestCase.java:3022)
    at org.apache.lucene.tests.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2859)
    at org.apache.lucene.tests.util.LuceneTestCase.expectThrows(LuceneTestCase.java:2850)
    at org.elasticsearch.xpack.esql.qa.single_node.HeapAttackIT.assertCircuitBreaks(HeapAttackIT.java:77)
    at org.elasticsearch.xpack.esql.qa.single_node.HeapAttackIT.testSortByManyLongsTooMuchMemory(HeapAttackIT.java:73)
    at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
    at java.lang.reflect.Method.invoke(Method.java:580)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
    at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
    at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
    at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
    at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
    at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
    at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
    at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
    at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
    at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
    at java.lang.Thread.run(Thread.java:1583)

    Caused by: java.net.SocketException: Connection reset

      at sun.nio.ch.SocketChannelImpl.throwConnectionReset(SocketChannelImpl.java:401)
      at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:434)
      at org.apache.http.impl.nio.reactor.SessionInputBufferImpl.fill(SessionInputBufferImpl.java:231)
      at org.apache.http.impl.nio.codecs.AbstractMessageParser.fillBuffer(AbstractMessageParser.java:136)
      at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:241)
      at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:87)
      at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:40)
      at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
      at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
      at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
      at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
      at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
      at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
      at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
      at java.lang.Thread.run(Thread.java:1583)
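
For readers unfamiliar with the assertion that failed: the test expects the query to be rejected with a `ResponseException`, but the client saw the connection drop instead. The following is a minimal, self-contained sketch of that check, not the Lucene or Elasticsearch code. The `ResponseException` class here is a hypothetical stand-in, `expectThrows` is a simplified re-implementation, and the 429 status is an illustrative value.

```java
import java.io.IOException;

public class ExpectThrowsSketch {
    /** Hypothetical stand-in for the REST client's exception that carries an HTTP status. */
    static class ResponseException extends IOException {
        final int status;

        ResponseException(int status, String message) {
            super(message);
            this.status = status;
        }
    }

    @FunctionalInterface
    interface ThrowingRunnable {
        void run() throws Throwable;
    }

    /** Simplified expectThrows: passes only if the body throws an instance of the expected type. */
    static <T extends Throwable> T expectThrows(Class<T> expected, ThrowingRunnable body) {
        try {
            body.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError(
                "Unexpected exception type, expected " + expected.getSimpleName() + " but got " + t, t);
        }
        throw new AssertionError("Expected " + expected.getSimpleName() + " but nothing was thrown");
    }

    /** Runs the check and describes the outcome as a string. */
    static String outcome(ThrowingRunnable body) {
        try {
            ResponseException e = expectThrows(ResponseException.class, body);
            return "circuit breaker tripped with HTTP " + e.status;
        } catch (AssertionError e) {
            return "test failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Expected path: the server rejects the request with an error response.
        System.out.println(outcome(() -> { throw new ResponseException(429, "circuit_breaking_exception"); }));
        // Observed path: the connection is dropped, so a plain IOException surfaces.
        System.out.println(outcome(() -> { throw new IOException("Connection reset"); }));
    }
}
```

The second call reproduces the shape of the message in the excerpt above: a plain `IOException` is not a `ResponseException`, so the type check fails even though an exception was thrown.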

@breskeby breskeby added :Analytics/EQL EQL querying >test-failure Triaged test failures from CI labels Nov 29, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine elasticsearchmachine added the Team:QL (Deprecated) Meta label for query languages team label Nov 29, 2023
@breskeby breskeby changed the title from "[CI] HeapAttackIT testSortByManyLongsTooMuchMemory failing" to "[CI] HeapAttackIT failing" Nov 29, 2023
@nik9000
Member

nik9000 commented Nov 29, 2023

@alex-spies or @dnhatn would you like this or should I dig?

@dnhatn
Member

dnhatn commented Nov 29, 2023

@nik9000 I can take this.

@dnhatn dnhatn self-assigned this Nov 29, 2023
@dnhatn dnhatn added medium-risk An open issue or test failure that is a medium risk to future releases :Analytics/ES|QL AKA ESQL and removed blocker :Analytics/EQL EQL querying labels Nov 29, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

@dnhatn
Member

dnhatn commented Nov 30, 2023

I think I have found the issue. We are not tracking memory in TopNOperator. I will work on the fix shortly.
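
To illustrate why untracked memory matters here, below is a rough sketch under stated assumptions, not the actual `TopNOperator` or circuit breaker code. `SimpleBreaker`, `reserve`, and `collect` are hypothetical names, and the byte sizes are made up. The point is that a breaker can only trip on bytes it is told about, so an allocation that is never reported lets real usage grow past the limit. Presumably that is how the node ends up in trouble and the client sees a reset connection rather than a clean error response.

```java
public class BreakerSketch {
    /** Hypothetical, simplified breaker: counts reserved bytes and refuses to exceed a limit. */
    static class SimpleBreaker {
        private final long limitBytes;
        private long usedBytes;

        SimpleBreaker(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        void reserve(long bytes) {
            if (usedBytes + bytes > limitBytes) {
                throw new IllegalStateException(
                    "circuit breaker tripped: would use " + (usedBytes + bytes) + " of " + limitBytes + " bytes");
            }
            usedBytes += bytes;
        }
    }

    /**
     * Simulates collecting rows. Each row holds keyBytes plus auxBytes of auxiliary state.
     * When trackAux is false, only the key bytes are reported to the breaker.
     * Returns the bytes actually held once all rows are collected.
     */
    static long collect(SimpleBreaker breaker, int rows, long keyBytes, long auxBytes, boolean trackAux) {
        long actuallyHeld = 0;
        for (int i = 0; i < rows; i++) {
            breaker.reserve(trackAux ? keyBytes + auxBytes : keyBytes);
            actuallyHeld += keyBytes + auxBytes;
        }
        return actuallyHeld;
    }

    public static void main(String[] args) {
        // Auxiliary state not reported: the breaker never trips, yet real usage exceeds the limit.
        long held = collect(new SimpleBreaker(1000), 10, 50, 100, false);
        System.out.println("untracked: held " + held + " bytes against a 1000 byte limit");

        // Auxiliary state reported: the breaker trips before the limit is exceeded.
        try {
            collect(new SimpleBreaker(1000), 10, 50, 100, true);
        } catch (IllegalStateException e) {
            System.out.println("tracked: " + e.getMessage());
        }
    }
}
```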

dnhatn added a commit that referenced this issue Dec 1, 2023
This commit addresses the issue of missing memory tracking for the
BitSet in TopN.Row. Instead of introducing BreakingBitSet, we replace
the BitSet with a smaller array of offsets in this PR. Nik suggested
removing that BitSet, but I haven't looked into that option yet.

Closes #100640
Closes #102683
Closes #102790
Closes #102784
dnhatn added a commit to dnhatn/elasticsearch that referenced this issue Dec 1, 2023
elasticsearchmachine pushed a commit that referenced this issue Dec 1, 2023