ReindexTestCase.testMultipleSources flaky test #6542

Closed
dreamer-89 opened this issue Mar 4, 2023 · 5 comments
Labels: bug (Something isn't working), flaky-test (Random test failure that succeeds on second run), Indexing (Indexing, Bulk Indexing and anything related to indexing)

Comments

@dreamer-89
Member

dreamer-89 commented Mar 4, 2023

Test failure: ReindexBasicTests.testMultipleSources

java.lang.AssertionError: AcknowledgedResponse failed - not acked
Expected: <true>
     but: was <false>

Stacktrace

java.lang.AssertionError: AcknowledgedResponse failed - not acked
Expected: <true>
     but: was <false>
	at __randomizedtesting.SeedInfo.seed([30E7310E449AFEF5:B74E1D1AAED050A1]:0)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
	at org.opensearch.test.hamcrest.OpenSearchAssertions.assertAcked(OpenSearchAssertions.java:125)
	at org.opensearch.test.hamcrest.OpenSearchAssertions.assertAcked(OpenSearchAssertions.java:129)
	at org.opensearch.test.TestCluster.wipeIndices(TestCluster.java:181)
	at org.opensearch.test.TestCluster.wipe(TestCluster.java:92)
	at org.opensearch.test.OpenSearchIntegTestCase.afterInternal(OpenSearchIntegTestCase.java:622)
	at org.opensearch.test.OpenSearchIntegTestCase.cleanUpCluster(OpenSearchIntegTestCase.java:2409)

Previous issue

Identified in #6541
Unstable Gradle link: https://build.ci.opensearch.org/job/gradle-check/11960

> Task :modules:reindex:test

REPRODUCE WITH: ./gradlew ':modules:reindex:test' --tests "org.opensearch.index.reindex.UpdateByQueryBasicTests.testMultipleSources" -Dtests.seed=D9FD8950FAC3D84E -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=ga-IE -Dtests.timezone=Pacific/Galapagos -Druntime.java=17

org.opensearch.index.reindex.UpdateByQueryBasicTests > testMultipleSources FAILED
    java.lang.AssertionError: 
    Expected: an empty iterable
         but: [<RemoteTransportException[[node_s0][127.0.0.1:40405][indices:admin/mapping/auto_put]]; nested: ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [test0/MqJv_Pi5RAm0_2lygw970Q]) within 10s];>,<RemoteTransportException[[node_s0][127.0.0.1:40405][indices:admin/mapping/auto_put]]; nested: ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [test4/A4cPfs6tS4-5ktGzOACsvg]) within 10s];>,<RemoteTransportException[[node_s0][127.0.0.1:40405][indices:admin/mapping/auto_put]]; nested: ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [test0/MqJv_Pi5RAm0_2lygw970Q]) within 10s];>,<RemoteTransportException[[node_s0][127.0.0.1:40405][indices:admin/mapping/auto_put]]; nested: ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [test1/0QerPEEiSaGdrb5NgDedPw]) within 10s];>,<RemoteTransportException[[node_s0][127.0.0.1:40405][indices:admin/mapping/auto_put]]; nested: ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [test3/Xs3AnHSaRsKvGxQyHznwsA]) within 10s];>,<RemoteTransportException[[node_s0][127.0.0.1:40405][indices:admin/mapping/auto_put]]; nested: ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [test3/Xs3AnHSaRsKvGxQyHznwsA]) within 10s];>,<RemoteTransportException[[node_s0][127.0.0.1:40405][indices:admin/mapping/auto_put]]; nested: ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [test2/OaXk850qQouUMGJcPWIE1g]) within 10s];>,<RemoteTransportException[[node_s0][127.0.0.1:40405][indices:admin/mapping/auto_put]]; nested: ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [test2/OaXk850qQouUMGJcPWIE1g]) within 10s];>,<RemoteTransportException[[node_s0][127.0.0.1:40405][indices:admin/mapping/auto_put]]; nested: ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [test2/OaXk850qQouUMGJcPWIE1g]) 
within 10s];>]
        at __randomizedtesting.SeedInfo.seed([D9FD8950FAC3D84E:5E54A5441089761A]:0)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.junit.Assert.assertThat(Assert.java:964)
        at org.junit.Assert.assertThat(Assert.java:930)
        at org.opensearch.test.OpenSearchIntegTestCase.indexRandom(OpenSearchIntegTestCase.java:1641)
        at org.opensearch.test.OpenSearchIntegTestCase.indexRandom(OpenSearchIntegTestCase.java:1555)
        at org.opensearch.test.OpenSearchIntegTestCase.indexRandom(OpenSearchIntegTestCase.java:1539)
        at org.opensearch.index.reindex.UpdateByQueryBasicTests.testMultipleSources(UpdateByQueryBasicTests.java:146)
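When triaging a failure list like the one above, it can help to count which indices hit the put-mapping timeout. The following is only a sketch: `PutMappingTimeouts` and `countByIndex` are hypothetical names, not part of the OpenSearch test framework, and the regular expression assumes the message format shown in this failure output.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical triage helper; not part of OpenSearch. */
final class PutMappingTimeouts {
    // Matches "(put-mapping [<index>/<uuid>]) within", capturing the index name.
    // \s+ tolerates a line break between "])" and "within".
    private static final Pattern TIMED_OUT =
        Pattern.compile("\\(put-mapping \\[([^/\\]]+)/[^\\]]*\\]\\)\\s+within");

    /** Counts timed-out put-mapping events per index name, sorted by name. */
    static Map<String, Integer> countByIndex(String failureText) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = TIMED_OUT.matcher(failureText);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Shortened sample built from three entries of the failure output above.
        String sample =
            "failed to process cluster event (put-mapping [test0/MqJv_Pi5RAm0_2lygw970Q]) within 10s];"
            + "failed to process cluster event (put-mapping [test4/A4cPfs6tS4-5ktGzOACsvg]) within 10s];"
            + "failed to process cluster event (put-mapping [test0/MqJv_Pi5RAm0_2lygw970Q]) within 10s];";
        System.out.println(countByIndex(sample)); // prints {test0=2, test4=1}
    }
}
```

If every index appears in the result, that would be consistent with a cluster-wide stall rather than a problem with one index, but it does not prove it.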
@dreamer-89 dreamer-89 added bug Something isn't working untriaged flaky-test Random test failure that succeeds on second run labels Mar 4, 2023
@minalsha minalsha removed the untriaged label Mar 6, 2023
@dbwiddis
Member

dbwiddis commented Aug 6, 2023

Unable to reproduce in 500 attempts. The failure was likely caused by a transient network/CPU issue (timeout).

@dbwiddis dbwiddis closed this as completed Aug 6, 2023
@peternied peternied changed the title from "UpdateByQueryBasicTests.testMultipleSources flaky test" to "ReindexTestCase.testMultipleSources flaky test" Nov 3, 2023
@peternied peternied reopened this Nov 3, 2023
@peternied
Member

It looks like the underlying issue is still present in these reindex tests; reopening.

@andrross andrross added the Indexing Indexing, Bulk Indexing and anything related to indexing label Feb 21, 2024
@gaobinlong
Collaborator

This flaky test failed again (console output: https://build.ci.opensearch.org/job/gradle-check/38717/consoleFull). It may be related to this exception:

  1> [2024-05-15T04:34:43,878][INFO ][o.o.n.Node               ] [testMissingSources] initialized
  1> [2024-05-15T04:34:43,888][INFO ][o.o.n.Node               ] [[test_SUITE-TEST_WORKER_VM=[334]-CLUSTER_SEED=[-8404182343929799118]-HASH=[257F78996A6]-cluster[T#2]]] starting ...
  1> [2024-05-15T04:34:43,888][INFO ][o.o.n.Node               ] [[test_SUITE-TEST_WORKER_VM=[334]-CLUSTER_SEED=[-8404182343929799118]-HASH=[257F78996A6]-cluster[T#1]]] starting ...
  1> [2024-05-15T04:34:43,890][INFO ][o.o.n.Node               ] [[test_SUITE-TEST_WORKER_VM=[334]-CLUSTER_SEED=[-8404182343929799118]-HASH=[257F78996A6]-cluster[T#3]]] starting ...
  1> [2024-05-15T04:34:43,900][INFO ][o.o.t.TransportService   ] [[test_SUITE-TEST_WORKER_VM=[334]-CLUSTER_SEED=[-8404182343929799118]-HASH=[257F78996A6]-cluster[T#3]]] publish_address {127.0.0.1:34973}, bound_addresses {[::1]:37079}, {127.0.0.1:34973}
  1> [2024-05-15T04:34:43,900][INFO ][o.o.t.TransportService   ] [[test_SUITE-TEST_WORKER_VM=[334]-CLUSTER_SEED=[-8404182343929799118]-HASH=[257F78996A6]-cluster[T#2]]] publish_address {127.0.0.1:43469}, bound_addresses {[::1]:38627}, {127.0.0.1:43469}
  1> [2024-05-15T04:34:43,901][INFO ][o.o.t.TransportService   ] [[test_SUITE-TEST_WORKER_VM=[334]-CLUSTER_SEED=[-8404182343929799118]-HASH=[257F78996A6]-cluster[T#3]]] Remote clusters initialized successfully.
  1> [2024-05-15T04:34:43,901][INFO ][o.o.t.TransportService   ] [[test_SUITE-TEST_WORKER_VM=[334]-CLUSTER_SEED=[-8404182343929799118]-HASH=[257F78996A6]-cluster[T#2]]] Remote clusters initialized successfully.
  1> [2024-05-15T04:34:43,907][INFO ][o.o.t.TransportService   ] [[test_SUITE-TEST_WORKER_VM=[334]-CLUSTER_SEED=[-8404182343929799118]-HASH=[257F78996A6]-cluster[T#1]]] publish_address {127.0.0.1:46141}, bound_addresses {[::1]:45401}, {127.0.0.1:46141}
  1> [2024-05-15T04:34:43,907][INFO ][o.o.t.TransportService   ] [[test_SUITE-TEST_WORKER_VM=[334]-CLUSTER_SEED=[-8404182343929799118]-HASH=[257F78996A6]-cluster[T#1]]] Remote clusters initialized successfully.
  1> [2024-05-15T04:34:45,391][WARN ][o.o.t.TcpTransport       ] [node_s0] exception caught on transport layer [NioSocketChannel{localAddress=/127.0.0.1:46141, remoteAddress=/127.0.0.1:52448}], closing connection
  1> java.lang.IllegalStateException: transport not ready yet to handle incoming requests
  1> 	at org.opensearch.transport.TransportService.onRequestReceived(TransportService.java:1241) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.NativeMessageHandler.handleRequest(NativeMessageHandler.java:215) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.NativeMessageHandler.handleMessage(NativeMessageHandler.java:147) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.NativeMessageHandler.messageReceived(NativeMessageHandler.java:127) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.InboundHandler.messageReceivedFromPipeline(InboundHandler.java:121) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:113) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:788) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.nativeprotocol.NativeInboundBytesHandler.forwardFragments(NativeInboundBytesHandler.java:156) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.nativeprotocol.NativeInboundBytesHandler.doHandleBytes(NativeInboundBytesHandler.java:93) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:143) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:119) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.nio.MockNioTransport$MockTcpReadWriteHandler.consumeReads(MockNioTransport.java:343) [framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.SocketChannelContext.handleReadBytes(SocketChannelContext.java:246) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.BytesChannelContext.read(BytesChannelContext.java:59) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.EventHandler.handleRead(EventHandler.java:152) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.nio.TestEventHandler.handleRead(TestEventHandler.java:167) [framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.NioSelector.handleRead(NioSelector.java:438) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.NioSelector.processKey(NioSelector.java:264) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.NioSelector.singleLoop(NioSelector.java:191) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.NioSelector.runLoop(NioSelector.java:148) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
  1> [2024-05-15T04:34:45,406][WARN ][o.o.t.TcpTransport       ] [node_s0] exception caught on transport layer [NioSocketChannel{localAddress=/127.0.0.1:46141, remoteAddress=/127.0.0.1:52460}], closing connection
  1> java.lang.IllegalStateException: transport not ready yet to handle incoming requests
  1> 	at org.opensearch.transport.TransportService.onRequestReceived(TransportService.java:1241) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.NativeMessageHandler.handleRequest(NativeMessageHandler.java:215) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.NativeMessageHandler.handleMessage(NativeMessageHandler.java:147) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.NativeMessageHandler.messageReceived(NativeMessageHandler.java:127) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.InboundHandler.messageReceivedFromPipeline(InboundHandler.java:121) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:113) ~[opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:788) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.nativeprotocol.NativeInboundBytesHandler.forwardFragments(NativeInboundBytesHandler.java:156) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.nativeprotocol.NativeInboundBytesHandler.doHandleBytes(NativeInboundBytesHandler.java:93) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:143) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:119) [opensearch-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.nio.MockNioTransport$MockTcpReadWriteHandler.consumeReads(MockNioTransport.java:343) [framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.SocketChannelContext.handleReadBytes(SocketChannelContext.java:246) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.BytesChannelContext.read(BytesChannelContext.java:59) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.EventHandler.handleRead(EventHandler.java:152) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.transport.nio.TestEventHandler.handleRead(TestEventHandler.java:167) [framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.NioSelector.handleRead(NioSelector.java:438) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.NioSelector.processKey(NioSelector.java:264) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.NioSelector.singleLoop(NioSelector.java:191) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at org.opensearch.nio.NioSelector.runLoop(NioSelector.java:148) [opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1> 	at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]

It seems that there are too many connections between the local cluster and the remote clusters.
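For context, the message in the log above ("transport not ready yet to handle incoming requests") reads like a startup guard: a request reached node_s0 before its transport service was accepting requests. Here is a toy model of such a guard, assuming a simple ready flag. `ToyTransport` and its methods are hypothetical names for illustration only and are not OpenSearch's implementation.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy model only; names are hypothetical and this is not OpenSearch code. */
final class ToyTransport {
    // Flipped once the node has finished starting up.
    private final AtomicBoolean accepting = new AtomicBoolean(false);

    /** Marks the transport as ready to serve incoming requests. */
    void acceptIncomingRequests() {
        accepting.set(true);
    }

    /** Rejects any request that arrives before the transport is ready. */
    String onRequest(String action) {
        if (!accepting.get()) {
            throw new IllegalStateException("transport not ready yet to handle incoming requests");
        }
        return "handled " + action;
    }

    public static void main(String[] args) {
        ToyTransport transport = new ToyTransport();
        try {
            transport.onRequest("handshake"); // arrives too early
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        transport.acceptIncomingRequests();
        System.out.println(transport.onRequest("handshake")); // prints "handled handshake"
    }
}
```

Under this reading, the warnings would point to a timing race during node startup rather than to the number of connections, though the log alone does not settle which it is.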

@shiv0408
Member

From the stack trace, it looks like the failure occurs after the test has completed: the assertion fails while cleaning up the indices, so it does not seem to be related to this test in particular.
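The distinction can be read off the frames of the two stack traces in this issue: one passes through `OpenSearchIntegTestCase.cleanUpCluster`, the other through the test method itself. A minimal sketch of that check follows; `FailurePhase` and `classify` are hypothetical names and not part of the OpenSearch test framework.

```java
import java.util.List;

/** Hypothetical triage helper; not part of the OpenSearch test framework. */
final class FailurePhase {
    // Frame name taken from the first stack trace quoted in this issue.
    private static final String CLEANUP_FRAME = "OpenSearchIntegTestCase.cleanUpCluster";

    /** Returns TEST_BODY, CLEANUP, or UNKNOWN based on the first matching frame. */
    static String classify(List<String> frames, String testMethod) {
        for (String frame : frames) {
            if (frame.contains(testMethod)) {
                return "TEST_BODY";
            }
            if (frame.contains(CLEANUP_FRAME)) {
                return "CLEANUP";
            }
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        List<String> first = List.of(
            "at org.opensearch.test.TestCluster.wipeIndices(TestCluster.java:181)",
            "at org.opensearch.test.OpenSearchIntegTestCase.cleanUpCluster(OpenSearchIntegTestCase.java:2409)");
        List<String> second = List.of(
            "at org.opensearch.test.OpenSearchIntegTestCase.indexRandom(OpenSearchIntegTestCase.java:1641)",
            "at org.opensearch.index.reindex.UpdateByQueryBasicTests.testMultipleSources(UpdateByQueryBasicTests.java:146)");
        System.out.println(classify(first, "testMultipleSources"));  // prints CLEANUP
        System.out.println(classify(second, "testMultipleSources")); // prints TEST_BODY
    }
}
```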

@shiv0408
Member

Also, the title of the issue is confusing: ReindexTestCase does not contain any tests, and the linked stack trace relates to the UpdateByQueryBasicTests class.
Closing this issue; I created #13912 and #13913 to track the flaky tests in UpdateByQueryBasicTests and ReindexBasicTests, respectively.
