Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trace analytics ingestion End-to-end tests have been flaky #366

Closed
chenqi0805 opened this issue Oct 6, 2021 · 9 comments
Closed

Trace analytics ingestion End-to-end tests have been flaky #366

chenqi0805 opened this issue Oct 6, 2021 · 9 comments
Assignees
Labels
bug Something isn't working

Comments

@chenqi0805
Copy link
Collaborator

Describe the bug
Recently I am seeing more frequent CI failure on e2e tests due to UnprocessedRequestException when Armeria client sending the requests.

Task :data-prepper-core:rawSpanEndToEndTest

com.amazon.dataprepper.integration.EndToEndRawSpanTest > testPipelineEndToEnd FAILED
    io.grpc.StatusRuntimeException at EndToEndRawSpanTest.java:156
        Caused by: com.linecorp.armeria.client.UnprocessedRequestException at UnprocessedRequestException.java:45
            Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException
                Caused by: java.net.ConnectException at Errors.java:155

To Reproduce
This is produced by github CI.

Screenshots
If applicable, add screenshots to help explain your problem.

Environment (please complete the following information):

  • OS: linux
  • Version: github

Additional context
Add any other context about the problem here.

@chenqi0805 chenqi0805 added bug Something isn't working Beta untriaged labels Oct 6, 2021
@chenqi0805
Copy link
Collaborator Author

One way to mitigate is to introduce retry rule based upon UnprocessedRequestException as here: https://armeria.dev/docs/client-retry/

But we still need to find the root cause

@chenqi0805
Copy link
Collaborator Author

The above PR mitigates the issue. Checking with Armeria community if UnprocessedRequestException could be related to sth wrong on the server side.

@chenqi0805
Copy link
Collaborator Author

Issue persists:

com.amazon.dataprepper.integration.EndToEndRawSpanTest > testPipelineEndToEnd FAILED
    io.grpc.StatusRuntimeException at EndToEndRawSpanTest.java:160
        Caused by: com.linecorp.armeria.client.UnprocessedRequestException at UnprocessedRequestException.java:45
            Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException
                Caused by: java.net.ConnectException at Errors.java:155

Needs to dive deeper

@chenqi0805 chenqi0805 mentioned this issue Oct 7, 2021
5 tasks
@chenqi0805
Copy link
Collaborator Author

Deep dive into data-prepper container logs:

Task :data-prepper-core:logstartcreaterawSpanDataPrepperFromBuild
Logs for container with ID 'd0870704a330a275a67c51fbe1bc88ec6c77c4116c85b4713bd5a86e4ca60a6b'.
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2021-10-08T14:27:39,994 [main] INFO  com.amazon.dataprepper.pipeline.server.DataPrepperServer - Creating Data Prepper server without TLS
2021-10-08T14:27:40,013 [main] INFO        com.amazon.dataprepper.DataPrepper - Using /app/raw-span-e2e-pipeline-latest-release.yml configuration file
2021-10-08T14:27:40,034 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building pipeline [entry-pipeline] from provided configuration
2021-10-08T14:27:40,035 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building [otel_trace_source] as source component for the pipeline [entry-pipeline]
2021-10-08T14:27:40,305 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building buffer for the pipeline [entry-pipeline]
2021-10-08T14:27:40,315 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building preppers for the pipeline [entry-pipeline]
2021-10-08T14:27:40,392 [main] INFO  com.amazon.dataprepper.plugins.prepper.peerforwarder.discovery.StaticPeerListProvider - Found endpoints: [dataprepper1, dataprepper2]
2021-10-08T14:27:40,397 [main] INFO  com.amazon.dataprepper.plugins.prepper.peerforwarder.HashRing - Building hash ring with endpoints: [dataprepper1, dataprepper2]
2021-10-08T14:27:40,420 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building sinks for the pipeline [entry-pipeline]
2021-10-08T14:27:40,421 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building [pipeline] as sink component
2021-10-08T14:27:40,424 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building pipeline [raw-pipeline] from provided configuration
2021-10-08T14:27:40,424 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building [pipeline] as source component for the pipeline [raw-pipeline]
2021-10-08T14:27:40,424 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building buffer for the pipeline [raw-pipeline]
2021-10-08T14:27:40,432 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building preppers for the pipeline [raw-pipeline]
2021-10-08T14:27:40,452 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building sinks for the pipeline [raw-pipeline]
2021-10-08T14:27:40,452 [main] INFO  com.amazon.dataprepper.parser.PipelineParser - Building [opensearch] as sink component
2021-10-08T14:27:40,478 [main] INFO  com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchSink - Initializing OpenSearch sink
2021-10-08T14:27:40,494 [main] INFO  com.amazon.dataprepper.plugins.sink.opensearch.ConnectionConfiguration - Using the username provided in the config.
2021-10-08T14:27:40,765 [main] INFO  com.amazon.dataprepper.plugins.sink.opensearch.ConnectionConfiguration - Using the trust all strategy
2021-10-08T14:27:40,950 [main] ERROR com.amazon.dataprepper.plugins.PluginFactory - Encountered exception while instantiating the plugin OpenSearchSink
java.lang.reflect.InvocationTargetException: null
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source) ~[?:?]
	at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) ~[?:?]
	at java.lang.reflect.Constructor.newInstanceWithCaller(Unknown Source) ~[?:?]
	at java.lang.reflect.Constructor.newInstance(Unknown Source) ~[?:?]
	at com.amazon.dataprepper.plugins.PluginFactory.newPlugin(PluginFactory.java:35) ~[data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.plugins.sink.SinkFactory.newSink(SinkFactory.java:23) ~[data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.parser.PipelineParser.buildSinkOrConnector(PipelineParser.java:160) ~[data-prepper.jar:1.0.0]
	at java.util.stream.ReferencePipeline$3$1.accept(Unknown Source) [?:?]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source) [?:?]
	at java.util.stream.AbstractPipeline.copyInto(Unknown Source) [?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source) [?:?]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(Unknown Source) [?:?]
	at java.util.stream.AbstractPipeline.evaluate(Unknown Source) [?:?]
	at java.util.stream.ReferencePipeline.collect(Unknown Source) [?:?]
	at com.amazon.dataprepper.parser.PipelineParser.buildPipelineFromConfiguration(PipelineParser.java:109) [data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.parser.PipelineParser.parseConfiguration(PipelineParser.java:75) [data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.DataPrepper.execute(DataPrepper.java:132) [data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.DataPrepperExecute.main(DataPrepperExecute.java:33) [data-prepper.jar:1.0.0]
Caused by: java.lang.RuntimeException: Connection refused
	at com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchSink.<init>(OpenSearchSink.java:91) ~[data-prepper.jar:1.0.0]
	... 19 more
Caused by: java.net.ConnectException: Connection refused
	at org.opensearch.client.RestClient.extractAndWrapCause(RestClient.java:892) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.RestClient.performRequest(RestClient.java:296) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.RestClient.performRequest(RestClient.java:283) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1394) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1364) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1334) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.ClusterClient.getSettings(ClusterClient.java:106) ~[data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.plugins.sink.opensearch.IndexStateManagement.checkISMEnabled(IndexStateManagement.java:35) ~[data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchSink.initialize(OpenSearchSink.java:98) ~[data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchSink.<init>(OpenSearchSink.java:88) ~[data-prepper.jar:1.0.0]
	... 19 more
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
	at sun.nio.ch.Net.pollConnectNow(Unknown Source) ~[?:?]
	at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source) ~[?:?]
	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:174) ~[data-prepper.jar:1.0.0]
	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:148) ~[data-prepper.jar:1.0.0]
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:351) ~[data-prepper.jar:1.0.0]
	at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:221) ~[data-prepper.jar:1.0.0]
	at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[data-prepper.jar:1.0.0]
	at java.lang.Thread.run(Unknown Source) ~[?:?]
2021-10-08T14:27:40,966 [main] ERROR com.amazon.dataprepper.parser.PipelineParser - Construction of pipeline components failed, skipping building of pipeline [raw-pipeline] and its connected pipelines
com.amazon.dataprepper.plugins.PluginException: Encountered exception while instantiating the plugin OpenSearchSink
	at com.amazon.dataprepper.plugins.PluginFactory.newPlugin(PluginFactory.java:45) ~[data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.plugins.sink.SinkFactory.newSink(SinkFactory.java:23) ~[data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.parser.PipelineParser.buildSinkOrConnector(PipelineParser.java:160) ~[data-prepper.jar:1.0.0]
	at java.util.stream.ReferencePipeline$3$1.accept(Unknown Source) ~[?:?]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source) ~[?:?]
	at java.util.stream.AbstractPipeline.copyInto(Unknown Source) ~[?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source) ~[?:?]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(Unknown Source) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluate(Unknown Source) ~[?:?]
	at java.util.stream.ReferencePipeline.collect(Unknown Source) ~[?:?]
	at com.amazon.dataprepper.parser.PipelineParser.buildPipelineFromConfiguration(PipelineParser.java:109) [data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.parser.PipelineParser.parseConfiguration(PipelineParser.java:75) [data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.DataPrepper.execute(DataPrepper.java:132) [data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.DataPrepperExecute.main(DataPrepperExecute.java:33) [data-prepper.jar:1.0.0]
Caused by: java.lang.reflect.InvocationTargetException
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source) ~[?:?]
	at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) ~[?:?]
	at java.lang.reflect.Constructor.newInstanceWithCaller(Unknown Source) ~[?:?]
	at java.lang.reflect.Constructor.newInstance(Unknown Source) ~[?:?]
	at com.amazon.dataprepper.plugins.PluginFactory.newPlugin(PluginFactory.java:35) ~[data-prepper.jar:1.0.0]
	... 13 more
Caused by: java.lang.RuntimeException: Connection refused
	at com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchSink.<init>(OpenSearchSink.java:91) ~[data-prepper.jar:1.0.0]
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source) ~[?:?]
	at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) ~[?:?]
	at java.lang.reflect.Constructor.newInstanceWithCaller(Unknown Source) ~[?:?]
	at java.lang.reflect.Constructor.newInstance(Unknown Source) ~[?:?]
	at com.amazon.dataprepper.plugins.PluginFactory.newPlugin(PluginFactory.java:35) ~[data-prepper.jar:1.0.0]
	... 13 more
Caused by: java.net.ConnectException: Connection refused
	at org.opensearch.client.RestClient.extractAndWrapCause(RestClient.java:892) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.RestClient.performRequest(RestClient.java:296) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.RestClient.performRequest(RestClient.java:283) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1394) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1364) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1334) ~[data-prepper.jar:1.0.0]
	at org.opensearch.client.ClusterClient.getSettings(ClusterClient.java:106) ~[data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.plugins.sink.opensearch.IndexStateManagement.checkISMEnabled(IndexStateManagement.java:35) ~[data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchSink.initialize(OpenSearchSink.java:98) ~[data-prepper.jar:1.0.0]
	at com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchSink.<init>(OpenSearchSink.java:88) ~[data-prepper.jar:1.0.0]
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source) ~[?:?]
	at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) ~[?:?]
	at java.lang.reflect.Constructor.newInstanceWithCaller(Unknown Source) ~[?:?]
	at java.lang.reflect.Constructor.newInstance(Unknown Source) ~[?:?]
	at com.amazon.dataprepper.plugins.PluginFactory.newPlugin(PluginFactory.java:35) ~[data-prepper.jar:1.0.0]
	... 13 more
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
	at sun.nio.ch.Net.pollConnectNow(Unknown Source) ~[?:?]
	at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source) ~[?:?]
	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:174) ~[data-prepper.jar:1.0.0]
	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:148) ~[data-prepper.jar:1.0.0]
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:351) ~[data-prepper.jar:1.0.0]
	at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:221) ~[data-prepper.jar:1.0.0]
	at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[data-prepper.jar:1.0.0]
	at java.lang.Thread.run(Unknown Source) ~[?:?]
2021-10-08T14:27:40,968 [main] ERROR       com.amazon.dataprepper.DataPrepper - No valid pipeline is available for execution, exiting

@chenqi0805
Copy link
Collaborator Author

Also seeing similar error in opensearch sink integration with opensearch:

com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchTests > testOpenSearchConnection FAILED
    org.apache.http.ConnectionClosedException: Connection is closed
        at org.opensearch.client.RestClient.extractAndWrapCause(RestClient.java:882)
        at org.opensearch.client.RestClient.performRequest(RestClient.java:296)

Suite: Test class com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchTests
  2> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
  2> SLF4J: Defaulting to no-operation (NOP) logger implementation
  2> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
  2> org.apache.http.ConnectionClosedException: Connection is closed
        at org.opensearch.client.RestClient.extractAndWrapCause(RestClient.java:882)
        at org.opensearch.client.RestClient.performRequest(RestClient.java:296)
        at org.opensearch.client.RestClient.performRequest(RestClient.java:283)
        at com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchTests.testOpenSearchConnection(OpenSearchTests.java:49)

        Caused by:
        org.apache.http.ConnectionClosedException: Connection is closed
            at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.endOfInput(HttpAsyncRequestExecutor.java:356)
            at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:261)
            at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
            at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
            at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:121)
            at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
            at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
            at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
            at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
            at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
            at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
            at java.base/java.lang.Thread.run(Thread.java:832)
        at org.opensearch.client.RestClient.performRequest(RestClient.java:283)
        at com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchTests.testOpenSearchConnection(OpenSearchTests.java:49)

        Caused by:
        org.apache.http.ConnectionClosedException: Connection is closed
            at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.endOfInput(HttpAsyncRequestExecutor.java:356)
            at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:261)
            at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
            at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
            at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:121)
            at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
            at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
            at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
            at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
            at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
            at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
            at java.base/java.lang.Thread.run(Thread.java:832)


1 test completed, 1 failed
Tests with failures:
 - com.amazon.dataprepper.plugins.sink.opensearch.OpenSearchTests.testOpenSearchConnection

@chenqi0805
Copy link
Collaborator Author

For ODFE backward integration tests, I have not seen similar issue happening yet. So I infer this is a bug in opensearch engine. Will doublecheck.

@chenqi0805
Copy link
Collaborator Author

chenqi0805 commented Oct 8, 2021

Also check logs in opensearch container:

> Task :data-prepper-core:startOpenSearchDockerContainer
Starting container with ID '74b18a11d3f35b019e9942857270265806b4c1aa76668c44342b6fa15f752a05'.

> Task :data-prepper-core:logOpenSearchDockerContainer
Logs for container with ID '74b18a11d3f35b019e9942857270265806b4c1aa76668c44342b6fa15f752a05'.
Killing opensearch process 11
Killing performance analyzer process 12

@chenqi0805
Copy link
Collaborator Author

Opened an issue on opensearch engine: opensearch-project/OpenSearch#1350

@chenqi0805
Copy link
Collaborator Author

Close since it has not been detected since #403 merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants