[BUG] Neural Search Integ tests are failing on Java 21 due to upload api timing out. #1896

Closed
vibrantvarun opened this issue Jan 22, 2024 · 4 comments

@vibrantvarun
Member

What is the bug?
The integration tests fail because the model upload API times out with the error below:

org.opensearch.neuralsearch.processor.NormalizationProcessorIT > testResultProcessor_whenMultipleShardsAndPartialMatches_thenSuccessful FAILED
    java.net.SocketTimeoutException: 60000 MILLISECONDS
        at org.opensearch.client.RestClient.extractAndWrapCause(RestClient.java:1019)
        at org.opensearch.client.RestClient.performRequest(RestClient.java:342)
        at org.opensearch.client.RestClient.performRequest(RestClient.java:345)
        at org.opensearch.client.RestClient.performRequest(RestClient.java:330)
        at org.opensearch.neuralsearch.BaseNeuralSearchIT.makeRequest(BaseNeuralSearchIT.java:626)
        at org.opensearch.neuralsearch.BaseNeuralSearchIT.makeRequest(BaseNeuralSearchIT.java:599)
        at org.opensearch.neuralsearch.BaseNeuralSearchIT.uploadModel(BaseNeuralSearchIT.java:144)
        at org.opensearch.neuralsearch.BaseNeuralSearchIT.registerModelGroupAndUploadModel(BaseNeuralSearchIT.java:140)
        at org.opensearch.neuralsearch.BaseNeuralSearchIT.prepareModel(BaseNeuralSearchIT.java:206)
        at org.opensearch.neuralsearch.processor.NormalizationProcessorIT.setUp(NormalizationProcessorIT.java:60)

        Caused by:
        java.net.SocketTimeoutException: 60000 MILLISECONDS
            at org.apache.hc.core5.io.SocketTimeoutExceptionFactory.create(SocketTimeoutExceptionFactory.java:50)
            at org.apache.hc.core5.http.impl.nio.AbstractHttp1StreamDuplexer.onTimeout(AbstractHttp1StreamDuplexer.java:399)
            at org.apache.hc.core5.http.impl.nio.AbstractHttp1IOEventHandler.timeout(AbstractHttp1IOEventHandler.java:82)
            at org.apache.hc.core5.http.impl.nio.ClientHttp1IOEventHandler.timeout(ClientHttp1IOEventHandler.java:41)
            at org.apache.hc.core5.reactor.InternalDataChannel.onTimeout(InternalDataChannel.java:169)
            at org.apache.hc.core5.reactor.InternalChannel.checkTimeout(InternalChannel.java:67)
            at org.apache.hc.core5.reactor.SingleCoreIOReactor.checkTimeout(SingleCoreIOReactor.java:241)
            at org.apache.hc.core5.reactor.SingleCoreIOReactor.validateActiveChannels(SingleCoreIOReactor.java:168)
            at org.apache.hc.core5.reactor.SingleCoreIOReactor.doExecute(SingleCoreIOReactor.java:130)
            at org.apache.hc.core5.reactor.AbstractSingleCoreIOReactor.execute(AbstractSingleCoreIOReactor.java:86)
            at org.apache.hc.core5.reactor.IOReactorWorker.run(IOReactorWorker.java:44)
            at java.base/java.lang.Thread.run(Thread.java:1583)
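
To see where the timeout enters the plugin's own test code, the trace above can be scanned for frames in the org.opensearch.neuralsearch package. The following is a minimal Python sketch, assuming the trace is available as plain text. The helper name project_frames and the shortened sample trace are illustrative and not part of the project.

```python
import re

# Matches frames such as "at pkg.Class.method(File.java:123)".
FRAME = re.compile(r"^\s*at\s+([\w.$/]+)\.([\w$<>]+)\((\w+\.java):(\d+)\)")

def project_frames(trace, prefix="org.opensearch.neuralsearch"):
    """Return (method, line) for every frame whose class is under prefix,
    in the order the frames appear (innermost call first)."""
    frames = []
    for line in trace.splitlines():
        m = FRAME.match(line)
        if m and m.group(1).startswith(prefix):
            frames.append((m.group(2), int(m.group(4))))
    return frames

# Shortened copy of the trace from this issue (illustrative subset).
trace = """java.net.SocketTimeoutException: 60000 MILLISECONDS
    at org.opensearch.client.RestClient.extractAndWrapCause(RestClient.java:1019)
    at org.opensearch.client.RestClient.performRequest(RestClient.java:342)
    at org.opensearch.neuralsearch.BaseNeuralSearchIT.makeRequest(BaseNeuralSearchIT.java:626)
    at org.opensearch.neuralsearch.BaseNeuralSearchIT.uploadModel(BaseNeuralSearchIT.java:144)
    at org.opensearch.neuralsearch.processor.NormalizationProcessorIT.setUp(NormalizationProcessorIT.java:60)
"""

for method, line_no in project_frames(trace):
    print(method, line_no)
```

Read from the bottom up, the project frames show that the request which times out is issued by uploadModel during the test's setUp, before any test body runs.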

The failure is not flaky; it reproduces on every run.

The following tests fail consistently:

40 tests completed, 36 failed

Tests with failures:
 - org.opensearch.neuralsearch.processor.NeuralQueryEnricherProcessorIT.testNeuralQueryEnricherProcessor_whenNoModelIdPassed_thenSuccess
 - org.opensearch.neuralsearch.processor.NeuralQueryEnricherProcessorIT.testNeuralQueryEnricherProcessor_whenHybridQueryBuilderAndNoModelIdPassed_thenSuccess
 - org.opensearch.neuralsearch.processor.NormalizationProcessorIT.testResultProcessor_whenDefaultProcessorConfigAndQueryMatches_thenSuccessful
 - org.opensearch.neuralsearch.processor.NormalizationProcessorIT.testResultProcessor_whenMultipleShardsAndNoMatches_thenSuccessful
 - org.opensearch.neuralsearch.processor.NormalizationProcessorIT.testResultProcessor_whenOneShardAndQueryMatches_thenSuccessful
 - org.opensearch.neuralsearch.processor.NormalizationProcessorIT.testResultProcessor_whenMultipleShardsAndPartialMatches_thenSuccessful
 - org.opensearch.neuralsearch.processor.NormalizationProcessorIT.testResultProcessor_whenMultipleShardsAndQueryMatches_thenSuccessful
 - org.opensearch.neuralsearch.processor.ScoreCombinationIT.testHarmonicMeanCombination_whenOneShardAndQueryMatches_thenSuccessful
 - org.opensearch.neuralsearch.processor.ScoreCombinationIT.testGeometricMeanCombination_whenOneShardAndQueryMatches_thenSuccessful
 - org.opensearch.neuralsearch.processor.ScoreCombinationIT.testArithmeticWeightedMean_whenWeightsPassed_thenSuccessful
 - org.opensearch.neuralsearch.processor.ScoreNormalizationIT.testMinMaxNorm_whenOneShardAndQueryMatches_thenSuccessful
 - org.opensearch.neuralsearch.processor.ScoreNormalizationIT.testL2Norm_whenOneShardAndQueryMatches_thenSuccessful
 - org.opensearch.neuralsearch.processor.SparseEncodingProcessIT.testSparseEncodingProcessor
 - org.opensearch.neuralsearch.processor.TextEmbeddingProcessorIT.testTextEmbeddingProcessor
 - org.opensearch.neuralsearch.processor.TextImageEmbeddingProcessorIT.testEmbeddingProcessor_whenIngestingDocumentWithSourceWithoutMatchingInMapping_thenSuccessful
 - org.opensearch.neuralsearch.processor.TextImageEmbeddingProcessorIT.testEmbeddingProcessor_whenIngestingDocumentWithSourceMatchingTextMapping_thenSuccessful
 - org.opensearch.neuralsearch.query.HybridQueryIT.testIndexWithNestedFields_whenHybridQuery_thenSuccess
 - org.opensearch.neuralsearch.query.HybridQueryIT.testNoMatchResults_whenOnlyTermSubQueryWithoutMatch_thenEmptyResult
 - org.opensearch.neuralsearch.query.HybridQueryIT.testNestedQuery_whenHybridQueryIsWrappedIntoOtherQuery_thenFail
 - org.opensearch.neuralsearch.query.HybridQueryIT.testIndexWithNestedFields_whenHybridQueryIncludesNested_thenSuccess
 - org.opensearch.neuralsearch.query.HybridQueryIT.testComplexQuery_whenMultipleSubqueries_thenSuccessful
 - org.opensearch.neuralsearch.query.HybridQueryIT.testComplexQuery_whenMultipleIdenticalSubQueries_thenSuccessful
 - org.opensearch.neuralsearch.query.NeuralQueryIT.testBoostQuery
 - org.opensearch.neuralsearch.query.NeuralQueryIT.testRescoreQuery
 - org.opensearch.neuralsearch.query.NeuralQueryIT.testNestedQuery
 - org.opensearch.neuralsearch.query.NeuralQueryIT.testBooleanQuery_withMultipleNeuralQueries
 - org.opensearch.neuralsearch.query.NeuralQueryIT.testBasicQuery
 - org.opensearch.neuralsearch.query.NeuralQueryIT.testMultimodalQuery
 - org.opensearch.neuralsearch.query.NeuralQueryIT.testFilterQuery
 - org.opensearch.neuralsearch.query.NeuralQueryIT.testBooleanQuery_withNeuralAndBM25Queries
 - org.opensearch.neuralsearch.query.NeuralSparseQueryIT.testBooleanQuery_withSparseEncodingAndBM25Queries
 - org.opensearch.neuralsearch.query.NeuralSparseQueryIT.testRescoreQuery
 - org.opensearch.neuralsearch.query.NeuralSparseQueryIT.testBoostQuery
 - org.opensearch.neuralsearch.query.NeuralSparseQueryIT.testBooleanQuery_withMultipleSparseEncodingQueries
 - org.opensearch.neuralsearch.query.NeuralSparseQueryIT.testBasicQueryUsingQueryText_whenQueryWrongFieldType_thenFail
 - org.opensearch.neuralsearch.query.NeuralSparseQueryIT.testBasicQueryUsingQueryText
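
Grouping the failures by test class can make a list like the one above easier to scan. The following is a minimal Python sketch, assuming the "Tests with failures" lines are pasted in as text. The function name failures_by_class is hypothetical, and only three entries from the list are included for brevity.

```python
from collections import Counter

def failures_by_class(report):
    """Count failing tests per simple class name.

    Each non-empty line is expected to look like
    " - fully.qualified.ClassName.testMethod".
    """
    counts = Counter()
    for line in report.splitlines():
        name = line.strip().lstrip("- ").strip()
        if not name:
            continue
        qualified_class = name.rsplit(".", 1)[0]          # drop the method name
        simple_class = qualified_class.rsplit(".", 1)[-1]  # drop the package
        counts[simple_class] += 1
    return counts

# Illustrative subset of the failure list from this issue.
report = """
 - org.opensearch.neuralsearch.query.NeuralQueryIT.testBoostQuery
 - org.opensearch.neuralsearch.query.NeuralQueryIT.testRescoreQuery
 - org.opensearch.neuralsearch.processor.TextEmbeddingProcessorIT.testTextEmbeddingProcessor
"""

for cls, n in failures_by_class(report).most_common():
    print(cls, n)
```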

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Run ./gradlew integTest in the neural-search repository on Java 21
  2. See error in the logs

What is the expected behavior?
The tests should pass.

What is your host/environment?

  • OS: Windows, Linux
  • Version: Java 21
  • Plugins
@vibrantvarun added the bug and untriaged labels on Jan 22, 2024
@Gaurav-137

  1. Review the Test:
    Inspect the test case to understand its purpose and expected behavior.

  2. Check the Failure:
    Examine the specific failure message or stack trace to understand what went wrong.

  3. Review Code:
    a. Go to the code for 'NeuralQueryEnricherProcessorIT' and locate the test method.
    b. Check if there are recent code changes that might have affected this test.

  4. Debugging:
    If the failure is not immediately clear, consider adding debug logs or using a debugger to step through the test execution.

  5. Check Dependencies:
    Ensure that any dependencies required for the test (e.g., models, configurations) are properly set up.

  6. Update Test Data:
    If the test involves specific data, ensure that the test data is accurate and up-to-date.

  7. Check Configurations:
    Ensure that any configurations or parameters used by the test are correct.

  8. Isolation:
    Make sure the test is isolated and not affected by other test cases or external factors.

  9. Consult Documentation:
    Refer to the documentation for the code being tested to confirm the expected behavior.

  10. Collaborate:
    Discuss the issue with your team members, especially if it involves recent changes made by others.

  11. Make Adjustments:
    Based on your findings, make necessary adjustments to the test code or the code being tested.

  12. Run Locally:
    Run the modified test locally to verify that the issue has been addressed.

  13. Push Changes:
    If the local run is successful, push the changes to your version control system.

  14. Continuous Integration:
    Ensure that your CI/CD pipeline runs the updated tests automatically.

  15. Monitor:
    Keep an eye on future test runs and address any new issues that may arise.

Repeat this process for each failing test case, and consider reaching out to your team for assistance if needed. Each test case may have a different root cause, so a systematic approach is essential.

@vibrantvarun
Member Author

We are investigating on our end and will post an update in this thread once the investigation is complete.

@vibrantvarun
Member Author

vibrantvarun commented Jan 23, 2024

Alright, the issue is due to a bug in JDK 21.0.2. See https://github.com/opensearch-project/OpenSearch/pull/11968/files
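
One way to check whether a given environment runs that JDK build is to parse the banner printed by java -version. The following is a minimal Python sketch. The helper names and the sample banner are illustrative assumptions; only the version 21.0.2 comes from the comment above.

```python
import re

def jdk_version(banner):
    """Extract (feature, interim, update) from a `java -version` banner.

    Missing components default to 0, so a banner with version "21"
    yields (21, 0, 0). Returns None if no version string is found.
    """
    m = re.search(r'version "(\d+)(?:\.(\d+))?(?:\.(\d+))?', banner)
    if not m:
        return None
    return tuple(int(g) if g else 0 for g in m.groups())

def is_affected(banner, affected=(21, 0, 2)):
    """True if the banner reports exactly the version named in this issue."""
    return jdk_version(banner) == affected

# Illustrative banner line; real output varies by vendor and build.
sample = 'openjdk version "21.0.2" 2024-01-16'
print(jdk_version(sample), is_affected(sample))
```

Note that java -version writes its banner to standard error, so a script that captures it with subprocess.run(["java", "-version"], capture_output=True, text=True) should read the stderr field of the result.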

@vibrantvarun self-assigned this on Jan 23, 2024
@saratvemulapalli
Member

@vibrantvarun, curious: how did we figure out that a regression in the JDK caused it?
