-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Unable to merge results from shards #875
Comments
@jed326 - Might be related to concurrent segment search. Tagging it to Concurrent Search |
Thanks for reporting this @eemmiirr. I see the exception you're seeing is coming from here in the neural-search/src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryScoreDocsMerger.java Lines 39 to 43 in e1c3878
Could you please share some sample queries (and index mappings if possible) where you are encountering this issue? Depending on how documents are distributed across segments the same query may return different results due to different scoring and this is especially likely in cases where there are only a few documents per segment. We'll be able to better determine if this is truly a bug or if it's an issue with sparse data in the test instead with these details. |
@opensearch-project/admin could you help transfer this to FYI @sohami @Gankris96 |
@jed326 I'll try to write tomorrow an example with tests to showcase the error |
While we're waiting on @opensearch-project/admin to transfer the issue, In the meantime @navneet1v @martin-gaievski could you help take a look at this as well? |
Adding @vibrantvarun |
@jed326 I created a showcase project which triggers the bug. You can check it out here: https://github.com/eemmiirr/opensearch-concurrent-segment-search-bug. All details can be found in the README |
Thanks @eemmiirr , looking at https://github.com/eemmiirr/opensearch-concurrent-segment-search-bug/blob/main/src/test/kotlin/com/github/eemmiirr/osshowcase/OpenSearchRepositoryTest.kt#L15-L36 it looks like the test case is only using 3 documents, which is the same value as the setting neural-search/src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryScoreDocsMerger.java Line 25 in e1c3878
to hit the exception you're seeing. It seems like if you ever have more than 1 segment in your index, then you will hit that condition. You can quickly verify this by calling I'll let @vibrantvarun or one of the other neural plugin folks confirm if this is the expected behavior. |
@jed326 I'm not sure I follow. Why would the number of documents matter? The test succeeds multiple times under the same conditions, but then it suddenly fails without any changes in those conditions. Regardless, I added a test with a configurable number of documents, and as expected, it also fails. |
@eemmiirr I have tried the Kotlin based project you've shared to reproduce the issue (https://github.com/eemmiirr/opensearch-concurrent-segment-search-bug), and I got results that are bit different from yours. I do not see the mentioned error "cannot merge top docs because it does not have enough elements.", however I do see that both test cases, specifically one with 3 docs are failing with following:
I wonder how do you see that mentioned error and something is missing from steps or setup? I also tried with a standalone opensearch 2.16 cluster, and I'm not seeing the issue. That seems expected as you've mentioned that issue is transient and you've seen it only with certain combination of steps. I also agree with your point that number of documents in index shouldn't matter. That 3 number is specific to internal format we use for hybrid query results, we need 1 as header element, 1 for first query delimiter and 1 as footer element. |
Hi @martin-gaievski . Thanks for having a look at it. I improved the tests so they collect the errors and throw an exception. Both tests are failing now with |
tried with your latest code @eemmiirr, it's failing on assertion for both test cases:
and
it looks like not all docs are retrieved, I think that was the behavior in 2.15, but it should be fixed in 2.16. |
@eemmiirr I finally replicated the problem, will be working on the fix now |
Describe the bug
When performing a hybrid search in OpenSearch 2.16, combining a lexical and a kNN query, you may encounter the error: "cannot merge top docs because it does not have enough elements." Downgrading to version 2.15 returns results but may omit results from one shard. Executing a forcemerge before running the search or disabling concurrent segment search resolves the issue. This problem is challenging to reproduce, as it only occurs when tests are run as part of a suite. Running a single test in isolation on an empty index consistently succeeds. The issue seems to be related to segment search.
Related component
Search
To Reproduce
Expected behavior
Returns results from all shards and segments
Additional Details
Plugins
Please list all plugins currently enabled.
Screenshots
If applicable, add screenshots to help explain your problem.
Host/Environment (please complete the following information):
Additional context
Add any other context about the problem here.
The text was updated successfully, but these errors were encountered: