Concurrent Searching: Support Early Termination #2586
Comments
Hey @jed326, yeah sure, please go ahead. The issue with Lucene's IndexSearcher timeout was its very closed APIs: it was difficult to extract that logic from IndexSearcher since all the methods were either private or package-private (and I didn't want to copy/paste).
Sharing some more detailed thoughts on this below.
Problem Overview
Currently, if the query phase fails for whatever reason, no partial results will be returned for the failed shard in the concurrent search case. The relevant failure is an exception thrown during searcher.search() (OpenSearch/server/src/main/java/org/opensearch/search/query/ConcurrentQueryPhaseSearcher.java, line 81 in 068404e).
Going a level deeper, in Lucene, whenever the search thread is blocking on the index searcher threads and any of those threads encounters an exception, the collectors for the remaining slice segments are not added. This means that even if we call reduce on all the collectors from the query phase, we still will not get the "full" partial results from the concurrent executions.
Lucene Timeout Implementation
This was already explored as part of #4906 and #4487, but I took another look for my own understanding.
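As a rough model of the gathering behaviour described above (an exception in one slice thread stops the remaining slice results from being added), here is a minimal JDK-only sketch. It is not Lucene's code: the class `LostSliceResultsDemo`, the method `gatherUntilFirstFailure`, and the use of plain `Integer` results in place of collectors are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LostSliceResultsDemo {

    /**
     * Hypothetical model: runs one task per slice and gathers results in order,
     * stopping at the first slice whose task threw.
     */
    public static List<Integer> gatherUntilFirstFailure(List<Callable<Integer>> sliceTasks)
            throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, sliceTasks.size()));
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Callable<Integer> task : sliceTasks) {
                futures.add(executor.submit(task));
            }
            List<Integer> gathered = new ArrayList<>();
            try {
                for (Future<Integer> future : futures) {
                    // get() rethrows the slice's exception wrapped in ExecutionException.
                    gathered.add(future.get());
                }
            } catch (ExecutionException e) {
                // The loop is abandoned here, so later slices are never gathered,
                // even if they completed successfully.
            }
            return gathered;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Integer>> slices = List.of(
            () -> 10,
            () -> { throw new RuntimeException("simulated timeout"); },
            () -> 30
        );
        System.out.println(gatherUntilFirstFailure(slices)); // prints [10]
    }
}
```

The third slice produces 30, but that value never reaches the caller, which is why calling reduce afterwards cannot recover the full partial result.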
Exception Handling
For now, I believe the best solution is to handle the exceptions in ContextIndexSearcher and keep track of whether the search has timed out. It looks like we are already doing something similar with CollectionTerminatedException (OpenSearch/server/src/main/java/org/opensearch/search/internal/ContextIndexSearcher.java, lines 332 to 335 in 8eea7b9).
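To make the idea above concrete, here is a minimal JDK-only sketch of catching the timeout inside each slice's search and recording it in a flag, so that every slice still hands back its collector. Everything in it is hypothetical: `TimeoutTrackingSearcher`, `SearchTimeoutException`, `CountingCollector`, and the convention that a negative document ID simulates the deadline being hit are assumptions for illustration, not the actual ContextIndexSearcher code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

public class TimeoutTrackingSearcher {

    /** Hypothetical stand-in for the exception a time-bounded collector throws. */
    public static class SearchTimeoutException extends RuntimeException {}

    /** Hypothetical per-slice collector that only counts collected documents. */
    public static class CountingCollector {
        public int count;
    }

    private final AtomicBoolean timedOut = new AtomicBoolean(false);

    public boolean timedOut() {
        return timedOut.get();
    }

    /** Searches each slice on its own thread and returns one collector per slice. */
    public List<CountingCollector> search(List<List<Integer>> slices)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, slices.size()));
        try {
            List<Future<CountingCollector>> futures = new ArrayList<>();
            for (List<Integer> slice : slices) {
                futures.add(executor.submit(() -> searchSlice(slice)));
            }
            List<CountingCollector> collectors = new ArrayList<>();
            for (Future<CountingCollector> future : futures) {
                collectors.add(future.get());
            }
            return collectors;
        } finally {
            executor.shutdown();
        }
    }

    private CountingCollector searchSlice(List<Integer> slice) {
        CountingCollector collector = new CountingCollector();
        try {
            for (int doc : slice) {
                if (doc < 0) { // assumption: a negative ID simulates the deadline check failing
                    throw new SearchTimeoutException();
                }
                collector.count++;
            }
        } catch (SearchTimeoutException e) {
            // Swallow the exception, remember the timeout, keep the partial collector.
            timedOut.set(true);
        }
        return collector;
    }

    public static void main(String[] args) throws Exception {
        TimeoutTrackingSearcher searcher = new TimeoutTrackingSearcher();
        List<CountingCollector> collectors = searcher.search(List.of(
            List.of(1, 2, 3),
            List.of(4, -1, 5), // "times out" after one document
            List.of(6, 7)
        ));
        int total = 0;
        for (CountingCollector c : collectors) {
            total += c.count;
        }
        System.out.println("hits=" + total + " timedOut=" + searcher.timedOut());
    }
}
```

Because the exception never leaves the slice task, all three collectors reach the caller and can be reduced. The shared `AtomicBoolean` is one possible way to track the timeout; it is a design choice of this sketch rather than something the discussion prescribes.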
The difficulty here is coming up with a way to propagate the timeout without using the exception. I’ll list out the places I’ve thought of below:
Is your feature request related to a problem? Please describe.
Early termination and time-bounded search are exception-driven and may return partial results (whatever was collected up to the point of termination). When the search goes over segments concurrently, it is difficult to replicate the sequential behaviour as-is: the flow is interrupted, the reducers are not available, and no results are returned.
Describe the solution you'd like
Find a way to propagate early termination and time-bounded search conditions in the case of concurrent segment traversal without introducing additional synchronization.
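One possible shape for this, sketched under stated assumptions: each slice records how it ended in its own result object, and the flags are combined only at reduce time, so slice threads share no mutable state. The names `SliceResultReduce` and `SliceResult` and its fields are hypothetical and do not correspond to existing OpenSearch or Lucene classes.

```java
import java.util.List;

public class SliceResultReduce {

    /** Hypothetical per-slice result: partial hit count plus how the slice ended. */
    public static final class SliceResult {
        public final int hits;
        public final boolean terminatedEarly;
        public final boolean timedOut;

        public SliceResult(int hits, boolean terminatedEarly, boolean timedOut) {
            this.hits = hits;
            this.terminatedEarly = terminatedEarly;
            this.timedOut = timedOut;
        }
    }

    /** Merges slice results; a flag set by any slice marks the merged result. */
    public static SliceResult reduce(List<SliceResult> sliceResults) {
        int hits = 0;
        boolean terminatedEarly = false;
        boolean timedOut = false;
        for (SliceResult result : sliceResults) {
            hits += result.hits;
            terminatedEarly |= result.terminatedEarly;
            timedOut |= result.timedOut;
        }
        return new SliceResult(hits, terminatedEarly, timedOut);
    }

    public static void main(String[] args) {
        SliceResult merged = reduce(List.of(
            new SliceResult(3, false, false),
            new SliceResult(1, false, true), // this slice hit the time limit
            new SliceResult(2, false, false)
        ));
        System.out.println("hits=" + merged.hits
            + " terminatedEarly=" + merged.terminatedEarly
            + " timedOut=" + merged.timedOut);
    }
}
```

This assumes each slice writes only its own result and that the results are read after the slice tasks have completed. If they are retrieved through `Future.get()`, the JDK's memory-consistency guarantees already make the slice's writes visible to the reducing thread, so no extra locking is needed for the flags.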
Describe alternatives you've considered
Currently, early termination and time-bounded search do not return partial results in the concurrent case.