Concurrent Searching: Support Early Termination #2586

Closed
reta opened this issue Mar 24, 2022 · 3 comments · Fixed by #8306
Labels: enhancement, Indexing & Search, v2.9.0

Comments

reta commented Mar 24, 2022

Is your feature request related to a problem? Please describe.
Early termination and time-bounded search are exception-driven and may return partial results (whatever was collected up to the point of termination). When the search goes over segments concurrently, it is difficult to replicate this sequential behaviour as-is: the flow is interrupted, the reducers never run, and no results are returned (a minimal sketch of the contrast appears at the end of this description).

Describe the solution you'd like
Find a way to propagate the early termination and time-bounded search conditions during concurrent segment traversal, without introducing additional synchronization.

Describe alternatives you've considered
Keep the current behaviour: early termination and time-bounded search simply do not return partial results right now.

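To make the contrast concrete, below is a minimal, self-contained sketch (illustrative plain Java, not OpenSearch code) of the sequential behaviour described above: the early-termination exception unwinds the collection loop, but everything collected before it remains available to the caller. In the concurrent case there is no single loop to unwind; the partial state lives in per-slice collectors that never get reduced.

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative sketch of exception-driven early termination in the
    // sequential case (hypothetical names, not the OpenSearch classes).
    public class SequentialEarlyTermination {
        static class EarlyTerminationException extends RuntimeException {}

        public static void main(String[] args) {
            List<Integer> collected = new ArrayList<>();
            try {
                for (int doc = 0; doc < 1_000; doc++) {
                    collected.add(doc);
                    if (collected.size() >= 3) {
                        // terminate_after-style limit reached
                        throw new EarlyTerminationException();
                    }
                }
            } catch (EarlyTerminationException e) {
                // sequential search: the partial results survive the exception
            }
            System.out.println("partial results: " + collected); // [0, 1, 2]
        }
    }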

jed326 commented Jun 20, 2023

Hey @reta, what's the status of this issue? I saw you have a draft PR #4906 where it looks like you took a crack at using the Lucene IndexSearcher timeout but ultimately set it aside. If you're not actively working on this, do you mind if I pick it up?

reta commented Jun 20, 2023

Hey @jed326, yeah sure, please go ahead. The issue with Lucene's IndexSearcher timeout was that the APIs are very closed: it was difficult to extract that logic from IndexSearcher, since all the relevant methods were either private or package-private (and I didn't want to copy/paste).

jed326 commented Jun 27, 2023

Sharing some more detailed thoughts on this below.


Problem Overview

Currently, if the query phase fails for whatever reason, no partial results will be returned for the failed shard in the concurrent search case. If an exception is thrown during searcher.search() in the query phase searcher,

    final ReduceableSearchResult result = searcher.search(query, collectorManager);

then reduce will not get called and no TopDocs will be collected.

Going a level deeper: in Lucene, while the search thread is blocked waiting on the index searcher threads, if any of those threads encounters an exception, we will not add the collectors for the rest of the slice segments. This means that even if we call reduce on all the collectors from the query phase, we will still not get the "full" partial results from the concurrent executions.
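To see why reduce never runs, here is a minimal, self-contained simulation (plain Java stand-ins, not the Lucene or OpenSearch classes) of the fork/join shape of IndexSearcher.search(query, collectorManager): per-slice hits are gathered via futures, and the reduce step happens only if every slice completes normally.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Simulates concurrent per-slice collection: one slice fails, so the
    // "reduce" step is skipped and the surviving slice's hits are dropped.
    public class ReduceSkippedOnFailure {
        public static void main(String[] args) throws InterruptedException {
            ExecutorService executor = Executors.newFixedThreadPool(2);
            List<Future<List<Integer>>> slices = new ArrayList<>();
            for (int slice = 0; slice < 2; slice++) {
                final int s = slice;
                slices.add(executor.submit(() -> {
                    List<Integer> hits = new ArrayList<>();
                    hits.add(s); // this slice did collect something...
                    if (s == 1) {
                        throw new RuntimeException("timeout in slice " + s);
                    }
                    return hits;
                }));
            }
            try {
                List<Integer> reduced = new ArrayList<>();
                for (Future<List<Integer>> slice : slices) {
                    reduced.addAll(slice.get()); // rethrows the slice failure
                }
                System.out.println("reduce: " + reduced); // never reached
            } catch (ExecutionException e) {
                // reduce is skipped; slice 0's hits are silently lost
                System.out.println("no partial results: " + e.getCause().getMessage());
            } finally {
                executor.shutdown();
            }
        }
    }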

Lucene Timeout Implementation

This was already explored as a part of #4906 and #4487, but I took another look for my own understanding. Looking at the latest Lucene code, I see a few issues with going down this path:

  • The biggest problem is that TimeLimitingBulkScorer.TimeExceededException is package private. Since we are overriding IndexSearcher.search with our own implementation (below), it is difficult for us to catch the correct exception in our implementation:

    @Override
    protected void search(List<LeafReaderContext> leaves, Weight weight, Collector collector) throws IOException {
        if (shouldReverseLeafReaderContexts()) {
            // reverse the segment search order if this flag is true.
            // Certain queries can benefit if we reverse the segment read order,
            // for example time series based queries if searched for desc sort order.
            for (int i = leaves.size() - 1; i >= 0; i--) {
                searchLeaf(leaves.get(i), weight, collector);
            }
        } else {
            for (int i = 0; i < leaves.size(); i++) {
                searchLeaf(leaves.get(i), weight, collector);
            }
        }
    }

    /**
     * Lower-level search API.
     *
     * {@link LeafCollector#collect(int)} is called for every matching document in
     * the provided <code>ctx</code>.
     */
    private void searchLeaf(LeafReaderContext ctx, Weight weight, Collector collector) throws IOException {
        // Check if at all we need to call this leaf for collecting results.
        if (canMatch(ctx) == false) {
            return;
        }
        cancellable.checkCancelled();
        weight = wrapWeight(weight);
        // See please https://github.com/apache/lucene/pull/964
        collector.setWeight(weight);
        final LeafCollector leafCollector;
        try {
            leafCollector = collector.getLeafCollector(ctx);
        } catch (CollectionTerminatedException e) {
            // there is no doc of interest in this reader context
            // continue with the following leaf
            return;
        }
        Bits liveDocs = ctx.reader().getLiveDocs();
        BitSet liveDocsBitSet = getSparseBitSetOrNull(liveDocs);
        if (liveDocsBitSet == null) {
            BulkScorer bulkScorer = weight.bulkScorer(ctx);
            if (bulkScorer != null) {
                try {
                    bulkScorer.score(leafCollector, liveDocs);
                } catch (CollectionTerminatedException e) {
                    // collection was terminated prematurely
                    // continue with the following leaf
                }
            }
        } else {
            // if the role query result set is sparse then we should use the SparseFixedBitSet for advancing:
            Scorer scorer = weight.scorer(ctx);
            if (scorer != null) {
                try {
                    intersectScorerAndBitSet(
                        scorer,
                        liveDocsBitSet,
                        leafCollector,
                        this.cancellable.isEnabled() ? cancellable::checkCancelled : () -> {}
                    );
                } catch (CollectionTerminatedException e) {
                    // collection was terminated prematurely
                    // continue with the following leaf
                }
            }
        }
    }
  • The timeout is currently enforced only by TimeLimitingBulkScorer, which means it is only checked during the collection phase. The OpenSearch timeout implementation, on the other hand, is a runnable that checks for the timeout (a simplified, self-contained sketch of consuming such a checker follows this list):

    /**
     * Create runnable which throws {@link TimeExceededException} when the runnable is called after timeout + runnable creation time
     * exceeds currentTime
     * @param searchContext to extract timeout from and to get relative time from
     * @return the created runnable
     */
    static Runnable createQueryTimeoutChecker(final SearchContext searchContext) {
        /* for startTime, relative non-cached precise time must be used to prevent false positive timeouts.
         * Using cached time for startTime will fail and produce false positive timeouts when maxTime = (startTime + timeout) falls in
         * next time cache slot(s) AND time caching lifespan > passed timeout */
        final long startTime = searchContext.getRelativeTimeInMillis(false);
        final long maxTime = startTime + searchContext.timeout().millis();
        return () -> {
            /* As long as startTime is non cached time, using cached time here might only produce false negative timeouts within the time
             * cache life span which is acceptable */
            final long time = searchContext.getRelativeTimeInMillis();
            if (time > maxTime) {
                throw new TimeExceededException();
            }
        };
    }
  • This also does not solve partial results for the early termination case, because a different exception is thrown there, and we need to handle both cases:

    try {
        searcher.search(query, queryCollector);
    } catch (EarlyTerminatingCollector.EarlyTerminationException e) {
        queryResult.terminatedEarly(true);
    } catch (TimeExceededException e) {
        assert timeoutSet : "TimeExceededException thrown even though timeout wasn't set";
        if (searchContext.request().allowPartialSearchResults() == false) {
            // Can't rethrow TimeExceededException because not serializable
            throw new QueryPhaseExecutionException(searchContext.shardTarget(), "Time exceeded");
        }
        queryResult.searchTimedOut(true);
    }
    if (searchContext.terminateAfter() != SearchContext.DEFAULT_TERMINATE_AFTER && queryResult.terminatedEarly() == null) {
        queryResult.terminatedEarly(false);
    }
    for (QueryCollectorContext ctx : collectors) {
        ctx.postProcess(queryResult);
    }
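As a side note on the checker above, here is a minimal, self-contained sketch (hypothetical names, not the OpenSearch implementation) of the same deadline-checker pattern and how such a runnable would be invoked once per leaf:

    import java.util.concurrent.TimeUnit;

    // Self-contained sketch of the deadline-checker pattern: capture a
    // deadline at creation time, throw once any later check passes it.
    public class TimeoutCheckerSketch {
        static class TimeExceededException extends RuntimeException {}

        static Runnable createTimeoutChecker(long timeoutMillis) {
            final long maxTime = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            return () -> {
                if (System.nanoTime() > maxTime) {
                    throw new TimeExceededException();
                }
            };
        }

        public static void main(String[] args) throws InterruptedException {
            Runnable checker = createTimeoutChecker(50);
            try {
                for (int leaf = 0; leaf < 10; leaf++) {
                    checker.run();    // one check per leaf, like searchLeaf
                    Thread.sleep(10); // simulate collecting a segment
                }
            } catch (TimeExceededException e) {
                // this is the point where partial results should be preserved
                System.out.println("timed out, keeping what was collected so far");
            }
        }
    }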

Exception Handling

For now I believe the best solution is to handle the exceptions in ContextIndexSearcher and keep track of whether the search has timed out. It looks like we are already doing something similar with CollectionTerminatedException:

    } catch (CollectionTerminatedException e) {
        // collection was terminated prematurely
        // continue with the following leaf
    }

The difficulty here is coming up with a way to propagate the timeout without using the exception. I'll list the places I've thought of below:

  • We could provide our own implementations of the concurrent-search search() methods in IndexSearcher; however, that would require copying a lot of code, since many of the methods they depend on are package private. On the other hand, it would be very straightforward to simply catch and rethrow the timeout exception outside of the blocking for loop.
  • In ContextIndexSearcher.searchLeaf we have access to searchContext, so we should be able to catch the exception there and set the timeout parameter (a rough sketch follows below). This seems pretty straightforward to me, so I'm planning to follow up with a PR for this implementation.
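For the second option, here is a rough, self-contained sketch (illustrative stand-ins, not the actual ContextIndexSearcher change that eventually landed in #8306) of catching the timeout per leaf, recording it on the context, and skipping the remaining leaves so that reduce still runs:

    import java.io.IOException;
    import java.util.List;

    // Hypothetical sketch of option 2: catch the timeout per leaf, record it
    // on the context, and stop collecting instead of failing the slice.
    public class SearchLeafTimeoutSketch {
        static class TimeExceededException extends RuntimeException {}
        static class SearchContextStub { volatile boolean timedOut = false; }

        static void searchLeaf(int leaf, SearchContextStub ctx) throws IOException {
            if (leaf == 1) throw new TimeExceededException(); // simulated timeout
            System.out.println("collected leaf " + leaf);
        }

        static void search(List<Integer> leaves, SearchContextStub ctx) throws IOException {
            for (int leaf : leaves) {
                if (ctx.timedOut) return;      // stop collecting, but don't fail
                try {
                    searchLeaf(leaf, ctx);
                } catch (TimeExceededException e) {
                    ctx.timedOut = true;       // remember it for the query result
                }
            }
        }

        public static void main(String[] args) throws IOException {
            SearchContextStub ctx = new SearchContextStub();
            search(List.of(0, 1, 2), ctx);
            // the slice completes normally, so reduce() can merge partial hits
            System.out.println("timed out: " + ctx.timedOut);
        }
    }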
