
SOLR-13350: Multithreaded search #2248

Closed
wants to merge 66 commits into from

Conversation

chatman
Contributor

@chatman chatman commented Feb 6, 2024

Abandoning previous PRs to create this new one.

https://issues.apache.org/jira/browse/SOLR-13350

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Feb 6, 2024
new SolrNamedThreadFactory("searcherCollector"));
this.collectorExecutor =
ExecutorUtil.newMDCAwareCachedThreadPool(
6, new SolrNamedThreadFactory("searcherCollector"));
Contributor

1/2 I added dbd2d8a commit to tentatively make this configurable. Hope you don't mind.

Contributor Author

Thanks @cpoerschke for your review! This is something I'm still wondering about. I'll post a comment on JIRA about the number of threads.

Comment on lines 445 to 446
cfg.getIndexSearcherExecutorThreads(), // thread count
cfg.getIndexSearcherExecutorThreads(), // queue size

Contributor

@dsmiley dsmiley left a comment

Seems a draft with all the logging and some commented code but anyway it's nice to see this!

@@ -46,6 +46,7 @@ public void setDocSet(DocSet set) {
docListAndSet = new DocListAndSet();
}
docListAndSet.docSet = set;
// log.error("set docset {}", docListAndSet.docSet.getBits().length());
Contributor

should remove

}

/** Wraps multiple collectors for processing */
public class Collectors implements Collector {
Contributor

not sure yet if these inner classes need to be public.

Contributor

not sure yet if these inner classes need to be public.

3e66068 to reduce visibility -- implements CollectorManager<SolrMultiCollectorManager.Collectors, Object[]> says SolrMultiCollectorManager.Collectors has private access in SolrMultiCollectorManager if it's private but package visibility is okay and LeafCollectors can be private.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** */
Contributor

Suggested change
/** */
/** A Lucene Collector collecting docs in a thread-safe manner. */

}

public DocSet getDocSet() {
log.error("Max Set Bit {}", bits.maxSetBit());
Contributor

error?


doc += base;
if (log.isErrorEnabled()) {
log.error("collect doc: {}, base: {}", doc, base, this);
Contributor

error?

}
ScoreMode scoreModeUsed =
buildAndRunCollectorChain(qr, query, collector, cmd, pf.postFilter).scoreMode();
MTCollectorQueryCheck allowMT = new MTCollectorQueryCheck();
Contributor

Bunch of code being added in two places looking similar. Have you thought of attempting to factor out a common approach?

Contributor

c69006a and 1ec21d4 factored something out. Not thought much about naming in doing so and not specifically in response to this comment but just from my code reading/learning/reviewing perspective.

}

populateNextCursorMarkFromTopDocs(qr, cmd, topDocs);
// if (cmd.getSort() != null && !(cmd.getQuery() instanceof RankQuery) &&
Contributor

Forgot some logic or...?

Contributor

cb3c1cb uncomments some of it.

import org.apache.lucene.search.ScoreMode;

/**
* A {@link CollectorManager} implements which wrap a set of {@link CollectorManager} as {@link
Contributor

Suggested change
* A {@link CollectorManager} implements which wrap a set of {@link CollectorManager} as {@link
* A {@link CollectorManager} implementation which wrap a set of {@link CollectorManager} as {@link

* MultiCollector} acts for {@link Collector}.
*/
public class SolrMultiCollectorManager
implements CollectorManager<SolrMultiCollectorManager.Collectors, Object[]> {
Contributor

Hmm; Object? Do we not know the type?

Contributor

Lucene has a public class MultiCollectorManager implements CollectorManager<Collector, Object[]> i.e. https://github.com/apache/lucene/blob/releases/lucene/9.9.2/lucene/core/src/java/org/apache/lucene/search/MultiCollectorManager.java and I'm curious if there'd be a way to code share with that, though so far I can't see a way there.

Contributor

Looks like you removed a symlink?

Contributor

Looks like you removed a symlink?

Restored in 08d415f commit.

@dsmiley
Contributor

dsmiley commented Feb 14, 2024

Observation: I don't think we need the concurrent DocSet builder (the one from Netflix). Couldn't we build segment level FixedBitSets (no safety issue) and then at the end combine into a master FixedBitSet so that we have our DocSet? It could even be made to be long-aligned, and thus the final aggregation is just copying longs with System.arraycopy (no doc iteration! or long shifting). The boundary long would be XOR.
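A minimal sketch of the aggregation idea described above, under stated assumptions: plain `long[]` word arrays stand in for Lucene's `FixedBitSet` (whose backing array `getBits()` exposes), and all class and method names here are illustrative, not the PR's actual code. Each segment collects into its own array with no thread-safety concerns, and the arrays are then merged into one global array; when a segment's doc base is long-aligned the merge is a straight `System.arraycopy`, otherwise each word spans two global words via shifts.

```java
// Hypothetical sketch: per-segment bit sets merged into one global bit set.
public class SegmentBitSetMerge {
    // Merge a segment's words into the global set at docBase. When docBase
    // is a multiple of 64 this is a plain copy (segments cover disjoint doc
    // ranges, so no OR is needed); otherwise each segment word is split
    // across two adjacent global words.
    static void merge(long[] global, long[] segment, int docBase) {
        int wordOffset = docBase >>> 6; // docBase / 64
        int bitShift = docBase & 63;    // docBase % 64
        if (bitShift == 0) {
            System.arraycopy(segment, 0, global, wordOffset, segment.length);
        } else {
            for (int i = 0; i < segment.length; i++) {
                global[wordOffset + i] |= segment[i] << bitShift;
                if (wordOffset + i + 1 < global.length) {
                    global[wordOffset + i + 1] |= segment[i] >>> (64 - bitShift);
                }
            }
        }
    }

    public static void main(String[] args) {
        long[] global = new long[4];       // room for 256 docs
        long[] seg = new long[] {0b1011L}; // segment-local docs 0, 1 and 3 set
        merge(global, seg, 64);            // aligned segment base: arraycopy path
        System.out.println(Long.toBinaryString(global[1])); // 1011
        merge(global, new long[] {0b1L}, 190); // unaligned base: shift path
        System.out.println((global[2] >>> 62) & 1L); // 1 (global doc 190 is set)
    }
}
```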

}
log.error("new docset collector for {} max={}", numDocs, maxDoc());

return new ThreadSafeBitSetCollector(bits, maxDoc);
Contributor

Not sure yet if a firstCollectors[2] = ... was intended here or if not we can reduce to new Collector[2] above it seems.

Comment on lines 2075 to 2083
for (Collector collector : firstCollectors) {
if (collector != null) {
if (scoreMode == null) {
scoreMode = collector.scoreMode();
} else if (scoreMode != collector.scoreMode()) {
scoreMode = ScoreMode.COMPLETE;
}
}
}
Contributor

I think this needs to happen after the super.search call since at this point here the collectors will still all be null? 2501f7b to tentatively move it.

For new clusters, the solr.xml contains `indexSearcherExecutorThreads` as 4, and hence 4 threads will be used for search.
However, for older setups that upgrade to this version, the `indexSearcherExecutorThreads` defaults to 1 when unspecified in the solr.xml.
@chatman
Contributor Author

chatman commented May 6, 2024

I think that's an anti-pattern or broken and isn't what I meant in JIRA. We could use a SynchronousQueue (with fairness) if we want to block for a thread -- probably what we should do. FYI that queue is the default for Executors.newCachedThreadPool(). The "caller runs" behavior I meant could be done via an ExecutorService delegate that catches RejectedExecutionException and simply runs the Runnable.

@dsmiley instead of using a rejected tasks execution handler, I went with @noblepaul 's suggestion of having a reasonably large queue for the threadpool (number of threads * 1000). Beyond this, if tasks are submitted, it is okay to reject them. We can revisit these limits later as well.
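For reference, the "caller runs" delegate described in this exchange can be sketched as below. This is an illustrative standalone sketch, not Solr's ExecutorUtil API: the delegate catches RejectedExecutionException and runs the task on the submitting thread, and the pool hands off through a fair SynchronousQueue (the queue type backing Executors.newCachedThreadPool()).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the "caller runs" fallback discussed above.
public class CallerRunsDelegate {
    private final ExecutorService delegate;

    CallerRunsDelegate(ExecutorService delegate) {
        this.delegate = delegate;
    }

    void execute(Runnable task) {
        try {
            delegate.execute(task);
        } catch (RejectedExecutionException e) {
            task.run(); // degrade gracefully: run in the caller's thread
        }
    }

    // Submit n quick tasks through a 2-thread pool with no buffering;
    // bursts overflow onto the caller thread instead of being dropped.
    static int runBurst(int n) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 2, 0L, TimeUnit.SECONDS, new SynchronousQueue<>(true));
        CallerRunsDelegate exec = new CallerRunsDelegate(pool);
        AtomicInteger ran = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            exec.execute(ran::incrementAndGet);
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return ran.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runBurst(10)); // 10: every task ran somewhere
    }
}
```

The merged PR instead went with the large bounded queue described above; this sketch only illustrates the rejected-handler alternative that was discussed.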

@chatman
Contributor Author

chatman commented May 6, 2024

Thanks @cpoerschke and @dsmiley. I've merged to main, where this can bake for some days before merging to branch_9x. If there are any major outstanding issues, or any changes needed to documentation or default behaviour, we can take it up in another PR.

@dsmiley
Contributor

dsmiley commented May 6, 2024

Merging to main was unexpected to me because the healthy code review happening here didn't conclude. Next time, please announce an intention to do so a couple of days in advance. I would have given this another look! For example, I thought the DocSet merging aspect was still tentative... I was awaiting another comment from Christine. And I was looking forward to checking out the thread pool aspect again.

@dsmiley
Contributor

dsmiley commented May 6, 2024

This pull request is closed, but the jira/solr-13350 branch has unmerged commits.

Not sure what didn't make it

@chatman
Contributor Author

chatman commented May 7, 2024

Merging to main was unexpected to me because the healthy code review happening here didn't conclude.

I didn't want this to languish for a long time, and there were large PRs in the waiting that could affect this PR. Example: #2382

I would have given this another look!

...

And I was looking forward to checking out the thread pool aspect again.

Please feel free to look at it, would be happy to address all loose ends, if any.

@cpoerschke
Contributor

Merging to main was unexpected to me because the healthy code review happening here didn't conclude. ...

I was surprised by this merge too and my initial thought was that it might have been an accident and well that happens sometimes and we can just revert the commit and resume on a new PR then.

@gus-asf
Contributor

gus-asf commented May 16, 2024

The discussion on this is long, so maybe I've missed it, but the actual merged code has introduced the possibility (though I suspect it might never happen) of a non-numeric Max Score...

    public float getMaxScore(int totalHits) {
      if (totalHits > 0) {
        for (Object res : result) {
          if (res instanceof MaxScoreResult) {
            return ((MaxScoreResult) res).maxScore;
          }
        }
        return Float.NaN;    <<<<<<<<<<<<<<<<<<<<<<
      } else {
        return 0.0f;
      }
    }

Did I miss discussion of this?
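The path flagged above can be reproduced in isolation. In this sketch, MaxScoreResult is stubbed as a record purely for illustration, and the method mirrors the quoted control flow: when totalHits > 0 but no MaxScoreResult is present in the results, Float.NaN comes back, and NaN then compares false against everything downstream.

```java
// Minimal reproduction of the non-numeric max score concern above.
public class MaxScoreNaNDemo {
    record MaxScoreResult(float maxScore) {} // stub for illustration

    static float getMaxScore(Object[] result, int totalHits) {
        if (totalHits > 0) {
            for (Object res : result) {
                if (res instanceof MaxScoreResult msr) {
                    return msr.maxScore();
                }
            }
            return Float.NaN; // the flagged line: no MaxScoreResult found
        }
        return 0.0f;
    }

    public static void main(String[] args) {
        float max = getMaxScore(new Object[] {"someOtherResult"}, 5);
        System.out.println(Float.isNaN(max));      // true
        System.out.println(max > 0f || max <= 0f); // false: NaN fails both comparisons
    }
}
```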

@@ -335,7 +335,7 @@ public SolrIndexSearcher(
boolean reserveDirectory,
DirectoryFactory directoryFactory)
throws IOException {
super(wrapReader(core, r));
super(wrapReader(core, r), core.getCoreContainer().getCollectorExecutor());
Contributor

This change here is not controlled by the request level multiThreaded=true|false flag. #2570 explores how node-level opt-out could be supported.

}
if (needDocSet) {
int maxDoc = searcher.getRawReader().maxDoc();
log.error("raw read max={}", searcher.getRawReader().maxDoc());
Contributor

Am unclear on why we log an error here. Also since the overall bit set is now computed as union of per-segment bit sets, perhaps there is no need to 'know' the max doc and it could be computed from the per-segment bit sets?

Contributor

logging: surely it was temporary during dev to ensure a log message could be seen easily. It's easier than logging at trace and reconfiguring logging. This must be fixed.

Contributor

oops, looks like this wasn't fixed yet: https://issues.apache.org/jira/browse/SOLR-17454

Contributor

#2720 to remedy.

Comment on lines -188 to +190
"50"));
"50",
"multiThreaded",
"false"));
Contributor

#2590 proposes to explicitly not (yet) support multi-threading when query limits are used, and with that then the test change here could be reverted.

Comment on lines -79 to +89
params("q", "id:*", "sort", "id asc", "facet", "true", "facet.field", "val_i"));
params(
"q",
"id:*",
"sort",
"id asc",
"facet",
"true",
"facet.field",
"val_i",
"multiThreaded",
"false"));
Contributor

#2590 proposes to explicitly not (yet) support multi-threading when query limits are used, and with that then the test change here could be reverted.

Labels: cat:search, configs, documentation, start-scripts, tests

5 participants