Increase index_searcher thread count to 2x processor #12196

Merged
1 commit merged on Feb 7, 2024

Conversation

jed326
Collaborator

jed326 commented Feb 6, 2024

Description

With the Lucene 9.9 changes, the search threadpool now offloads all of its tasks to the index_searcher threadpool. Because the index_searcher threadpool currently has allocatedProcessors threads, this significantly reduces the throughput of concurrent segment search.

Previously, the search threadpool had 1.5x processors threads and the index_searcher threadpool had 1x processors threads, giving 2.5x processors threads available for search. With the Lucene 9.9 change, only the 1x processors threads of the index_searcher threadpool are available. This is a significant throughput decrease even for the single-slice case, which previously ran on the search threadpool with its 1.5x processors threads.

This PR bumps the index_searcher thread count to 2x to move the search throughput back towards the original level.
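
The thread arithmetic in the description can be sketched as below. This is only an illustrative calculation, not OpenSearch's thread pool code: the class and method names are hypothetical, and the 1.5x and 1x multipliers are taken from the description above (the real pools may round differently or add a constant).

```java
// Illustrative only: hypothetical names, multipliers taken from the PR description.
final class SearchThreadBudget {

    // Threads in the search pool, described above as 1.5x processors.
    static int searchThreads(int processors) {
        return (processors * 3) / 2;
    }

    // Threads in the index_searcher pool for a given multiplier
    // (1 before this PR, 2 after it).
    static int indexSearcherThreads(int processors, int multiplier) {
        return processors * multiplier;
    }

    // Before Lucene 9.9: both pools executed search work (1.5x + 1x = 2.5x).
    static int threadsBeforeLucene99(int processors) {
        return searchThreads(processors) + indexSearcherThreads(processors, 1);
    }

    // After Lucene 9.9: only the index_searcher pool executes search work.
    static int threadsAfterLucene99(int processors, int multiplier) {
        return indexSearcherThreads(processors, multiplier);
    }

    public static void main(String[] args) {
        int processors = 8; // assumed node size for the example
        System.out.println("before Lucene 9.9:       " + threadsBeforeLucene99(processors));
        System.out.println("after, 1x index_searcher: " + threadsAfterLucene99(processors, 1));
        System.out.println("after, 2x index_searcher: " + threadsAfterLucene99(processors, 2));
    }
}
```

Under these assumptions, an 8-processor node goes from 20 search-capable threads to 8 after the Lucene 9.9 change, and back up to 16 with the 2x multiplier.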

Related Issues

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

jed326 marked this pull request as ready for review February 6, 2024 20:38
jed326 requested a review from peternied as a code owner February 6, 2024 20:38
@jed326
Collaborator Author

jed326 commented Feb 6, 2024

@reta @sohami @peternied

Apologies for the last-minute change, but I would really like to get this into 2.12 if possible.

peternied requested review from reta and sohami February 6, 2024 20:43
jed326 added the backport 2.x, v2.12.0, and v3.0.0 labels Feb 6, 2024
@sohami
Collaborator

sohami commented Feb 6, 2024

Thanks, Jay, for making this change. Given the change in Lucene 9.9, it makes sense to me to start with 2x the processor count: as you explained, it falls between the 1.5x and 2.5x thread counts. With 2x, at least 2 threads will be multiplexed on each vCPU, which helps when a search thread is waiting on disk I/O.
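
The multiplexing point can be illustrated with a small standalone sketch. It is not OpenSearch code and the class name is hypothetical. A barrier stands in for tasks that are blocked (for example on disk I/O), and the method checks whether a fixed pool of a given size can keep all of the submitted tasks in flight at the same moment. With a pool sized at 1x an assumed processor count, 2 tasks per processor cannot all be in flight together; with 2x they can.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of OpenSearch.
final class OversubscriptionDemo {

    // Returns true only if `tasks` tasks were all running at the same moment
    // on a fixed pool of `poolSize` threads.
    static boolean allInFlightTogether(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CyclicBarrier barrier = new CyclicBarrier(tasks);
        AtomicInteger arrivedTogether = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    // Stand-in for a blocked search task: wait until every task is running.
                    barrier.await(1, TimeUnit.SECONDS);
                    arrivedTogether.incrementAndGet();
                } catch (BrokenBarrierException | TimeoutException e) {
                    // Some tasks were still queued, so not all were in flight together.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return arrivedTogether.get() == tasks;
    }

    public static void main(String[] args) {
        int processors = 2; // assumed vCPU count for the demo
        int tasks = 2 * processors; // two tasks per vCPU
        System.out.println("1x pool: " + allInFlightTogether(processors, tasks));
        System.out.println("2x pool: " + allInFlightTogether(2 * processors, tasks));
    }
}
```

The sketch only shows how many tasks can be in flight at once. It says nothing about actual search throughput, which depends on how much of each task is spent waiting rather than computing.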

Contributor

github-actions bot commented Feb 6, 2024

Compatibility status:

Checks if related components are compatible with change 4e986b1

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/performance-analyzer.git]

Contributor

github-actions bot commented Feb 6, 2024

❌ Gradle check result for 74a5af0: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@jed326
Collaborator Author

jed326 commented Feb 6, 2024

> ❌ Gradle check result for 74a5af0: FAILURE

There are also a lot of failing BWC tests. They shouldn't be related to this change; more likely they failed due to the Jenkins issues.

Actually, I'm not able to open the https://build.ci.opensearch.org/job/gradle-check/33396/ link anymore.

Contributor

github-actions bot commented Feb 6, 2024

❌ Gradle check result for 74a5af0:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@jed326
Collaborator Author

jed326 commented Feb 6, 2024

I think the "todo" here needs to be done: f67ed17#diff-afe8d7298c117ffe707f8755d7e981f9f1103dbd0b040b40642efb555428aeec

Contributor

github-actions bot commented Feb 6, 2024

❌ Gradle check result for 74a5af0: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@jed326
Collaborator Author

jed326 commented Feb 6, 2024

BWC failures are due to this: #12111 (comment)
It's also happening in other PRs: #12194 (comment)

@kotwanikunal
Member

> BWC failures are due to this: #12111 (comment)
> It's also happening in other PRs: #12194 (comment)

Raised a PR here: #12201

@jed326
Collaborator Author

jed326 commented Feb 6, 2024

> > BWC failures are due to this: #12111 (comment)
> > It's also happening in other PRs: #12194 (comment)
>
> Raised a PR here: #12201

@kotwanikunal I just pushed a commit doing the same to this PR too. I'm fine with undoing my changes in favor of your PR or vice versa, no strong preference here.

@kotwanikunal
Member

kotwanikunal commented Feb 6, 2024

> > > BWC failures are due to this: #12111 (comment)
> > > It's also happening in other PRs: #12194 (comment)
> >
> > Raised a PR here: #12201
>
> @kotwanikunal I just pushed a commit doing the same to this PR too. I'm fine with undoing my changes in favor of your PR or vice versa, no strong preference here.

Thanks @jed326!
A bunch of PRs are blocked on that change. Let's pull it in independently.

jed326 force-pushed the index-searcher-tuning branch from ba3a847 to 57fe433 February 6, 2024 23:02
kotwanikunal added the backport 2.12 label Feb 6, 2024
@kotwanikunal
Member

Added backport 2.12 since the branch is already cut.

Contributor

github-actions bot commented Feb 6, 2024

❕ Gradle check result for ba3a847: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing
      1 org.opensearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testIndexCreateBlockIsRemovedWhenAnyNodesNotExceedHighWatermarkWithAutoReleaseEnabled

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.


codecov bot commented Feb 6, 2024

Codecov Report

Attention: 138 lines in your changes are missing coverage. Please review.

Comparison is base (3cbf54e) 71.40% compared to head (4e986b1) 71.34%.
Report is 5 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| ...rch/search/fetch/subphase/MatchedQueriesPhase.java | 0.00% | 37 Missing ⚠️ |
| ...src/main/java/org/opensearch/search/SearchHit.java | 58.73% | 17 Missing and 9 partials ⚠️ |
| ...ansport/top_queries/TransportTopQueriesAction.java | 24.00% | 18 Missing and 1 partial ⚠️ |
| .../insights/core/listener/QueryInsightsListener.java | 73.58% | 10 Missing and 4 partials ⚠️ |
| ...opensearch/search/builder/SearchSourceBuilder.java | 42.85% | 4 Missing and 4 partials ⚠️ |
| .../resthandler/top_queries/RestTopQueriesAction.java | 61.11% | 7 Missing ⚠️ |
| .../insights/rules/action/top_queries/TopQueries.java | 66.66% | 5 Missing ⚠️ |
| ...search/ingest/common/RemoveByPatternProcessor.java | 95.77% | 1 Missing and 2 partials ⚠️ |
| ...s/rules/action/top_queries/TopQueriesResponse.java | 93.18% | 3 Missing ⚠️ |
| ...g/opensearch/search/internal/SubSearchContext.java | 0.00% | 3 Missing ⚠️ |
| ... and 8 more | | |
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #12196      +/-   ##
============================================
- Coverage     71.40%   71.34%   -0.07%     
- Complexity    59636    59675      +39     
============================================
  Files          4944     4952       +8     
  Lines        280322   280638     +316     
  Branches      40728    40773      +45     
============================================
+ Hits         200175   200223      +48     
- Misses        63501    63863     +362     
+ Partials      16646    16552      -94     


Contributor

github-actions bot commented Feb 6, 2024

❌ Gradle check result for 57fe433: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

jed326 force-pushed the index-searcher-tuning branch from 57fe433 to 4e986b1 February 6, 2024 23:57
Contributor

github-actions bot commented Feb 7, 2024

✅ Gradle check result for 4e986b1: SUCCESS

sohami merged commit 237cee3 into opensearch-project:main Feb 7, 2024
29 of 30 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-12196-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 237cee38a03aa5cc45ccea9ff1017a2744bcfbe4
# Push it to GitHub
git push --set-upstream origin backport/backport-12196-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-12196-to-2.x.

@opensearch-trigger-bot
Contributor

The backport to 2.12 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.12 2.12
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.12
# Create a new branch
git switch --create backport/backport-12196-to-2.12
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 237cee38a03aa5cc45ccea9ff1017a2744bcfbe4
# Push it to GitHub
git push --set-upstream origin backport/backport-12196-to-2.12
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.12

Then, create a pull request where the base branch is 2.12 and the compare/head branch is backport/backport-12196-to-2.12.

jed326 added a commit to jed326/OpenSearch that referenced this pull request Feb 7, 2024
sohami pushed a commit that referenced this pull request Feb 7, 2024
sohami pushed a commit that referenced this pull request Feb 7, 2024
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Mar 1, 2024
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
Labels
backport 2.x (Backport to 2.x branch); backport 2.12 (Backport to 2.12 branch); backport-failed; skip-changelog; v2.12.0 (Issues and PRs related to version 2.12.0); v3.0.0 (Issues and PRs related to version 3.0.0)
6 participants