Add doc for neural-sparse-query-two-phase-processor. #7306
Merged
Naarcha-AWS
merged 13 commits into
opensearch-project:main
from
conggguan:neural-sparse-two-phase
Jun 14, 2024
b188880
Add doc for neural-sparse-query-two-phase-processor.
7a0df87
Make some edits for the comments.
22d4739
Fix some typo and style-job.
d11c412
Update neural-sparse-query-two-phase-processor.md
Naarcha-AWS 33a7c6e
Apply suggestions from code review
Naarcha-AWS e1fd8dd
Apply suggestions from code review
Naarcha-AWS 074f340
Apply suggestions from code review
Naarcha-AWS 61f74ee
Apply suggestions from code review
Naarcha-AWS d05dc86
Apply suggestions from code review
Naarcha-AWS d5f068b
Apply suggestions from code review
Naarcha-AWS 7f4b04f
Apply suggestions from code review
Naarcha-AWS 5fa2306
Update _search-plugins/search-pipelines/neural-sparse-query-two-phase…
Naarcha-AWS 0442dd9
Merge branch 'main' into neural-sparse-two-phase
Naarcha-AWS
150 changes: 150 additions & 0 deletions
_search-plugins/search-pipelines/neural-sparse-query-two-phase-processor.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,150 @@ | ||
--- | ||
layout: default | ||
title: Neural sparse query two-phase processor | ||
nav_order: 13 | ||
parent: Search processors | ||
grand_parent: Search pipelines | ||
--- | ||
|
||
# Neural sparse query two-phase processor | ||
Introduced 2.15 | ||
{: .label .label-purple } | ||
|
||
The `neural_sparse_two_phase_processor` search processor is designed to provide faster search pipelines for [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). It accelerates the neural sparse query by dividing the original method of scoring all documents with all tokens into two steps: | ||
|
||
1. High-weight tokens score all documents, and the top-scoring documents are selected. | ||
2. Low-weight tokens rescore only the selected top documents. | ||
|
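The two steps above can be sketched in plain Python. This is an illustrative model only, not the plugin's implementation: the function names, the dictionary-based document representation, the dot-product scoring, and the choice to treat a token exactly at the threshold as high-weight are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def split_tokens(query_tokens: Dict[str, float], prune_ratio: float = 0.4):
    """Split query tokens into high- and low-weight groups.

    The threshold is the maximum token weight multiplied by prune_ratio.
    Treating a token exactly at the threshold as high-weight is an assumption.
    """
    threshold = max(query_tokens.values()) * prune_ratio
    high = {t: w for t, w in query_tokens.items() if w >= threshold}
    low = {t: w for t, w in query_tokens.items() if w < threshold}
    return high, low


def score(doc: Dict[str, float], tokens: Dict[str, float]) -> float:
    """Dot product over the tokens shared by the document and the query."""
    return sum(w * doc.get(t, 0.0) for t, w in tokens.items())


def two_phase_search(
    docs: Dict[str, Dict[str, float]],
    query_tokens: Dict[str, float],
    size: int = 10,
    prune_ratio: float = 0.4,
    expansion_rate: float = 5.0,
) -> List[Tuple[str, float]]:
    high, low = split_tokens(query_tokens, prune_ratio)
    # Phase 1: score every document using only the high-weight tokens.
    first_pass = sorted(
        ((score(d, high), doc_id) for doc_id, d in docs.items()), reverse=True
    )
    # Keep size * expansion_rate candidates for the second phase.
    window = int(size * expansion_rate)
    candidates = first_pass[:window]
    # Phase 2: add the low-weight token scores for the candidates only.
    rescored = sorted(
        ((s + score(docs[doc_id], low), doc_id) for s, doc_id in candidates),
        reverse=True,
    )
    return [(doc_id, s) for s, doc_id in rescored[:size]]
```

Because the low-weight tokens are applied only to the candidate window, documents outside the window are never scored with them, which is where the latency saving comes from.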
||
## Request fields | ||
|
||
The following table lists all available request fields. | ||
|
||
Field | Data type | Description | ||
:--- | :--- | :--- | ||
`enabled` | Boolean | Controls whether the two-phase processor is enabled. Default is `true`. | ||
`two_phase_parameter` | Object | A map of key-value pairs representing the two-phase parameters and their associated values. You can specify the value of `prune_ratio`, `expansion_rate`, `max_window_size`, or any combination of these three parameters. Optional. | ||
`two_phase_parameter.prune_ratio` | Float | A ratio that determines how tokens are split into high-weight and low-weight tokens. The threshold is the maximum token score multiplied by `prune_ratio`. Valid range is [0,1]. Default is `0.4`. | ||
`two_phase_parameter.expansion_rate` | Float | The rate that determines how many documents are rescored during the second phase. The number of second-phase documents equals the query size (default is 10) multiplied by the expansion rate. Valid range is greater than 1.0. Default is `5.0`. | ||
`two_phase_parameter.max_window_size` | Integer | The maximum number of documents that can be processed using the two-phase processor. Valid range is greater than 50. Default is `10000`. | ||
`tag` | String | The processor's identifier. Optional. | ||
`description` | String | A description of the processor. Optional. | ||
|
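As a rough illustration of the defaults and valid ranges in the table above, the following Python sketch merges user overrides with the defaults and checks each value. The helper names are hypothetical, and whether the plugin treats the range boundaries as inclusive or exclusive is an assumption taken from the wording of the table.

```python
# Defaults as listed in the request fields table.
DEFAULTS = {"prune_ratio": 0.4, "expansion_rate": 5.0, "max_window_size": 10000}


def resolve_two_phase_parameters(overrides=None):
    """Merge overrides with the defaults and validate the documented ranges."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown two-phase parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    if not 0.0 <= params["prune_ratio"] <= 1.0:
        raise ValueError("prune_ratio must be in [0,1]")
    if not params["expansion_rate"] > 1.0:
        raise ValueError("expansion_rate must be greater than 1.0")
    if not params["max_window_size"] > 50:
        raise ValueError("max_window_size must be greater than 50")
    return params


def second_phase_doc_count(size=10, expansion_rate=5.0):
    """Number of documents rescored in the second phase: size * expansion_rate."""
    return int(size * expansion_rate)
```

With all defaults, a query of size 10 rescored at an expansion rate of 5.0 gives 50 second-phase documents.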
||
## Example | ||
|
||
The following example creates a search pipeline with a `neural_sparse_two_phase_processor` search request processor. | ||
|
||
### Create search pipeline | ||
|
||
The following example request creates a search pipeline with a `neural_sparse_two_phase_processor` search request processor. The processor is enabled and specifies custom values for the three two-phase parameters: | ||
|
||
```json | ||
PUT /_search/pipeline/two_phase_search_pipeline | ||
{ | ||
"request_processors": [ | ||
{ | ||
"neural_sparse_two_phase_processor": { | ||
"tag": "neural-sparse", | ||
"description": "Creates a two-phase processor for neural sparse queries.", | ||
"enabled": true, | ||
"two_phase_parameter": { | ||
"prune_ratio": custom_prune_ratio, | ||
"expansion_rate": custom_expansion_rate, | ||
"max_window_size": custom_max_window_size | ||
} | ||
} | ||
} | ||
] | ||
} | ||
``` | ||
{% include copy-curl.html %} | ||
|
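The request above uses placeholders for the parameter values. As a concrete variant, the following Python sketch builds the same request body with the documented default values filled in and serializes it to JSON. The function name `build_pipeline_body` and the description string are made up for this example; the body would still need to be sent to `PUT /_search/pipeline/two_phase_search_pipeline` by a client of your choice.

```python
import json


def build_pipeline_body(prune_ratio=0.4, expansion_rate=5.0, max_window_size=10000):
    """Build a search pipeline body containing one two-phase request processor."""
    return {
        "request_processors": [
            {
                "neural_sparse_two_phase_processor": {
                    "tag": "neural-sparse",
                    "description": "Two-phase processor with default parameters.",
                    "enabled": True,
                    "two_phase_parameter": {
                        "prune_ratio": prune_ratio,
                        "expansion_rate": expansion_rate,
                        "max_window_size": max_window_size,
                    },
                }
            }
        ]
    }


body = build_pipeline_body()
# Serialize to valid JSON, ready to use as the PUT request body.
print(json.dumps(body, indent=2))
```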
||
### Set search pipeline | ||
|
||
After the two-phase pipeline is created, set the `index.search.default_pipeline` setting to the name of the pipeline for the index on which you want to use the two-phase pipeline: | ||
|
||
```json | ||
PUT /index-name/_settings | ||
{ | ||
"index.search.default_pipeline" : "two_phase_search_pipeline" | ||
} | ||
``` | ||
{% include copy-curl.html %} | ||
|
||
## Limitations | ||
|
||
The `neural_sparse_two_phase_processor` has the following limitations. | ||
|
||
### Version support | ||
|
||
The `neural_sparse_two_phase_processor` can only be used with OpenSearch 2.15 or later. | ||
|
||
### Compound query support | ||
|
||
As of OpenSearch 2.15, only the Boolean [compound query]({{site.url}}{{site.baseurl}}/query-dsl/compound/index/) is supported. | ||
|
||
Neural sparse queries and Boolean queries with a boost parameter (not boosting queries) are also supported. | ||
|
||
## Examples | ||
|
||
The following examples show neural sparse queries with the supported query types. | ||
|
||
### Single neural sparse query | ||
|
||
```json | ||
GET /my-nlp-index/_search | ||
{ | ||
  "query": { | ||
    "neural_sparse": { | ||
      "passage_embedding": { | ||
        "query_text": "Hi world", | ||
        "model_id": "<model-id>" | ||
      } | ||
    } | ||
  } | ||
} | ||
``` | ||
{% include copy-curl.html %} | ||
|
||
### Neural sparse query nested in a Boolean query | ||
|
||
```json | ||
GET /my-nlp-index/_search | ||
{ | ||
  "query": { | ||
    "bool": { | ||
      "should": [ | ||
        { | ||
          "neural_sparse": { | ||
            "passage_embedding": { | ||
              "query_text": "Hi world", | ||
              "model_id": "<model-id>" | ||
            }, | ||
            "boost": 2.0 | ||
          } | ||
        } | ||
      ] | ||
    } | ||
  } | ||
} | ||
``` | ||
{% include copy-curl.html %} | ||
|
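The two supported query shapes above can also be generated programmatically. The following Python sketch uses hypothetical helper names, and the `boost` placement simply mirrors the Boolean example above rather than an independently verified schema.

```python
def neural_sparse_clause(field, query_text, model_id, boost=None):
    """Build a neural_sparse clause for one field, optionally with a boost."""
    clause = {field: {"query_text": query_text, "model_id": model_id}}
    if boost is not None:
        # Placed next to the field entry, as in the Boolean example above.
        clause["boost"] = boost
    return {"neural_sparse": clause}


def single_query(clause):
    """Wrap a clause as a standalone search request body."""
    return {"query": clause}


def bool_should_query(*clauses):
    """Nest one or more clauses in the should list of a Boolean query."""
    return {"query": {"bool": {"should": list(clauses)}}}


# "<model-id>" is a placeholder, as in the examples above.
single = single_query(neural_sparse_clause("passage_embedding", "Hi world", "<model-id>"))
nested = bool_should_query(
    neural_sparse_clause("passage_embedding", "Hi world", "<model-id>", boost=2.0)
)
```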
||
## P99 latency metrics | ||
The following P99 latency metrics were measured for neural sparse queries on an OpenSearch cluster of three m5.4xlarge Amazon Elastic Compute Cloud (Amazon EC2) instances, using indexes corresponding to more than 10 datasets. | ||
|
||
### Doc-only mode latency metric | ||
|
||
In doc-only mode, the two-phase processor can significantly decrease query latency, as shown by the following latency metrics: | ||
|
||
- Average latency without the two-phase processor: 53.56 ms | ||
- Average latency with the two-phase processor: 38.61 ms | ||
|
||
This results in an overall latency reduction of approximately 27.92%. Most indexes show a significant latency reduction when using the two-phase processor, with reductions ranging from 5.14% to 84.6%. The specific latency optimization values depend on the data distribution within the indexes. | ||
|
||
### Bi-encoder mode latency metric | ||
|
||
In bi-encoder mode, the two-phase processor can significantly decrease query latency, as shown by the following latency metrics: | ||
- Average latency without the two-phase processor: 300.79 ms | ||
- Average latency with the two-phase processor: 121.64 ms | ||
|
||
This results in an overall latency reduction of approximately 59.56%. Most indexes show a significant latency reduction when using the two-phase processor, with reductions ranging from 1.56% to 82.84%. The specific latency optimization values depend on the data distribution within the indexes. |
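The overall reduction figures are relative differences between the two average latencies. As a quick check, the following Python sketch recomputes them from the rounded averages listed above; the function name is made up for the example. The results come out at roughly 27.9% for doc-only mode and 59.6% for bi-encoder mode, and any small difference from the reported percentages presumably comes from rounding of the published averages.

```python
def latency_reduction_percent(without_ms: float, with_ms: float) -> float:
    """Relative latency reduction, in percent, when the processor is enabled."""
    return (without_ms - with_ms) / without_ms * 100.0


# Average latencies (ms) as reported above.
doc_only = latency_reduction_percent(53.56, 38.61)
bi_encoder = latency_reduction_percent(300.79, 121.64)
print(f"doc-only: {doc_only:.1f}%, bi-encoder: {bi_encoder:.1f}%")
```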
Can we give some example API calls, like those in the "Setting a default model on an index or field" section? Using all default values is fine too. This would help users a lot.
We should also mention that this processor is strongly recommended to set
Added an example API call for setting a default two-phase pipeline.
Added an explanation of why we recommend this pipeline.