Add cross-encoder model documentation (opensearch-project#6357)
* Add cross-ranking model documentation

Signed-off-by: Fanit Kolchina <[email protected]>

* Model id format

Signed-off-by: Fanit Kolchina <[email protected]>

* Move to custom models

Signed-off-by: Fanit Kolchina <[email protected]>

* Update _search-plugins/search-relevance/reranking-search-results.md

Signed-off-by: kolchfa-aws <[email protected]>

* Update _ml-commons-plugin/custom-local-models.md

Signed-off-by: kolchfa-aws <[email protected]>

* Tech review and doc review comments

Signed-off-by: Fanit Kolchina <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>

* Update _ml-commons-plugin/pretrained-models.md

Signed-off-by: kolchfa-aws <[email protected]>

---------

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
2 people authored and oeyh committed Mar 14, 2024
1 parent 1b4fb65 commit 1b7cd2c
Showing 3 changed files with 148 additions and 3 deletions.
147 changes: 146 additions & 1 deletion _ml-commons-plugin/custom-local-models.md
@@ -315,4 +315,149 @@ The response contains the tokens and weights:

## Step 5: Use the model for search

To learn how to use the model for vector search, see [Set up neural search]({{site.url}}{{site.baseurl}}http://localhost:4000/docs/latest/search-plugins/neural-search/#set-up-neural-search).
To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).

## Cross-encoder models

Cross-encoder models support reranking of search results: the model scores each document against the query text.

To register a cross-encoder model, send a request in the following format. The `model_config` object is optional. For cross-encoder models, specify the `function_name` as `TEXT_SIMILARITY`. For example, the following request registers an `ms-marco-TinyBERT-L-2-v2` model:

```json
POST /_plugins/_ml/models/_register
{
  "name": "ms-marco-TinyBERT-L-2-v2",
  "version": "1.0.0",
  "function_name": "TEXT_SIMILARITY",
  "description": "test model",
  "model_format": "TORCH_SCRIPT",
  "model_group_id": "lN4AP40BKolAMNtR4KJ5",
  "model_content_hash_value": "90e39a926101d1a4e542aade0794319404689b12acfd5d7e65c03d91c668b5cf",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 1,
    "framework_type": "huggingface_transformers",
    "total_chunks": 2,
    "all_config": "{\"total_chunks\":2}"
  },
  "url": "https://github.com/opensearch-project/ml-commons/blob/main/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_similarity/TinyBERT-CE-torch_script.zip?raw=true"
}
```
{% include copy-curl.html %}
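
The `model_content_hash_value` in the preceding request is a 64-character hexadecimal string, which matches the length of a SHA-256 digest. The following Python sketch assumes that the field holds the SHA-256 digest of the model ZIP file and shows one way to compute it locally before registering a model. The function name `sha256_of_file` and the file name in the usage comment are illustrative and are not part of ML Commons:

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large model archives
    do not have to fit in memory at once.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical usage, assuming the model ZIP file was downloaded locally:
#   print(sha256_of_file("TinyBERT-CE-torch_script.zip"))
```

Comparing the computed digest to the value you plan to send is a quick check that the downloaded file is the one you intend to register.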

Then send a request to deploy the model:

```json
POST _plugins/_ml/models/<model_id>/_deploy
```
{% include copy-curl.html %}

To test a cross-encoder model, send the following request:

```json
POST _plugins/_ml/models/<model_id>/_predict
{
  "query_text": "today is sunny",
  "text_docs": [
    "how are you",
    "today is sunny",
    "today is july fifth",
    "it is winter"
  ]
}
```
{% include copy-curl.html %}
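
The same request can be sent from a script. The following Python sketch builds the path and body shown in the preceding request and posts them using only the standard library. The helper names, the `base_url` parameter, and the assumption of an unauthenticated cluster are illustrative; adapt them to your own setup:

```python
import json
import urllib.request


def build_predict_request(model_id, query_text, text_docs):
    """Return the (path, body) pair for a cross-encoder Predict API call."""
    path = f"/_plugins/_ml/models/{model_id}/_predict"
    body = {"query_text": query_text, "text_docs": list(text_docs)}
    return path, body


def send_predict_request(base_url, model_id, query_text, text_docs):
    """POST the request and return the parsed JSON response.

    Assumption: the cluster at `base_url` accepts unauthenticated requests.
    """
    path, body = build_predict_request(model_id, query_text, text_docs)
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping the request construction separate from the network call makes it possible to inspect or log the body without contacting the cluster.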

The model calculates the similarity score between `query_text` and each document in `text_docs` and returns one score per document, in the order in which the documents were provided in `text_docs`:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -6.077798
          ],
          "byte_buffer": {
            "array": "Un3CwA==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            10.223609
          ],
          "byte_buffer": {
            "array": "55MjQQ==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -1.3987057
          ],
          "byte_buffer": {
            "array": "ygizvw==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -4.5923924
          ],
          "byte_buffer": {
            "array": "4fSSwA==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    }
  ]
}
```

A higher document score means higher similarity. In the preceding response, documents are scored as follows against the query text `today is sunny`:

Document text | Score
:--- | :---
`how are you` | -6.077798
`today is sunny` | 10.223609
`today is july fifth` | -1.3987057
`it is winter` | -4.5923924

The document that contains the same text as the query receives the highest score, and the remaining documents are scored according to their text similarity to the query.
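
Because the scores are returned in input order, a client must pair them with the original documents before sorting. The following Python sketch does this for a trimmed copy of the preceding response. It also decodes each `byte_buffer`, assuming that `array` is the Base64 encoding of a single 32-bit float in the stated byte order. The function names are illustrative:

```python
import base64
import struct

text_docs = ["how are you", "today is sunny", "today is july fifth", "it is winter"]

# Trimmed copy of the response above: (data value, byte_buffer.array) per document.
_raw = [(-6.077798, "Un3CwA=="), (10.223609, "55MjQQ=="),
        (-1.3987057, "ygizvw=="), (-4.5923924, "4fSSwA==")]
response = {
    "inference_results": [
        {"output": [{"name": "similarity", "data": [score],
                     "byte_buffer": {"array": encoded, "order": "LITTLE_ENDIAN"}}]}
        for score, encoded in _raw
    ]
}


def rank_documents(docs, predict_response):
    """Pair each document with its score and sort from most to least similar."""
    scores = [r["output"][0]["data"][0] for r in predict_response["inference_results"]]
    return sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)


def decode_byte_buffer(byte_buffer):
    """Decode a Base64-encoded float32 (assumed layout) using the given byte order."""
    raw = base64.b64decode(byte_buffer["array"])
    fmt = "<f" if byte_buffer["order"] == "LITTLE_ENDIAN" else ">f"
    return struct.unpack(fmt, raw)[0]


for doc, score in rank_documents(text_docs, response):
    print(f"{score:>12.6f}  {doc}")
```

Under that assumption, the decoded `byte_buffer` values agree with the numbers in `data`, so reading `data` directly is sufficient for ranking.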

To learn how to use the model for reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).
2 changes: 1 addition & 1 deletion _ml-commons-plugin/pretrained-models.md
@@ -296,4 +296,4 @@ The following table provides a list of sparse encoding models and artifact links
|---|---|---|---|
| `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `<entry, weight>` pairs, where each entry corresponds to a non-zero element index. |
| `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-doc-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `<entry, weight>` pairs, where each entry corresponds to a non-zero element index. |
| `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-tokenizer-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/config.json) | A neural sparse tokenizer model. The model tokenizes text into tokens and assigns each token a predefined weight, which is the token's IDF (if the IDF file is not provided, the weight defaults to 1). For more information, see [Preparing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#preparing-a-model). |
| `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-tokenizer-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/config.json) | A neural sparse tokenizer model. The model tokenizes text into tokens and assigns each token a predefined weight, which is the token's inverse document frequency (IDF). If the IDF file is not provided, the weight defaults to 1. For more information, see [Preparing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#preparing-a-model). |
2 changes: 1 addition & 1 deletion _search-plugins/search-relevance/reranking-search-results.md
@@ -13,7 +13,7 @@ Introduced 2.12
You can rerank search results using a cross-encoder reranker in order to improve search relevance. To implement reranking, you need to configure a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) that runs at search time. The search pipeline intercepts search results and applies the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/) to them. The `rerank` processor evaluates the search results and sorts them based on the new scores provided by the cross-encoder model.

**PREREQUISITE**<br>
Before using hybrid search, you must set up a cross-encoder model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).
Before configuring a reranking pipeline, you must set up a cross-encoder model. For more information, see [Cross-encoder models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#cross-encoder-models).
{: .note}

## Running a search with reranking