Add documentation for new reranking feature in 2.12 (opensearch-project#6368)

* Create reranking.md

document new reranking feature in 2.12

Signed-off-by: HenryL27 <[email protected]>

* Doc review and address comments

Signed-off-by: Fanit Kolchina <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>

* Update _search-plugins/search-pipelines/rerank-processor.md

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>

* Update _search-plugins/search-pipelines/rerank-processor.md

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>

---------

Signed-off-by: HenryL27 <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Fanit Kolchina <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
4 people authored and oeyh committed Mar 14, 2024
1 parent 484c109 commit fc90fa0
Showing 5 changed files with 241 additions and 4 deletions.
116 changes: 116 additions & 0 deletions _search-plugins/search-pipelines/rerank-processor.md
@@ -0,0 +1,116 @@
---
layout: default
title: Rerank
nav_order: 25
has_children: false
parent: Search processors
grand_parent: Search pipelines
---

# Rerank processor

The `rerank` search request processor intercepts search results and passes them to a cross-encoder model to be reranked. The model reranks the results, taking into account the scoring context. Then the processor orders documents in the search results based on their new scores.
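
The rescoring flow described above can be sketched in Python. This is only a conceptual illustration, not the processor's implementation: `rerank_hits` and `toy_overlap_score` are hypothetical names, the token-overlap scorer is a stand-in for a real cross-encoder model, and joining the document fields with a space is an assumption about how context might be built.

```python
from typing import Callable


def rerank_hits(
    hits: list[dict],
    query_text: str,
    document_fields: list[str],
    score_fn: Callable[[str, str], float],
) -> list[dict]:
    """Rescore each hit against query_text and return the hits sorted by new score."""
    rescored = []
    for hit in hits:
        # Assumption: context is the configured document fields joined by a space.
        context = " ".join(str(hit["_source"].get(f, "")) for f in document_fields)
        rescored.append({**hit, "_score": score_fn(query_text, context)})
    # Highest new score first.
    return sorted(rescored, key=lambda h: h["_score"], reverse=True)


def toy_overlap_score(query: str, passage: str) -> float:
    """Hypothetical stand-in for a cross-encoder: counts shared lowercase tokens."""
    return float(len(set(query.lower().split()) & set(passage.lower().split())))


hits = [
    {"_id": "1", "_score": 0.9, "_source": {"passage_text": "Welcoming gifts are great"}},
    {"_id": "2", "_score": 0.5, "_source": {"passage_text": "I feel welcomed in their family"}},
]
reranked = rerank_hits(hits, "how to welcome in family", ["passage_text"], toy_overlap_score)
print([h["_id"] for h in reranked])  # ['2', '1']
```

The original scores are discarded, so a hit that ranked first in the initial query can drop once the new scores are applied.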

## Request fields

The following table lists all available request fields.

Field | Data type | Description
:--- | :--- | :---
`<reranker_type>` | Object | The reranker type provides the rerank processor with static information needed across all reranking calls. Required.
`context` | Object | Provides the rerank processor with information necessary for generating reranking context at query time.
`tag` | String | The processor's identifier. Optional.
`description` | String | A description of the processor. Optional.
`ignore_failure` | Boolean | If `true`, OpenSearch [ignores any failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.

### The `ml_opensearch` reranker type

The `ml_opensearch` reranker type is designed to work with the cross-encoder model provided by OpenSearch. For this reranker type, specify the following fields.

Field | Data type | Description
:--- | :--- | :---
`ml_opensearch` | Object | Provides the rerank processor with model information. Required.
`ml_opensearch.model_id` | String | The model ID for the cross-encoder model. Required. For more information, see [Using ML models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/).
`context.document_fields` | Array | An array of document fields that specifies the fields from which to retrieve context for the cross-encoder model. Required.
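
As a rough sketch, the fields in the two tables above could be assembled into a pipeline body with a small Python helper. The function name `build_rerank_pipeline` is hypothetical, and placing `tag`, `description`, and `ignore_failure` inside the `rerank` object follows the request fields table rather than any client library.

```python
import json
from typing import Optional


def build_rerank_pipeline(
    model_id: str,
    document_fields: list[str],
    tag: Optional[str] = None,
    description: Optional[str] = None,
    ignore_failure: bool = False,
) -> dict:
    """Build a search pipeline body containing one `rerank` response processor."""
    if not model_id:
        raise ValueError("ml_opensearch.model_id is required")
    if not document_fields:
        raise ValueError("context.document_fields is required")

    processor: dict = {
        "ml_opensearch": {"model_id": model_id},
        "context": {"document_fields": list(document_fields)},
    }
    # Optional fields are added only when set, so defaults stay implicit.
    if tag is not None:
        processor["tag"] = tag
    if description is not None:
        processor["description"] = description
    if ignore_failure:
        processor["ignore_failure"] = True

    return {"response_processors": [{"rerank": processor}]}


body = build_rerank_pipeline("gnDIbI0BfUsSoeNT_jAw", ["title", "text_representation"])
print(json.dumps(body, indent=2))
```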

## Example

The following example demonstrates using a search pipeline with a `rerank` processor.

### Creating a search pipeline

The following request creates a search pipeline with a `rerank` response processor:

```json
PUT /_search/pipeline/rerank_pipeline
{
  "response_processors": [
    {
      "rerank": {
        "ml_opensearch": {
          "model_id": "gnDIbI0BfUsSoeNT_jAw"
        },
        "context": {
          "document_fields": ["title", "text_representation"]
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

### Using a search pipeline

Combine an OpenSearch query with an `ext` object that contains the query context for the cross-encoder model. Provide the `query_text` that will be used to rerank the results:

```json
POST /_search?search_pipeline=rerank_pipeline
{
  "query": {
    "match": {
      "text_representation": "Where is Albuquerque?"
    }
  },
  "ext": {
    "rerank": {
      "query_context": {
        "query_text": "Where is Albuquerque?"
      }
    }
  }
}
```
{% include copy-curl.html %}

Instead of specifying `query_text`, you can provide a full path to the field containing text to use for reranking. For example, if you specify a subfield `query` in the `text_representation` object, specify its path in the `query_text_path` parameter:

```json
POST /_search?search_pipeline=rerank_pipeline
{
  "query": {
    "match": {
      "text_representation": {
        "query": "Where is Albuquerque?"
      }
    }
  },
  "ext": {
    "rerank": {
      "query_context": {
        "query_text_path": "query.match.text_representation.query"
      }
    }
  }
}
```
{% include copy-curl.html %}
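
To see which text a given `query_text_path` points to, the path can be walked through the request body key by key. The Python helper below is an illustrative sketch only: `resolve_query_text_path` is a hypothetical name, and treating the path as plain dot-separated object keys (no array indexes, no field names containing dots) is a simplifying assumption, not a statement about how OpenSearch parses the path.

```python
def resolve_query_text_path(request_body: dict, path: str) -> str:
    """Follow a dot-separated path through a search request body and return the text."""
    node = request_body
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            raise KeyError(f"path segment {key!r} not found in request body")
        node = node[key]
    if not isinstance(node, str):
        raise TypeError(f"path {path!r} does not point to a text value")
    return node


request_body = {
    "query": {"match": {"text_representation": {"query": "Where is Albuquerque?"}}},
    "ext": {
        "rerank": {
            "query_context": {"query_text_path": "query.match.text_representation.query"}
        }
    },
}
path = request_body["ext"]["rerank"]["query_context"]["query_text_path"]
print(resolve_query_text_path(request_body, path))  # Where is Albuquerque?
```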

The `query_context` object contains the following fields.

Field name | Description
:--- | :---
`query_text` | The natural language text of the question that you want to use to rerank the search results. Either `query_text` or `query_text_path` (not both) is required.
`query_text_path` | The full JSON path to the text of the question that you want to use to rerank the search results. Either `query_text` or `query_text_path` (not both) is required. The maximum number of characters in the path is `1000`.
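
The constraints in this table can be checked on the client side before a request is sent. The following Python sketch uses a hypothetical `validate_query_context` helper; the exact errors OpenSearch returns for invalid input are not covered here.

```python
MAX_QUERY_TEXT_PATH_LENGTH = 1000  # maximum path length stated in the table above


def validate_query_context(query_context: dict) -> None:
    """Raise ValueError unless exactly one of query_text / query_text_path is set."""
    has_text = "query_text" in query_context
    has_path = "query_text_path" in query_context
    if has_text == has_path:
        # Either both are present or both are missing.
        raise ValueError("specify exactly one of query_text or query_text_path")
    if has_path and len(query_context["query_text_path"]) > MAX_QUERY_TEXT_PATH_LENGTH:
        raise ValueError(
            f"query_text_path exceeds {MAX_QUERY_TEXT_PATH_LENGTH} characters"
        )


validate_query_context({"query_text": "Where is Albuquerque?"})
validate_query_context({"query_text_path": "query.match.text_representation.query"})
```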

For more information about setting up reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).
1 change: 1 addition & 0 deletions _search-plugins/search-pipelines/search-processors.md
@@ -39,6 +39,7 @@ Processor | Description | Earliest available version
:--- | :--- | :---
[`personalize_search_ranking`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/personalize-search-ranking/) | Uses [Amazon Personalize](https://aws.amazon.com/personalize/) to rerank search results (requires setting up the Amazon Personalize service). | 2.9
[`rename_field`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rename-field-processor/)| Renames an existing field. | 2.8
[`rerank`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/)| Reranks search results using a cross-encoder model. | 2.12
[`collapse`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/collapse-processor/)| Deduplicates search hits based on a field value, similarly to `collapse` in a search request. | 2.12
[`truncate_hits`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/truncate-hits-processor/)| Discards search hits after a specified target count is reached. Can undo the effect of the `oversample` request processor. | 2.12

4 changes: 2 additions & 2 deletions _search-plugins/search-relevance/compare-search-results.md
@@ -1,6 +1,6 @@
---
layout: default
-title: Compare Search Results
+title: Comparing search results
nav_order: 55
parent: Search relevance
has_children: true
@@ -9,7 +9,7 @@ redirect_from:
- /search-plugins/search-relevance/
---

-# Compare Search Results
+# Comparing search results

With Compare Search Results in OpenSearch Dashboards, you can compare results from two queries side by side to determine whether one query produces better results than the other. Using this tool, you can evaluate search quality by experimenting with queries.

6 changes: 4 additions & 2 deletions _search-plugins/search-relevance/index.md
@@ -14,6 +14,8 @@ Search relevance evaluates the accuracy of the search results returned by a quer

OpenSearch provides the following search relevance features:

- [Compare Search Results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/compare-search-results/) in OpenSearch Dashboards lets you compare results from two queries side by side.
- [Comparing search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/compare-search-results/) from two queries side by side in OpenSearch Dashboards.

- [Querqy]({{site.url}}{{site.baseurl}}/search-plugins/querqy/) offers query rewriting capability.
- [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/) using a cross-encoder reranker.

- Rewriting queries using [Querqy]({{site.url}}{{site.baseurl}}/search-plugins/querqy/).
118 changes: 118 additions & 0 deletions _search-plugins/search-relevance/reranking-search-results.md
@@ -0,0 +1,118 @@
---
layout: default
title: Reranking search results
parent: Search relevance
has_children: false
nav_order: 60
---

# Reranking search results
Introduced 2.12
{: .label .label-purple }

You can rerank search results using a cross-encoder reranker in order to improve search relevance. To implement reranking, you need to configure a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) that runs at search time. The search pipeline intercepts search results and applies the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/) to them. The `rerank` processor evaluates the search results and sorts them based on the new scores provided by the cross-encoder model.

**PREREQUISITE**<br>
Before using reranking, you must set up a cross-encoder model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).
{: .note}

## Running a search with reranking

To run a search with reranking, follow these steps:

1. [Configure a search pipeline](#step-1-configure-a-search-pipeline).
1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
1. [Search using reranking](#step-4-search-using-reranking).

## Step 1: Configure a search pipeline

First, configure a search pipeline with a [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/).

The following example request creates a search pipeline with an `ml_opensearch` rerank processor. In the request, provide a model ID for the cross-encoder model and the document fields to use as context:

```json
PUT /_search/pipeline/my_pipeline
{
  "description": "Pipeline for reranking with a cross-encoder",
  "response_processors": [
    {
      "rerank": {
        "ml_opensearch": {
          "model_id": "gnDIbI0BfUsSoeNT_jAw"
        },
        "context": {
          "document_fields": [
            "passage_text"
          ]
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

For more information about the request fields, see [Request fields]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/#request-fields).

## Step 2: Create an index for ingestion

In order to use the rerank processor defined in your pipeline, create an OpenSearch index and add the pipeline created in the previous step as the default pipeline:

```json
PUT /my-index
{
  "settings": {
    "index.search.default_pipeline": "my_pipeline"
  },
  "mappings": {
    "properties": {
      "passage_text": {
        "type": "text"
      }
    }
  }
}
```
{% include copy-curl.html %}
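
Because the pipeline is set as the index default, searches against this index do not need to name it. When no default is configured, the pipeline is passed in the `search_pipeline` query parameter, as in the `rerank` processor example. The Python sketch below builds either form of the request path; `search_path` is a hypothetical helper, not part of any OpenSearch client.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def search_path(index: Optional[str] = None, search_pipeline: Optional[str] = None) -> str:
    """Return the _search path, optionally scoped to an index and an explicit pipeline."""
    path = f"/{quote(index, safe='')}/_search" if index else "/_search"
    if search_pipeline:
        # Only needed when the index has no default search pipeline.
        path += "?" + urlencode({"search_pipeline": search_pipeline})
    return path


print(search_path("my-index"))                          # /my-index/_search
print(search_path(search_pipeline="rerank_pipeline"))   # /_search?search_pipeline=rerank_pipeline
```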

## Step 3: Ingest documents into the index

To ingest documents into the index created in the previous step, send the following bulk request:

```json
POST /_bulk
{ "index": { "_index": "my-index" } }
{ "passage_text" : "I said welcome to them and we entered the house" }
{ "index": { "_index": "my-index" } }
{ "passage_text" : "I feel welcomed in their family" }
{ "index": { "_index": "my-index" } }
{ "passage_text" : "Welcoming gifts are great" }

```
{% include copy-curl.html %}
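
When documents come from application code, the newline-delimited bulk body can be generated rather than written by hand. The following Python sketch assumes a hypothetical `build_bulk_body` helper; it emits one action line and one source line per document and ends the body with a newline, which the bulk API expects.

```python
import json


def build_bulk_body(index: str, documents: list[dict]) -> str:
    """Serialize documents into a newline-delimited bulk request body."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # source line
    # The body must end with a newline character.
    return "\n".join(lines) + "\n"


passages = [
    {"passage_text": "I said welcome to them and we entered the house"},
    {"passage_text": "I feel welcomed in their family"},
    {"passage_text": "Welcoming gifts are great"},
]
bulk_body = build_bulk_body("my-index", passages)
print(bulk_body, end="")
```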

## Step 4: Search using reranking

To perform a reranking search on your index, use any OpenSearch query and provide an additional `ext.rerank` field:

```json
POST /my-index/_search
{
  "query": {
    "match": {
      "passage_text": "how to welcome in family"
    }
  },
  "ext": {
    "rerank": {
      "query_context": {
        "query_text": "how to welcome in family"
      }
    }
  }
}
```
{% include copy-curl.html %}

Alternatively, you can provide the full path to the field containing the context. For more information, see [Rerank processor example]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/#example).
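
Both variants of the search body can be produced from the same inputs. The Python sketch below is illustrative only: `build_rerank_search` is a hypothetical helper that assumes a `match` query on a single field, so the generated `query_text_path` is valid only for that query shape.

```python
def build_rerank_search(field: str, query_text: str, use_path: bool = False) -> dict:
    """Build a match query with an ext.rerank block using query_text or query_text_path."""
    if use_path:
        # Point the reranker at the text that is already inside the match query.
        return {
            "query": {"match": {field: {"query": query_text}}},
            "ext": {
                "rerank": {
                    "query_context": {"query_text_path": f"query.match.{field}.query"}
                }
            },
        }
    # Repeat the text explicitly in the rerank context.
    return {
        "query": {"match": {field: query_text}},
        "ext": {"rerank": {"query_context": {"query_text": query_text}}},
    }


by_text = build_rerank_search("passage_text", "how to welcome in family")
by_path = build_rerank_search("passage_text", "how to welcome in family", use_path=True)
```

Using the path form avoids repeating the query string, at the cost of tying the `ext` block to the structure of the query.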
