From db21c431128e738a2860d4ad726fed69f7a8616f Mon Sep 17 00:00:00 2001 From: Brian Flores Date: Mon, 25 Nov 2024 10:45:18 -0500 Subject: [PATCH] Added tutorial on ML Inference processor with the by-field rerank type (#8694) * Added tutorial on ML Inference processor with the by-field rerank type We needed to show users the power of combining a agnostic remote model inference processor and reranking your documents based on what was returned by the ml inference processor. Testing was done by running a docker compose using the commands shown in the tutorial. Signed-off-by: Brian Flores * Update _search-plugins/search-relevance/ml-inference-rerank-by-field.md removed unneeded comment Signed-off-by: Brian Flores * Update _search-plugins/search-relevance/ml-inference-rerank-by-field.md Capitalized Harlem Signed-off-by: Brian Flores * Added SageMaker step and links to processors Signed-off-by: Brian Flores * fix: grammar and simplifying tutorial Signed-off-by: Brian Flores * Fix: broken link Signed-off-by: Brian Flores * Update _search-plugins/search-relevance/ml-inference-rerank-by-field.md Signed-off-by: Brian Flores * Doc review Signed-off-by: Fanit Kolchina * Add copy button Signed-off-by: Fanit Kolchina * Small rewrite Signed-off-by: Fanit Kolchina * Heading change Signed-off-by: Fanit Kolchina * updating explanation on pipeline config Signed-off-by: Brian Flores * spacing on config Signed-off-by: Brian Flores * Added endpoint info Signed-off-by: Brian Flores * modelUrl -> Sagemaker_endpoint Signed-off-by: Brian Flores * Rewording Signed-off-by: Fanit Kolchina * Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> --------- Signed-off-by: Brian Flores Signed-off-by: Fanit Kolchina Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Co-authored-by: Fanit Kolchina Co-authored-by: Nathan Bower --- .../ml-inference-search-response.md | 6 +- .../search-pipelines/rerank-processor.md | 3 +- .../rerank-by-field-cross-encoder.md | 276 ++++++++++++++++++ .../search-relevance/rerank-by-field.md | 5 +- .../search-relevance/rerank-cross-encoder.md | 3 +- 5 files changed, 288 insertions(+), 5 deletions(-) create mode 100644 _search-plugins/search-relevance/rerank-by-field-cross-encoder.md diff --git a/_search-plugins/search-pipelines/ml-inference-search-response.md b/_search-plugins/search-pipelines/ml-inference-search-response.md index e8f17a667c..b0573d17be 100644 --- a/_search-plugins/search-pipelines/ml-inference-search-response.md +++ b/_search-plugins/search-pipelines/ml-inference-search-response.md @@ -751,4 +751,8 @@ The response includes the original documents and their reranked scores: "shards": [] } } -``` \ No newline at end of file +``` + +## Next steps + +- See a comprehensive example of [reranking by a field using an externally hosted cross-encoder model]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-by-field-cross-encoder/). \ No newline at end of file diff --git a/_search-plugins/search-pipelines/rerank-processor.md b/_search-plugins/search-pipelines/rerank-processor.md index 84819b17c8..11691eff95 100644 --- a/_search-plugins/search-pipelines/rerank-processor.md +++ b/_search-plugins/search-pipelines/rerank-processor.md @@ -191,4 +191,5 @@ POST /book-index/_search?search_pipeline=rerank_byfield_pipeline - Learn more about [reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/). - See a complete example of [reranking using a cross-encoder model]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-cross-encoder/). -- See a complete example of [reranking by a document field]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-by-field/). \ No newline at end of file +- See a complete example of [reranking by a document field]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-by-field/). +- See a comprehensive example of [reranking by a field using an externally hosted cross-encoder model]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-by-field-cross-encoder/). \ No newline at end of file diff --git a/_search-plugins/search-relevance/rerank-by-field-cross-encoder.md b/_search-plugins/search-relevance/rerank-by-field-cross-encoder.md new file mode 100644 index 0000000000..7f30689491 --- /dev/null +++ b/_search-plugins/search-relevance/rerank-by-field-cross-encoder.md @@ -0,0 +1,276 @@ +--- +layout: default +title: Reranking by a field using a cross-encoder +parent: Reranking search results +grand_parent: Search relevance +has_children: false +nav_order: 30 +--- + +# Reranking by a field using an externally hosted cross-encoder model +Introduced 2.18 +{: .label .label-purple } + +In this tutorial, you'll learn how to use a cross-encoder model hosted on Amazon SageMaker to rerank search results and improve search relevance. + +To rerank documents, you'll configure a search pipeline that processes search results at query time. The pipeline intercepts search results and passes them to the [`ml_inference` search response processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/ml-inference-search-response/), which invokes the cross-encoder model. The model generates scores used to rerank the matching documents [`by_field`]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-by-field/). + +## Prerequisite: Deploy a model on Amazon SageMaker + +Run the following code to deploy a model on Amazon SageMaker. For this example, you'll use the [`ms-marco-MiniLM-L-6-v2`](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) Hugging Face cross-encoder model hosted on Amazon SageMaker. We recommend using a GPU for better performance: + +```python +import sagemaker +import boto3 +from sagemaker.huggingface import HuggingFaceModel + +sess = sagemaker.Session() +role = sagemaker.get_execution_role() + +hub = { + 'HF_MODEL_ID':'cross-encoder/ms-marco-MiniLM-L-6-v2', + 'HF_TASK':'text-classification' +} +huggingface_model = HuggingFaceModel( + transformers_version='4.37.0', + pytorch_version='2.1.0', + py_version='py310', + env=hub, + role=role, +) +predictor = huggingface_model.deploy( + initial_instance_count=1, # number of instances + instance_type='ml.m5.xlarge' # ec2 instance type +) +``` +{% include copy.html %} + +After deploying the model, you can find the model endpoint by going to the Amazon SageMaker console in the AWS Management Console and selecting **Inference > Endpoints** on the left tab. Note the URL for the created model; you'll use it to create a connector. + +## Running a search with reranking + +To run a search with reranking, follow these steps: + +1. [Create a connector](#step-1-create-a-connector). +1. [Register the model](#step-2-register-the-model). +1. [Ingest documents into an index](#step-3-ingest-documents-into-an-index). +1. [Create a search pipeline](#step-4-create-a-search-pipeline). +1. [Search using reranking](#step-5-search-using-reranking). + +## Step 1: Create a connector + +Create a connector to the cross-encoder model by providing the model URL in the `actions.url` parameter: + +```json +POST /_plugins/_ml/connectors/_create +{ + "name": "SageMaker cross-encoder model", + "description": "Test connector for SageMaker cross-encoder hosted model", + "version": 1, + "protocol": "aws_sigv4", + "credential": { + "access_key": "", + "secret_key": "", + "session_token": "" + }, + "parameters": { + "region": "", + "service_name": "sagemaker" + }, + "actions": [ + { + "action_type": "predict", + "method": "POST", + "url": "", + "headers": { + "content-type": "application/json" + }, + "request_body": "{ \"inputs\": { \"text\": \"${parameters.text}\", \"text_pair\": \"${parameters.text_pair}\" }}" + } + ] +} +``` +{% include copy-curl.html %} + +Note the connector ID contained in the response; you'll use it in the following step. + +## Step 2: Register the model + +To register the model, provide the connector ID in the `connector_id` parameter: + +```json +POST /_plugins/_ml/models/_register +{ + "name": "Cross encoder model", + "version": "1.0.1", + "function_name": "remote", + "description": "Using a SageMaker endpoint to apply a cross encoder model", + "connector_id": "" +} +``` +{% include copy-curl.html %} + + +## Step 3: Ingest documents into an index + +Create an index and ingest sample documents containing facts about the New York City boroughs: + +```json +POST /nyc_areas/_bulk +{ "index": { "_id": 1 } } +{ "borough": "Queens", "area_name": "Astoria", "description": "Astoria is a neighborhood in the western part of Queens, New York City, known for its diverse community and vibrant cultural scene.", "population": 93000, "facts": "Astoria is home to many artists and has a large Greek-American community. The area also boasts some of the best Mediterranean food in NYC." } +{ "index": { "_id": 2 } } +{ "borough": "Queens", "area_name": "Flushing", "description": "Flushing is a neighborhood in the northern part of Queens, famous for its Asian-American population and bustling business district.", "population": 227000, "facts": "Flushing is one of the most ethnically diverse neighborhoods in NYC, with a large Chinese and Korean population. It is also home to the USTA Billie Jean King National Tennis Center." } +{ "index": { "_id": 3 } } +{ "borough": "Brooklyn", "area_name": "Williamsburg", "description": "Williamsburg is a trendy neighborhood in Brooklyn known for its hipster culture, vibrant art scene, and excellent restaurants.", "population": 150000, "facts": "Williamsburg is a hotspot for young professionals and artists. The neighborhood has seen rapid gentrification over the past two decades." } +{ "index": { "_id": 4 } } +{ "borough": "Manhattan", "area_name": "Harlem", "description": "Harlem is a historic neighborhood in Upper Manhattan, known for its significant African-American cultural heritage.", "population": 116000, "facts": "Harlem was the birthplace of the Harlem Renaissance, a cultural movement that celebrated Black culture through art, music, and literature." } +{ "index": { "_id": 5 } } +{ "borough": "The Bronx", "area_name": "Riverdale", "description": "Riverdale is a suburban-like neighborhood in the Bronx, known for its leafy streets and affluent residential areas.", "population": 48000, "facts": "Riverdale is one of the most affluent areas in the Bronx, with beautiful parks, historic homes, and excellent schools." } +{ "index": { "_id": 6 } } +{ "borough": "Staten Island", "area_name": "St. George", "description": "St. George is the main commercial and cultural center of Staten Island, offering stunning views of Lower Manhattan.", "population": 15000, "facts": "St. George is home to the Staten Island Ferry terminal and is a gateway to Staten Island, offering stunning views of the Statue of Liberty and Ellis Island." } +``` +{% include copy-curl.html %} + +## Step 4: Create a search pipeline + +Next, create a search pipeline for reranking. In the search pipeline configuration, the `input_map` and `output_map` define how the input data is prepared for the cross-encoder model and how the model's output is interpreted for reranking: + +- The `input_map` specifies which fields in the search documents and the query should be used as model inputs: + - The `text` field maps to the `facts` field in the indexed documents. It provides the document-specific content that the model will analyze. + - The `text_pair` field dynamically retrieves the search query text (`multi_match.query`) from the search request. + + The combination of `text` (document `facts`) and `text_pair` (search `query`) allows the cross-encoder model to compare the relevance of the document to the query, considering their semantic relationship. + +- The `output_map` field specifies how the output of the model is mapped to the fields in the response: + - The `rank_score` field in the response will store the model's relevance score, which will be used to perform reranking. + +When using the `by_field` rerank type, the `rank_score` field will contain the same score as the `_score` field. To remove the `rank_score` field from the search results, set `remove_target_field` to `true`. The original BM25 score, before reranking, is included for debugging purposes by setting `keep_previous_score` to `true`. This allows you to compare the original score with the reranked score to evaluate improvements in search relevance. + +To create the search pipeline, send the following request: + +```json +PUT /_search/pipeline/my_pipeline +{ + "response_processors": [ + { + "ml_inference": { + "tag": "ml_inference", + "description": "This processor runs ml inference during search response", + "model_id": "", + "function_name": "REMOTE", + "input_map": [ + { + "text": "facts", + "text_pair":"$._request.query.multi_match.query" + } + ], + "output_map": [ + { + "rank_score": "$.score" + } + ], + "full_response_path": false, + "model_config": {}, + "ignore_missing": false, + "ignore_failure": false, + "one_to_one": true + }, + + "rerank": { + "by_field": { + "target_field": "rank_score", + "remove_target_field": true, + "keep_previous_score" : true + } + } + + } + ] +} +``` +{% include copy-curl.html %} + +## Step 5: Search using reranking + +Use the following request to search indexed documents and rerank them using the cross-encoder model. The request retrieves documents containing any of the specified terms in the `description` or `facts` fields. These terms are then used to compare and rerank the matched documents: + +```json +POST /nyc_areas/_search?search_pipeline=my_pipeline +{ + "query": { + "multi_match": { + "query": "artists art creative community", + "fields": ["description", "facts"] + } + } +} +``` +{% include copy-curl.html %} + +In the response, the `previous_score` field contains the document's BM25 score, which it would have received if you hadn't applied the pipeline. Note that while BM25 ranked "Astoria" the highest, the cross-encoder model prioritized "Harlem" because it matched more search terms: + +```json +{ + "took": 4, + "timed_out": false, + "_shards": { + "total": 1, + "successful": 1, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 3, + "relation": "eq" + }, + "max_score": 0.03418137, + "hits": [ + { + "_index": "nyc_areas", + "_id": "4", + "_score": 0.03418137, + "_source": { + "area_name": "Harlem", + "description": "Harlem is a historic neighborhood in Upper Manhattan, known for its significant African-American cultural heritage.", + "previous_score": 1.6489418, + "borough": "Manhattan", + "facts": "Harlem was the birthplace of the Harlem Renaissance, a cultural movement that celebrated Black culture through art, music, and literature.", + "population": 116000 + } + }, + { + "_index": "nyc_areas", + "_id": "1", + "_score": 0.0090838, + "_source": { + "area_name": "Astoria", + "description": "Astoria is a neighborhood in the western part of Queens, New York City, known for its diverse community and vibrant cultural scene.", + "previous_score": 2.519608, + "borough": "Queens", + "facts": "Astoria is home to many artists and has a large Greek-American community. The area also boasts some of the best Mediterranean food in NYC.", + "population": 93000 + } + }, + { + "_index": "nyc_areas", + "_id": "3", + "_score": 0.0032599436, + "_source": { + "area_name": "Williamsburg", + "description": "Williamsburg is a trendy neighborhood in Brooklyn known for its hipster culture, vibrant art scene, and excellent restaurants.", + "previous_score": 1.5632852, + "borough": "Brooklyn", + "facts": "Williamsburg is a hotspot for young professionals and artists. The neighborhood has seen rapid gentrification over the past two decades.", + "population": 150000 + } + } + ] + }, + "profile": { + "shards": [] + } +} +``` + \ No newline at end of file diff --git a/_search-plugins/search-relevance/rerank-by-field.md b/_search-plugins/search-relevance/rerank-by-field.md index 9c7e7419e5..e6f65a4d25 100644 --- a/_search-plugins/search-relevance/rerank-by-field.md +++ b/_search-plugins/search-relevance/rerank-by-field.md @@ -116,7 +116,7 @@ POST /book-index/_search ``` {% include copy-curl.html %} -The response contains documents sorted in descending order based on the `reviews.starts` field. Each document contains the original query score in the `previous_score` field: +The response contains documents sorted in descending order based on the `reviews.stars` field. Each document contains the original query score in the `previous_score` field: ```json { @@ -205,4 +205,5 @@ The response contains documents sorted in descending order based on the `reviews ## Next steps -- Learn more about the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/). \ No newline at end of file +- Learn more about the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/). +- See a comprehensive example of [reranking by a field using an externally hosted cross-encoder model]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-by-field-cross-encoder/). \ No newline at end of file diff --git a/_search-plugins/search-relevance/rerank-cross-encoder.md b/_search-plugins/search-relevance/rerank-cross-encoder.md index 854908e69c..64f93c886c 100644 --- a/_search-plugins/search-relevance/rerank-cross-encoder.md +++ b/_search-plugins/search-relevance/rerank-cross-encoder.md @@ -118,4 +118,5 @@ Alternatively, you can provide the full path to the field containing the context ## Next steps -- Learn more about the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/). \ No newline at end of file +- Learn more about the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/). +- See a comprehensive example of [reranking by a field using an externally hosted cross-encoder model]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-by-field-cross-encoder/). \ No newline at end of file