From 22daf21c2d98ca7da63db18957270a094c1132b0 Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Tue, 13 Aug 2024 14:37:23 -0500 Subject: [PATCH 001/190] Add IP option to SAN certificate (#7972) * Add IP option to SAN certificate Signed-off-by: Archer * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Update _security/configuration/generate-certificates.md Co-authored-by: Nathan Bower Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Update _security/configuration/generate-certificates.md Co-authored-by: Nathan Bower Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --------- Signed-off-by: Archer Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Co-authored-by: Nathan Bower --- _security/configuration/generate-certificates.md | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/_security/configuration/generate-certificates.md b/_security/configuration/generate-certificates.md index 4e83ff83d1..2316fd33be 100755 --- a/_security/configuration/generate-certificates.md +++ b/_security/configuration/generate-certificates.md @@ -115,13 +115,21 @@ openssl req -new -key node1-key.pem -out node1.csr For all host and client certificates, you should specify a subject alternative name (SAN) to ensure compliance with [RFC 2818 (HTTP Over TLS)](https://datatracker.ietf.org/doc/html/rfc2818). The SAN should match the corresponding CN so that both refer to the same DNS A record. {: .note } -Before generating a signed certificate, create a SAN extension file which describes the DNS A record for the host: +Before generating a signed certificate, create a SAN extension file that describes the DNS A record for the host. 
If you're connecting to a host that only has an IP address, either IPv4 or IPv6, use the `IP` syntax: + +**No IP** ```bash echo 'subjectAltName=DNS:node1.dns.a-record' > node1.ext ``` -Generate the certificate: +**With IP** + +```bash +echo subjectAltName=IP:127.0.0.1 > node1.ext +``` + +With the DNS A record described, generate the certificate: ```bash openssl x509 -req -in node1.csr -CA root-ca.pem -CAkey root-ca-key.pem -CAcreateserial -sha256 -out node1.pem -days 730 -extfile node1.ext From ecd2232ac2b6c97dad9f5be5ab4e528a629c5a71 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Wed, 14 Aug 2024 04:54:00 +0800 Subject: [PATCH 002/190] Refactor of the neural sparse search tutorial (#7922) * refactor Signed-off-by: zhichao-aws * fix Signed-off-by: zhichao-aws * Doc review Signed-off-by: Fanit Kolchina * Link fix Signed-off-by: Fanit Kolchina * Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> --------- Signed-off-by: zhichao-aws Signed-off-by: Fanit Kolchina Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Co-authored-by: Fanit Kolchina Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Co-authored-by: Nathan Bower --- .../processors/sparse-encoding.md | 2 +- _ml-commons-plugin/pretrained-models.md | 8 +- _search-plugins/neural-sparse-search.md | 421 +-------------- .../neural-sparse-with-pipelines.md | 486 ++++++++++++++++++ .../neural-sparse-with-raw-vectors.md | 99 ++++ 5 files changed, 607 insertions(+), 409 deletions(-) create mode 100644 _search-plugins/neural-sparse-with-pipelines.md create mode 100644 _search-plugins/neural-sparse-with-raw-vectors.md diff --git a/_ingest-pipelines/processors/sparse-encoding.md b/_ingest-pipelines/processors/sparse-encoding.md index 38b44320b1..3af6f4e987 100644 --- a/_ingest-pipelines/processors/sparse-encoding.md +++ b/_ingest-pipelines/processors/sparse-encoding.md @@ -141,7 +141,7 @@ The response confirms that in addition to the `passage_text` field, the processo } ``` -Once you have created an ingest pipeline, you need to create an index for ingestion and ingest documents into the index. To learn more, see [Step 2: Create an index for ingestion]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/#step-2-create-an-index-for-ingestion) and [Step 3: Ingest documents into the index]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/#step-3-ingest-documents-into-the-index) of [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). +Once you have created an ingest pipeline, you need to create an index for ingestion and ingest documents into the index. To learn more, see [Create an index for ingestion]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-with-pipelines/#step-2b-create-an-index-for-ingestion) and [Step 3: Ingest documents into the index]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-with-pipelines/#step-2c-ingest-documents-into-the-index) of [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). 
--- diff --git a/_ml-commons-plugin/pretrained-models.md b/_ml-commons-plugin/pretrained-models.md index 30540cfe49..154b8b530f 100644 --- a/_ml-commons-plugin/pretrained-models.md +++ b/_ml-commons-plugin/pretrained-models.md @@ -46,11 +46,13 @@ The following table provides a list of sentence transformer models and artifact Sparse encoding models transfer text into a sparse vector and convert the vector to a list of `` pairs representing the text entry and its corresponding weight in the sparse vector. You can use these models for use cases such as clustering or sparse neural search. -We recommend the following models for optimal performance: +We recommend the following combinations for optimal performance: - Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` model during both ingestion and search. - Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` model during ingestion and the -`amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` model during search. +`amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` tokenizer during search. + +For more information about the preceding options for running neural sparse search, see [Generating sparse vector embeddings within OpenSearch]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-with-pipelines/). The following table provides a list of sparse encoding models and artifact links you can use to download them. @@ -58,7 +60,7 @@ The following table provides a list of sparse encoding models and artifact links |:---|:---|:---|:---|:---| | `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-v1-1.0.1-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `` pairs, where each entry corresponds to a non-zero element index. To experiment with this model using transformers and the PyTorch API, see the [HuggingFace documentation](https://huggingface.co/opensearch-project/opensearch-neural-sparse-encoding-v1). | | `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-doc-v1-1.0.1-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `` pairs, where each entry corresponds to a non-zero element index. To experiment with this model using transformers and the PyTorch API, see the [HuggingFace documentation](https://huggingface.co/opensearch-project/opensearch-neural-sparse-encoding-doc-v1). | -| `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-tokenizer-v1-1.0.1-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/config.json) | A neural sparse tokenizer model. The model tokenizes text into tokens and assigns each token a predefined weight, which is the token's inverse document frequency (IDF). If the IDF file is not provided, the weight defaults to 1. For more information, see [Preparing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#preparing-a-model). | +| `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-tokenizer-v1-1.0.1-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/config.json) | A neural sparse tokenizer. The tokenizer splits text into tokens and assigns each token a predefined weight, which is the token's inverse document frequency (IDF). If the IDF file is not provided, the weight defaults to 1. For more information, see [Preparing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#preparing-a-model). | ### Cross-encoder models **Introduced 2.12** diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md index 8aa2ff7dbf..0beee26ef0 100644 --- a/_search-plugins/neural-sparse-search.md +++ b/_search-plugins/neural-sparse-search.md @@ -2,7 +2,7 @@ layout: default title: Neural sparse search nav_order: 50 -has_children: false +has_children: true redirect_from: - /search-plugins/neural-sparse-search/ - /search-plugins/sparse-search/ @@ -14,261 +14,20 @@ Introduced 2.11 [Semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/) relies on dense retrieval that is based on text embedding models. However, dense methods use k-NN search, which consumes a large amount of memory and CPU resources. An alternative to semantic search, neural sparse search is implemented using an inverted index and is thus as efficient as BM25. Neural sparse search is facilitated by sparse embedding models. When you perform a neural sparse search, it creates a sparse vector (a list of `token: weight` key-value pairs representing an entry and its weight) and ingests data into a rank features index. -When selecting a model, choose one of the following options: +To further boost search relevance, you can combine neural sparse search with dense [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/) using a [hybrid query]({{site.url}}{{site.baseurl}}/query-dsl/compound/hybrid/). -- Use a sparse encoding model at both ingestion time and search time for better search relevance at the expense of relatively high latency. -- Use a sparse encoding model at ingestion time and a tokenizer at search time for lower search latency at the expense of relatively lower search relevance. Tokenization doesn't involve model inference, so you can deploy and invoke a tokenizer using the ML Commons Model API for a more streamlined experience. +You can configure neural sparse search in the following ways: -**PREREQUISITE**
-Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model). -{: .note} +- Generate vector embeddings within OpenSearch: Configure an ingest pipeline to generate and store sparse vector embeddings from document text at ingestion time. At query time, input plain text, which will be automatically converted into vector embeddings for search. For complete setup steps, see [Configuring ingest pipelines for neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-with-pipelines/). +- Ingest raw sparse vectors and search using sparse vectors directly. For complete setup steps, see [Ingesting and searching raw vectors]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-with-raw-vectors/). -## Using neural sparse search +To learn more about splitting long text into passages for neural search, see [Text chunking]({{site.url}}{{site.baseurl}}/search-plugins/text-chunking/). -To use neural sparse search, follow these steps: +## Accelerating neural sparse search -1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline). -1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion). -1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index). -1. [Search the index using neural search](#step-4-search-the-index-using-neural-sparse-search). -1. _Optional_ [Create and enable the two-phase processor](#step-5-create-and-enable-the-two-phase-processor-optional). +Starting with OpenSearch version 2.15, you can significantly accelerate the search process by creating a search pipeline with a `neural_sparse_two_phase_processor`. -## Step 1: Create an ingest pipeline - -To generate vector embeddings, you need to create an [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/) that contains a [`sparse_encoding` processor]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/sparse-encoding/), which will convert the text in a document field to vector embeddings. The processor's `field_map` determines the input fields from which to generate vector embeddings and the output fields in which to store the embeddings. - -The following example request creates an ingest pipeline where the text from `passage_text` will be converted into text embeddings and the embeddings will be stored in `passage_embedding`: - -```json -PUT /_ingest/pipeline/nlp-ingest-pipeline-sparse -{ - "description": "An sparse encoding ingest pipeline", - "processors": [ - { - "sparse_encoding": { - "model_id": "aP2Q8ooBpBj3wT4HVS8a", - "field_map": { - "passage_text": "passage_embedding" - } - } - } - ] -} -``` -{% include copy-curl.html %} - -To split long text into passages, use the `text_chunking` ingest processor before the `sparse_encoding` processor. For more information, see [Text chunking]({{site.url}}{{site.baseurl}}/search-plugins/text-chunking/). - - -## Step 2: Create an index for ingestion - -In order to use the text embedding processor defined in your pipeline, create a rank features index, adding the pipeline created in the previous step as the default pipeline. Ensure that the fields defined in the `field_map` are mapped as correct types. 
Continuing with the example, the `passage_embedding` field must be mapped as [`rank_features`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/rank/#rank-features). Similarly, the `passage_text` field should be mapped as `text`. - -The following example request creates a rank features index that is set up with a default ingest pipeline: - -```json -PUT /my-nlp-index -{ - "settings": { - "default_pipeline": "nlp-ingest-pipeline-sparse" - }, - "mappings": { - "properties": { - "id": { - "type": "text" - }, - "passage_embedding": { - "type": "rank_features" - }, - "passage_text": { - "type": "text" - } - } - } -} -``` -{% include copy-curl.html %} - -To save disk space, you can exclude the embedding vector from the source as follows: - -```json -PUT /my-nlp-index -{ - "settings": { - "default_pipeline": "nlp-ingest-pipeline-sparse" - }, - "mappings": { - "_source": { - "excludes": [ - "passage_embedding" - ] - }, - "properties": { - "id": { - "type": "text" - }, - "passage_embedding": { - "type": "rank_features" - }, - "passage_text": { - "type": "text" - } - } - } -} -``` -{% include copy-curl.html %} - -Once the `` pairs are excluded from the source, they cannot be recovered. Before applying this optimization, make sure you don't need the `` pairs for your application. -{: .important} - -## Step 3: Ingest documents into the index - -To ingest documents into the index created in the previous step, send the following requests: - -```json -PUT /my-nlp-index/_doc/1 -{ - "passage_text": "Hello world", - "id": "s1" -} -``` -{% include copy-curl.html %} - -```json -PUT /my-nlp-index/_doc/2 -{ - "passage_text": "Hi planet", - "id": "s2" -} -``` -{% include copy-curl.html %} - -Before the document is ingested into the index, the ingest pipeline runs the `sparse_encoding` processor on the document, generating vector embeddings for the `passage_text` field. The indexed document includes the `passage_text` field, which contains the original text, and the `passage_embedding` field, which contains the vector embeddings. - -## Step 4: Search the index using neural sparse search - -To perform a neural sparse search on your index, use the `neural_sparse` query clause in [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. - -The following example request uses a `neural_sparse` query to search for relevant documents using a raw text query: - -```json -GET my-nlp-index/_search -{ - "query": { - "neural_sparse": { - "passage_embedding": { - "query_text": "Hi world", - "model_id": "aP2Q8ooBpBj3wT4HVS8a" - } - } - } -} -``` -{% include copy-curl.html %} - -The response contains the matching documents: - -```json -{ - "took" : 688, - "timed_out" : false, - "_shards" : { - "total" : 1, - "successful" : 1, - "skipped" : 0, - "failed" : 0 - }, - "hits" : { - "total" : { - "value" : 2, - "relation" : "eq" - }, - "max_score" : 30.0029, - "hits" : [ - { - "_index" : "my-nlp-index", - "_id" : "1", - "_score" : 30.0029, - "_source" : { - "passage_text" : "Hello world", - "passage_embedding" : { - "!" 
: 0.8708904, - "door" : 0.8587369, - "hi" : 2.3929274, - "worlds" : 2.7839446, - "yes" : 0.75845814, - "##world" : 2.5432441, - "born" : 0.2682308, - "nothing" : 0.8625516, - "goodbye" : 0.17146169, - "greeting" : 0.96817183, - "birth" : 1.2788506, - "come" : 0.1623208, - "global" : 0.4371151, - "it" : 0.42951578, - "life" : 1.5750692, - "thanks" : 0.26481047, - "world" : 4.7300377, - "tiny" : 0.5462298, - "earth" : 2.6555297, - "universe" : 2.0308156, - "worldwide" : 1.3903781, - "hello" : 6.696973, - "so" : 0.20279501, - "?" : 0.67785245 - }, - "id" : "s1" - } - }, - { - "_index" : "my-nlp-index", - "_id" : "2", - "_score" : 16.480486, - "_source" : { - "passage_text" : "Hi planet", - "passage_embedding" : { - "hi" : 4.338913, - "planets" : 2.7755864, - "planet" : 5.0969057, - "mars" : 1.7405145, - "earth" : 2.6087382, - "hello" : 3.3210192 - }, - "id" : "s2" - } - } - ] - } -} -``` - -You can also use the `neural_sparse` query with sparse vector embeddings: -```json -GET my-nlp-index/_search -{ - "query": { - "neural_sparse": { - "passage_embedding": { - "query_tokens": { - "hi" : 4.338913, - "planets" : 2.7755864, - "planet" : 5.0969057, - "mars" : 1.7405145, - "earth" : 2.6087382, - "hello" : 3.3210192 - } - } - } - } -} -``` -## Step 5: Create and enable the two-phase processor (Optional) - - -The `neural_sparse_two_phase_processor` is a new feature introduced in OpenSearch 2.15. Using the two-phase processor can significantly improve the performance of neural sparse queries. - -To quickly launch a search pipeline with neural sparse search, use the following example pipeline: +To create a search pipeline with a two-phase processor for neural sparse search, use the following request: ```json PUT /_search/pipeline/two_phase_search_pipeline @@ -277,7 +36,7 @@ PUT /_search/pipeline/two_phase_search_pipeline { "neural_sparse_two_phase_processor": { "tag": "neural-sparse", - "description": "This processor is making two-phase processor." + "description": "Creates a two-phase processor for neural sparse search." } } ] @@ -286,166 +45,18 @@ PUT /_search/pipeline/two_phase_search_pipeline {% include copy-curl.html %} Then choose the index you want to configure with the search pipeline and set the `index.search.default_pipeline` to the pipeline name, as shown in the following example: -```json -PUT /index-name/_settings -{ - "index.search.default_pipeline" : "two_phase_search_pipeline" -} -``` -{% include copy-curl.html %} - - - -## Setting a default model on an index or field - -A [`neural_sparse`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural-sparse/) query requires a model ID for generating sparse embeddings. To eliminate passing the model ID with each neural_sparse query request, you can set a default model on index-level or field-level. - -First, create a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) with a [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) request processor. To set a default model for an index, provide the model ID in the `default_model_id` parameter. To set a default model for a specific field, provide the field name and the corresponding model ID in the `neural_field_default_id` map. 
If you provide both `default_model_id` and `neural_field_default_id`, `neural_field_default_id` takes precedence: - -```json -PUT /_search/pipeline/default_model_pipeline -{ - "request_processors": [ - { - "neural_query_enricher" : { - "default_model_id": "bQ1J8ooBpBj3wT4HVUsb", - "neural_field_default_id": { - "my_field_1": "uZj0qYoBMtvQlfhaYeud", - "my_field_2": "upj0qYoBMtvQlfhaZOuM" - } - } - } - ] -} -``` -{% include copy-curl.html %} - -Then set the default model for your index: - -```json -PUT /my-nlp-index/_settings -{ - "index.search.default_pipeline" : "default_model_pipeline" -} -``` -{% include copy-curl.html %} - -You can now omit the model ID when searching: ```json -GET /my-nlp-index/_search +PUT /my-nlp-index/_settings { - "query": { - "neural_sparse": { - "passage_embedding": { - "query_text": "Hi world" - } - } - } + "index.search.default_pipeline" : "two_phase_search_pipeline" } ``` {% include copy-curl.html %} -The response contains both documents: - -```json -{ - "took" : 688, - "timed_out" : false, - "_shards" : { - "total" : 1, - "successful" : 1, - "skipped" : 0, - "failed" : 0 - }, - "hits" : { - "total" : { - "value" : 2, - "relation" : "eq" - }, - "max_score" : 30.0029, - "hits" : [ - { - "_index" : "my-nlp-index", - "_id" : "1", - "_score" : 30.0029, - "_source" : { - "passage_text" : "Hello world", - "passage_embedding" : { - "!" : 0.8708904, - "door" : 0.8587369, - "hi" : 2.3929274, - "worlds" : 2.7839446, - "yes" : 0.75845814, - "##world" : 2.5432441, - "born" : 0.2682308, - "nothing" : 0.8625516, - "goodbye" : 0.17146169, - "greeting" : 0.96817183, - "birth" : 1.2788506, - "come" : 0.1623208, - "global" : 0.4371151, - "it" : 0.42951578, - "life" : 1.5750692, - "thanks" : 0.26481047, - "world" : 4.7300377, - "tiny" : 0.5462298, - "earth" : 2.6555297, - "universe" : 2.0308156, - "worldwide" : 1.3903781, - "hello" : 6.696973, - "so" : 0.20279501, - "?" : 0.67785245 - }, - "id" : "s1" - } - }, - { - "_index" : "my-nlp-index", - "_id" : "2", - "_score" : 16.480486, - "_source" : { - "passage_text" : "Hi planet", - "passage_embedding" : { - "hi" : 4.338913, - "planets" : 2.7755864, - "planet" : 5.0969057, - "mars" : 1.7405145, - "earth" : 2.6087382, - "hello" : 3.3210192 - }, - "id" : "s2" - } - } - ] - } -} -``` - -## Next steps - -- To learn more about splitting long text into passages for neural search, see [Text chunking]({{site.url}}{{site.baseurl}}/search-plugins/text-chunking/). - -## FAQ - -Refer to the following frequently asked questions for more information about neural sparse search. - -### How do I mitigate remote connector throttling exceptions? - -When using connectors to call a remote service like SageMaker, ingestion and search calls sometimes fail due to remote connector throttling exceptions. - -To mitigate throttling exceptions, modify the connector's [`client_config`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/blueprints/#configuration-parameters) parameter to decrease the number of maximum connections, using the `max_connection` setting to prevent the maximum number of concurrent connections from exceeding the threshold of the remote service. You can also modify the retry settings to flatten the request spike during ingestion. 
- -For versions earlier than OpenSearch 2.15, the SageMaker throttling exception will be thrown as the following "error": - -``` - { - "type": "status_exception", - "reason": "Error from remote service: {\"message\":null}" - } -``` - +For information about `two_phase_search_pipeline`, see [Neural sparse query two-phase processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-sparse-query-two-phase-processor/). -## Next steps +## Further reading -- To learn more about splitting long text into passages for neural search, see [Text chunking]({{site.url}}{{site.baseurl}}/search-plugins/text-chunking/). +- Learn more about how sparse encoding models work and explore OpenSearch neural sparse search benchmarks in [Improving document retrieval with sparse semantic encoders](https://opensearch.org/blog/improving-document-retrieval-with-sparse-semantic-encoders/). +- Learn the fundamentals of neural sparse search and its efficiency in [A deep dive into faster semantic sparse retrieval in OpenSearch 2.12](https://opensearch.org/blog/A-deep-dive-into-faster-semantic-sparse-retrieval-in-OS-2.12/). diff --git a/_search-plugins/neural-sparse-with-pipelines.md b/_search-plugins/neural-sparse-with-pipelines.md new file mode 100644 index 0000000000..fea2f0d795 --- /dev/null +++ b/_search-plugins/neural-sparse-with-pipelines.md @@ -0,0 +1,486 @@ +--- +layout: default +title: Configuring ingest pipelines +parent: Neural sparse search +nav_order: 10 +has_children: false +--- + +# Configuring ingest pipelines for neural sparse search + +Generating sparse vector embeddings within OpenSearch enables neural sparse search to function like lexical search. To take advantage of this encapsulation, set up an ingest pipeline to create and store sparse vector embeddings from document text during ingestion. At query time, input plain text, which will be automatically converted into vector embeddings for search. + +For this tutorial, you'll use neural sparse search with OpenSearch's built-in machine learning (ML) model hosting and ingest pipelines. Because the transformation of text to embeddings is performed within OpenSearch, you'll use text when ingesting and searching documents. + +At ingestion time, neural sparse search uses a sparse encoding model to generate sparse vector embeddings from text fields. + +At query time, neural sparse search operates in one of two search modes: + +- **Bi-encoder mode** (requires a sparse encoding model): A sparse encoding model generates sparse vector embeddings from query text. This approach provides better search relevance at the cost of a slight increase in latency. + +- **Doc-only mode** (requires a sparse encoding model and a tokenizer): A sparse encoding model generates sparse vector embeddings from query text. In this mode, neural sparse search tokenizes query text using a tokenizer and obtains the token weights from a lookup table. This approach provides faster retrieval at the cost of a slight decrease in search relevance. The tokenizer is deployed and invoked using the [Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/) for a uniform neural sparse search experience. + +For more information about choosing the neural sparse search mode that best suits your workload, see [Choose the search mode](#step-1a-choose-the-search-mode). + +## Tutorial + +This tutorial consists of the following steps: + +1. [**Configure a sparse encoding model/tokenizer**](#step-1-configure-a-sparse-encoding-modeltokenizer). + 1. 
[Choose the search mode](#step-1a-choose-the-search-mode) + 1. [Register the model/tokenizer](#step-1b-register-the-modeltokenizer) + 1. [Deploy the model/tokenizer](#step-1c-deploy-the-modeltokenizer) +1. [**Ingest data**](#step-2-ingest-data) + 1. [Create an ingest pipeline](#step-2a-create-an-ingest-pipeline) + 1. [Create an index for ingestion](#step-2b-create-an-index-for-ingestion) + 1. [Ingest documents into the index](#step-2c-ingest-documents-into-the-index) +1. [**Search the data**](#step-3-search-the-data) + +### Prerequisites + +Before you start, complete the [prerequisites]({{site.url}}{{site.baseurl}}/search-plugins/neural-search-tutorial/#prerequisites) for neural search. + +## Step 1: Configure a sparse encoding model/tokenizer + +Both the bi-encoder and doc-only search modes require you to configure a sparse encoding model. Doc-only mode requires you to configure a tokenizer in addition to the model. + +### Step 1(a): Choose the search mode + +Choose the search mode and the appropriate model/tokenizer combination: + +- **Bi-encoder**: Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` model during both ingestion and search. + +- **Doc-only**: Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` model during ingestion and the `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` tokenizer during search. + +The following table provides a search relevance comparison for the two search modes so that you can choose the best mode for your use case. + +| Mode | Ingestion model | Search model | Avg search relevance on BEIR | Model parameters | +|-----------|---------------------------------------------------------------|---------------------------------------------------------------|------------------------------|------------------| +| Doc-only | `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` | `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 0.49 | 133M | +| Bi-encoder| `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | 0.524 | 133M | + +### Step 1(b): Register the model/tokenizer + +When you register a model/tokenizer, OpenSearch creates a model group for the model/tokenizer. You can also explicitly create a model group before registering models. For more information, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/). + +#### Bi-encoder mode + +When using bi-encoder mode, you only need to register the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` model. + +Register the sparse encoding model: + +```json +POST /_plugins/_ml/models/_register?deploy=true +{ + "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1", + "version": "1.0.1", + "model_format": "TORCH_SCRIPT" +} +``` +{% include copy-curl.html %} + +Registering a model is an asynchronous task. 
OpenSearch returns a task ID for every model you register: + +```json +{ + "task_id": "aFeif4oB5Vm0Tdw8yoN7", + "status": "CREATED" +} +``` + +You can check the status of the task by calling the Tasks API: + +```json +GET /_plugins/_ml/tasks/aFeif4oB5Vm0Tdw8yoN7 +``` +{% include copy-curl.html %} + +Once the task is complete, the task state will change to `COMPLETED` and the Tasks API response will contain the model ID of the registered model: + +```json +{ + "model_id": "", + "task_type": "REGISTER_MODEL", + "function_name": "SPARSE_ENCODING", + "state": "COMPLETED", + "worker_node": [ + "4p6FVOmJRtu3wehDD74hzQ" + ], + "create_time": 1694358489722, + "last_update_time": 1694358499139, + "is_async": true +} +``` + +Note the `model_id` of the model you've created; you'll need it for the following steps. + +#### Doc-only mode + +When using doc-only mode, you need to register the `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` model, which you'll use at ingestion time, and the `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` tokenizer, which you'll use at search time. + +Register the sparse encoding model: + +```json +POST /_plugins/_ml/models/_register?deploy=true +{ + "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1", + "version": "1.0.1", + "model_format": "TORCH_SCRIPT" +} +``` +{% include copy-curl.html %} + +Register the tokenizer: + +```json +POST /_plugins/_ml/models/_register?deploy=true +{ + "name": "amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1", + "version": "1.0.1", + "model_format": "TORCH_SCRIPT" +} +``` +{% include copy-curl.html %} + +Like in the bi-encoder mode, use the Tasks API to check the status of the registration task. After the Tasks API returns the task state as `COMPLETED`. Note the `model_id` of the model and the tokenizer you've created; you'll need them for the following steps. + +### Step 1(c): Deploy the model/tokenizer + +Next, you'll need to deploy the model/tokenizer you registered. Deploying a model creates a model instance and caches the model in memory. + +#### Bi-encoder mode + +To deploy the model, provide its model ID to the `_deploy` endpoint: + +```json +POST /_plugins/_ml/models//_deploy +``` +{% include copy-curl.html %} + +As with the register operation, the deploy operation is asynchronous, so you'll get a task ID in the response: + +```json +{ + "task_id": "ale6f4oB5Vm0Tdw8NINO", + "status": "CREATED" +} +``` + +You can check the status of the task by using the Tasks API: + +```json +GET /_plugins/_ml/tasks/ale6f4oB5Vm0Tdw8NINO +``` +{% include copy-curl.html %} + +Once the task is complete, the task state will change to `COMPLETED`: + +```json +{ + "model_id": "", + "task_type": "DEPLOY_MODEL", + "function_name": "SPARSE_ENCODING", + "state": "COMPLETED", + "worker_node": [ + "4p6FVOmJRtu3wehDD74hzQ" + ], + "create_time": 1694360024141, + "last_update_time": 1694360027940, + "is_async": true +} +``` + +#### Doc-only mode + +To deploy the model, provide its model ID to the `_deploy` endpoint: + +```json +POST /_plugins/_ml/models//_deploy +``` +{% include copy-curl.html %} + +You can deploy the tokenizer in the same way: + +```json +POST /_plugins/_ml/models//_deploy +``` +{% include copy-curl.html %} + +As with bi-encoder mode, you can check the status of both deploy tasks by using the Tasks API. Once the task is complete, the task state will change to `COMPLETED`. 
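To confirm that the model (and, in doc-only mode, the tokenizer) is ready to use, you can optionally call the Get Model API with the corresponding ID. This is a minimal optional check; `<model or tokenizer ID>` is a placeholder, not a real value, and should be replaced with the ID returned during registration. In the response, the `model_state` field should read `DEPLOYED`:

```json
GET /_plugins/_ml/models/<model or tokenizer ID>
```
{% include copy-curl.html %}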
+ +## Step 2: Ingest data + +In both the bi-encoder and doc-only modes, you'll use a sparse encoding model at ingestion time to generate sparse vector embeddings. + +### Step 2(a): Create an ingest pipeline + +To generate sparse vector embeddings, you need to create an [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/) that contains a [`sparse_encoding` processor]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/sparse-encoding/), which will convert the text in a document field to vector embeddings. The processor's `field_map` determines the input fields from which to generate vector embeddings and the output fields in which to store the embeddings. + +The following example request creates an ingest pipeline where the text from `passage_text` will be converted into sparse vector embeddings, which will be stored in `passage_embedding`. Provide the model ID of the registered model in the request: + +```json +PUT /_ingest/pipeline/nlp-ingest-pipeline-sparse +{ + "description": "An sparse encoding ingest pipeline", + "processors": [ + { + "sparse_encoding": { + "model_id": "", + "field_map": { + "passage_text": "passage_embedding" + } + } + } + ] +} +``` +{% include copy-curl.html %} + +To split long text into passages, use the `text_chunking` ingest processor before the `sparse_encoding` processor. For more information, see [Text chunking]({{site.url}}{{site.baseurl}}/search-plugins/text-chunking/). + +### Step 2(b): Create an index for ingestion + +In order to use the sparse encoding processor defined in your pipeline, create a rank features index, adding the pipeline created in the previous step as the default pipeline. Ensure that the fields defined in the `field_map` are mapped as correct types. Continuing with the example, the `passage_embedding` field must be mapped as [`rank_features`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/rank/#rank-features). Similarly, the `passage_text` field must be mapped as `text`. + +The following example request creates a rank features index configured with a default ingest pipeline: + +```json +PUT /my-nlp-index +{ + "settings": { + "default_pipeline": "nlp-ingest-pipeline-sparse" + }, + "mappings": { + "properties": { + "id": { + "type": "text" + }, + "passage_embedding": { + "type": "rank_features" + }, + "passage_text": { + "type": "text" + } + } + } +} +``` +{% include copy-curl.html %} + +To save disk space, you can exclude the embedding vector from the source as follows: + +```json +PUT /my-nlp-index +{ + "settings": { + "default_pipeline": "nlp-ingest-pipeline-sparse" + }, + "mappings": { + "_source": { + "excludes": [ + "passage_embedding" + ] + }, + "properties": { + "id": { + "type": "text" + }, + "passage_embedding": { + "type": "rank_features" + }, + "passage_text": { + "type": "text" + } + } + } +} +``` +{% include copy-curl.html %} + +Once the `` pairs are excluded from the source, they cannot be recovered. Before applying this optimization, make sure you don't need the `` pairs for your application. 
+{: .important} + +### Step 2(c): Ingest documents into the index + +To ingest documents into the index created in the previous step, send the following requests: + +```json +PUT /my-nlp-index/_doc/1 +{ + "passage_text": "Hello world", + "id": "s1" +} +``` +{% include copy-curl.html %} + +```json +PUT /my-nlp-index/_doc/2 +{ + "passage_text": "Hi planet", + "id": "s2" +} +``` +{% include copy-curl.html %} + +Before the document is ingested into the index, the ingest pipeline runs the `sparse_encoding` processor on the document, generating vector embeddings for the `passage_text` field. The indexed document includes the `passage_text` field, which contains the original text, and the `passage_embedding` field, which contains the vector embeddings. + +## Step 3: Search the data + +To perform a neural sparse search on your index, use the `neural_sparse` query clause in [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. + +The following example request uses a `neural_sparse` query to search for relevant documents using a raw text query. Provide the model ID for bi-encoder mode or the tokenizer ID for doc-only mode: + +```json +GET my-nlp-index/_search +{ + "query": { + "neural_sparse": { + "passage_embedding": { + "query_text": "Hi world", + "model_id": "" + } + } + } +} +``` +{% include copy-curl.html %} + +The response contains the matching documents: + +```json +{ + "took" : 688, + "timed_out" : false, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + }, + "hits" : { + "total" : { + "value" : 2, + "relation" : "eq" + }, + "max_score" : 30.0029, + "hits" : [ + { + "_index" : "my-nlp-index", + "_id" : "1", + "_score" : 30.0029, + "_source" : { + "passage_text" : "Hello world", + "passage_embedding" : { + "!" : 0.8708904, + "door" : 0.8587369, + "hi" : 2.3929274, + "worlds" : 2.7839446, + "yes" : 0.75845814, + "##world" : 2.5432441, + "born" : 0.2682308, + "nothing" : 0.8625516, + "goodbye" : 0.17146169, + "greeting" : 0.96817183, + "birth" : 1.2788506, + "come" : 0.1623208, + "global" : 0.4371151, + "it" : 0.42951578, + "life" : 1.5750692, + "thanks" : 0.26481047, + "world" : 4.7300377, + "tiny" : 0.5462298, + "earth" : 2.6555297, + "universe" : 2.0308156, + "worldwide" : 1.3903781, + "hello" : 6.696973, + "so" : 0.20279501, + "?" : 0.67785245 + }, + "id" : "s1" + } + }, + { + "_index" : "my-nlp-index", + "_id" : "2", + "_score" : 16.480486, + "_source" : { + "passage_text" : "Hi planet", + "passage_embedding" : { + "hi" : 4.338913, + "planets" : 2.7755864, + "planet" : 5.0969057, + "mars" : 1.7405145, + "earth" : 2.6087382, + "hello" : 3.3210192 + }, + "id" : "s2" + } + } + ] + } +} +``` + +## Accelerating neural sparse search + +To learn more about improving retrieval time for neural sparse search, see [Accelerating neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/#accelerating-neural-sparse-search). + +## Creating a search pipeline for neural sparse search + +You can create a search pipeline that augments neural sparse search functionality by: + +- Accelerating neural sparse search for faster retrieval. +- Setting the default model ID on an index for easier use. + +To configure the pipeline, add a [`neural_sparse_two_phase_processor`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-sparse-query-two-phase-processor/) or a [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) processor. 
The following request creates a pipeline with both processors: + +```json +PUT /_search/pipeline/neural_search_pipeline +{ + "request_processors": [ + { + "neural_sparse_two_phase_processor": { + "tag": "neural-sparse", + "description": "Creates a two-phase processor for neural sparse search." + } + }, + { + "neural_query_enricher" : { + "default_model_id": "" + } + } + ] +} +``` +{% include copy-curl.html %} + +Then set the default pipeline for your index to the newly created search pipeline: + +```json +PUT /my-nlp-index/_settings +{ + "index.search.default_pipeline" : "neural_search_pipeline" +} +``` +{% include copy-curl.html %} + +For more information about setting a default model on an index, or to learn how to set a default model on a specific field, see [Setting a default model on an index or field]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/#setting-a-default-model-on-an-index-or-field). + +## Troubleshooting + +This section contains information about resolving common issues encountered while running neural sparse search. + +### Remote connector throttling exceptions + +When using connectors to call a remote service such as Amazon SageMaker, ingestion and search calls sometimes fail because of remote connector throttling exceptions. + +For OpenSearch versions earlier than 2.15, a throttling exception will be returned as an error from the remote service: + +```json +{ + "type": "status_exception", + "reason": "Error from remote service: {\"message\":null}" +} +``` + +To mitigate throttling exceptions, decrease the maximum number of connections specified in the `max_connection` setting in the connector's [`client_config`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/blueprints/#configuration-parameters) object. Doing so will prevent the maximum number of concurrent connections from exceeding the threshold of the remote service. You can also modify the retry settings to avoid a request spike during ingestion. \ No newline at end of file diff --git a/_search-plugins/neural-sparse-with-raw-vectors.md b/_search-plugins/neural-sparse-with-raw-vectors.md new file mode 100644 index 0000000000..d69a789a1d --- /dev/null +++ b/_search-plugins/neural-sparse-with-raw-vectors.md @@ -0,0 +1,99 @@ +--- +layout: default +title: Using raw vectors +parent: Neural sparse search +nav_order: 20 +has_children: false +--- + +# Using raw vectors for neural sparse search + +If you're using self-hosted sparse embedding models, you can ingest raw sparse vectors and use neural sparse search. + +## Tutorial + +This tutorial consists of the following steps: + +1. [**Ingest sparse vectors**](#step-1-ingest-sparse-vectors) + 1. [Create an index](#step-1a-create-an-index) + 1. [Ingest documents into the index](#step-1b-ingest-documents-into-the-index) +1. [**Search the data using raw sparse vector**](#step-2-search-the-data-using-a-sparse-vector). + + +## Step 1: Ingest sparse vectors + +Once you have generated sparse vector embeddings, you can directly ingest them into OpenSearch. 
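Note that a raw sparse vector is simply a JSON object that maps tokens to weights. The following is an illustrative example of such an embedding; the tokens and weights are sample values reused from this tutorial, not output from any particular model:

```json
{
  "hi": 4.338913,
  "planet": 5.0969057,
  "earth": 2.6087382
}
```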
+ +### Step 1(a): Create an index + +In order to ingest documents containing raw sparse vectors, create a rank features index: + +```json +PUT /my-nlp-index +{ + "mappings": { + "properties": { + "id": { + "type": "text" + }, + "passage_embedding": { + "type": "rank_features" + }, + "passage_text": { + "type": "text" + } + } + } +} +``` +{% include copy-curl.html %} + +### Step 1(b): Ingest documents into the index + +To ingest documents into the index created in the previous step, send the following request: + +```json +PUT /my-nlp-index/_doc/1 +{ + "passage_text": "Hello world", + "id": "s1", + "passage_embedding": { + "hi" : 4.338913, + "planets" : 2.7755864, + "planet" : 5.0969057, + "mars" : 1.7405145, + "earth" : 2.6087382, + "hello" : 3.3210192 + } +} +``` +{% include copy-curl.html %} + +## Step 2: Search the data using a sparse vector + +To search the documents using a sparse vector, provide the sparse embeddings in the `neural_sparse` query: + +```json +GET my-nlp-index/_search +{ + "query": { + "neural_sparse": { + "passage_embedding": { + "query_tokens": { + "hi" : 4.338913, + "planets" : 2.7755864, + "planet" : 5.0969057, + "mars" : 1.7405145, + "earth" : 2.6087382, + "hello" : 3.3210192 + } + } + } + } +} +``` +{% include copy-curl.html %} + +## Accelerating neural sparse search + +To learn more about improving retrieval time for neural sparse search, see [Accelerating neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/#accelerating-neural-sparse-search). From 61396f73690bfb6bc8694cbe7cc38135e41ca01f Mon Sep 17 00:00:00 2001 From: Daniel Widdis Date: Thu, 15 Aug 2024 06:39:39 -0700 Subject: [PATCH 003/190] Add missing link to Get model group API (#7992) Signed-off-by: Daniel Widdis --- _ml-commons-plugin/api/model-group-apis/index.md | 1 + 1 file changed, 1 insertion(+) diff --git a/_ml-commons-plugin/api/model-group-apis/index.md b/_ml-commons-plugin/api/model-group-apis/index.md index 6df8b3e8fe..85dabf3c3b 100644 --- a/_ml-commons-plugin/api/model-group-apis/index.md +++ b/_ml-commons-plugin/api/model-group-apis/index.md @@ -13,5 +13,6 @@ ML Commons supports the following model-group-level APIs: - [Register model group]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-group-apis/register-model-group/) - [Update model group]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-group-apis/update-model-group/) +- [Get model group]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-group-apis/get-model-group/) - [Search model group]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-group-apis/search-model-group/) - [Delete model group]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-group-apis/delete-model-group/) \ No newline at end of file From d9d19fcec6da1d0922c96350214fec1215be195f Mon Sep 17 00:00:00 2001 From: Pawel Wlodarczyk Date: Thu, 15 Aug 2024 14:40:29 +0100 Subject: [PATCH 004/190] Update rolling-upgrade.md (#7993) Signed-off-by: Pawel Wlodarczyk --- _install-and-configure/upgrade-opensearch/rolling-upgrade.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_install-and-configure/upgrade-opensearch/rolling-upgrade.md b/_install-and-configure/upgrade-opensearch/rolling-upgrade.md index f6b0470b66..1e4145e7ba 100644 --- a/_install-and-configure/upgrade-opensearch/rolling-upgrade.md +++ b/_install-and-configure/upgrade-opensearch/rolling-upgrade.md @@ -181,7 +181,7 @@ Review [Upgrading OpenSearch]({{site.url}}{{site.baseurl}}/upgrade-opensearch/in 
"active_shards_percent_as_number" : 100.0 } ``` -1. Repeat steps 5 through 11 for each node in your cluster. Remember to upgrade an eligible cluster manager node last. After replacing the last node, query the `_cat/nodes` endpoint to confirm that all nodes have joined the cluster. The cluster is now bootstrapped to the new version of OpenSearch. You can verify the cluster version by querying the `_cat/nodes` API endpoint: +1. Repeat steps 2 through 11 for each node in your cluster. Remember to upgrade an eligible cluster manager node last. After replacing the last node, query the `_cat/nodes` endpoint to confirm that all nodes have joined the cluster. The cluster is now bootstrapped to the new version of OpenSearch. You can verify the cluster version by querying the `_cat/nodes` API endpoint: ```bash GET "/_cat/nodes?v&h=name,version,node.role,master" | column -t ``` From e5c6395507d12f0c3ec8ce681f9be5e5d9c0ecd0 Mon Sep 17 00:00:00 2001 From: Jun Ohtani Date: Thu, 15 Aug 2024 22:41:00 +0900 Subject: [PATCH 005/190] Fix typo in the logs index mappings (#7986) Signed-off-by: Jun Ohtani --- _field-types/supported-field-types/derived.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_field-types/supported-field-types/derived.md b/_field-types/supported-field-types/derived.md index 2ca00927d1..d989c3e4a4 100644 --- a/_field-types/supported-field-types/derived.md +++ b/_field-types/supported-field-types/derived.md @@ -69,7 +69,7 @@ PUT logs } } }, - "client_ip": { + "clientip": { "type": "keyword" } } From e3396512a913948dbbc8c6d525cd407bd3abb37a Mon Sep 17 00:00:00 2001 From: AntonEliatra Date: Thu, 15 Aug 2024 14:50:15 +0100 Subject: [PATCH 006/190] Adding search shard routing docs (#7656) * Adding documentation for search-shard-routing #7507 Signed-off-by: AntonEliatra * Adding documentation for search-shard-routing #7507 Signed-off-by: AntonEliatra * Update search-shard-routing.md Signed-off-by: AntonEliatra * fixing typo Signed-off-by: AntonEliatra * updating details as per comments Signed-off-by: AntonEliatra * Update search-shard-routing.md Signed-off-by: AntonEliatra * Apply suggestions from code review Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: AntonEliatra * moving search shard routing to a new location and updating as per PR comments Signed-off-by: Anton Rubin * adding link to configuring static and dymanic settings Signed-off-by: Anton Rubin * Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: AntonEliatra * Update search-shard-routing.md Signed-off-by: AntonEliatra * Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: AntonEliatra --------- Signed-off-by: AntonEliatra Signed-off-by: Anton Rubin Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Co-authored-by: Nathan Bower --- .../cluster-settings.md | 2 + .../searching-data/search-shard-routing.md | 212 ++++++++++++++++++ 2 files changed, 214 insertions(+) create mode 100644 _search-plugins/searching-data/search-shard-routing.md diff --git a/_install-and-configure/configuring-opensearch/cluster-settings.md b/_install-and-configure/configuring-opensearch/cluster-settings.md index 1bda6db262..9af0f5c5b1 100644 --- a/_install-and-configure/configuring-opensearch/cluster-settings.md +++ b/_install-and-configure/configuring-opensearch/cluster-settings.md @@ -106,6 +106,8 @@ OpenSearch supports the following cluster-level routing and shard allocation set OpenSearch supports the following 
cluster-level shard, block, and task settings: +- `action.search.shard_count.limit` (Integer): Limits the maximum number of shards to be hit during search. Requests that exceed this limit will be rejected. + - `cluster.blocks.read_only` (Boolean): Sets the entire cluster to read-only. Default is `false`. - `cluster.blocks.read_only_allow_delete` (Boolean): Similar to `cluster.blocks.read_only`, but allows you to delete indexes. diff --git a/_search-plugins/searching-data/search-shard-routing.md b/_search-plugins/searching-data/search-shard-routing.md new file mode 100644 index 0000000000..77c5fc7ce4 --- /dev/null +++ b/_search-plugins/searching-data/search-shard-routing.md @@ -0,0 +1,212 @@ +--- +layout: default +parent: Searching data +title: Search shard routing +nav_order: 70 +--- + +# Search shard routing + +To ensure redundancy and improve search performance, OpenSearch distributes index data across multiple primary shards, with each primary shard having one or more replica shards. When a search query is executed, OpenSearch routes the request to a node containing either a primary or replica index shard. This technique is known as _search shard routing_. + + +## Adaptive replica selection + +In order to improve latency, search requests are routed using _adaptive replica selection_, which chooses the nodes based on the following factors: + +- The amount of time it took a particular node to run previous requests. +- The latency between the coordinating node and the selected node. +- The queue size of the node's search thread pool. + +If you have permissions to call the OpenSearch REST APIs, you can turn off search shard routing. For more information about REST API user access, see [REST management API settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/security-settings/#rest-management-api-settings). To disable search shard routing, update the cluster settings as follows: + +```json +PUT /_cluster/settings +{ + "persistent": { + "cluster.routing.use_adaptive_replica_selection": false + } +} +``` +{% include copy-curl.html %} + +If you turn off search shard routing, OpenSearch will use round-robin routing, which can negatively impact search latency. +{: .note} + +## Node and shard selection during searches + +OpenSearch uses all nodes to choose the best routing for search requests. However, in some cases you may want to manually select the nodes or shards to which the search request is sent, including the following: + +- Using cached previous searches. +- Dedicating specific hardware to searches. +- Using only local nodes for searches. + +You can use the `preference` parameter in the search query to indicate the search destination. The following is a complete list of available options: + +1. `_primary`: Forces the search to execute only on primary shards. + + ```json + GET /my-index/_search?preference=_primary + ``` + {% include copy-curl.html %} + +2. `_primary_first`: Prefers primary shards but will use replica shards if the primary shards are not available. + + ```json + GET /my-index/_search?preference=_primary_first + ``` + {% include copy-curl.html %} + +3. `_replica`: Forces the search to execute only on replica shards. + + ```json + GET /my-index/_search?preference=_replica + ``` + {% include copy-curl.html %} + +4. `_replica_first`: Prefers replica shards but will use primary shards if no replica shards are available. + + ```json + GET /my-index/_search?preference=_replica_first + ``` + {% include copy-curl.html %} + +5. 
`_only_nodes:,`: Limits the search to execute only on specific nodes according to their IDs. + + ```json + GET /my-index/_search?preference=_only_nodes:node-1,node-2 + ``` + {% include copy-curl.html %} + +6. `_prefer_nodes:,`: Prefers to execute the search on specific nodes but will use other nodes if the preferred nodes are not available. + + ```json + GET /my-index/_search?preference=_prefer_nodes:node-1,node-2 + ``` + {% include copy-curl.html %} + +7. `_shards:,`: Limits the search to specific shards. + + ```json + GET /my-index/_search?preference=_shards:0,1 + ``` + {% include copy-curl.html %} + +8. `_local`: Executes the search on the local node if possible, which can reduce latency. + + ```json + GET /my-index/_search?preference=_local + ``` + {% include copy-curl.html %} + +9. Custom string: You can use any custom string as the preference value. This custom string ensures that requests containing the same string are routed to the same shards consistently, which can be useful for caching. + + ```json + GET /my-index/_search?preference=custom_string + ``` + {% include copy-curl.html %} + +## Custom routing during index and search + +You can specify routing during both indexing and search operations. + +### Routing during indexing +When you index a document, OpenSearch calculates a hash of the routing value and uses this hash to determine the shard on which the document will be stored. If you don't specify a routing value, OpenSearch uses the document ID to calculate the hash. + +The following is an example index operation with a routing value: + +```json +POST /index1/_doc/1?routing=user1 +{ + "name": "John Doe", + "age": 20 +} +``` +{% include copy-curl.html %} + +In this example, the document with ID `1` is indexed with the routing value `user1`. All documents with the same routing value will be stored on the same shard. + +### Routing during searches + +When you search for documents, specifying the same routing value ensures that the search request is routed to the appropriate shard. This can significantly improve performance by reducing the number of shards that need to be queried. + +The following example request searches with a specific routing value: + +```json +GET /index1/_search?routing=user1 +{ + "query": { + "match": { + "name": "John Doe" + } + } +} +``` +{% include copy-curl.html %} + +In this example, the search query is routed to the shard containing documents indexed with the routing value `user1`. + +Caution needs to be exercised when using custom routing in order to prevent hot spots and data skew: + + - A _hot spot_ occurs when a disproportionate number of documents are routed to a single shard. This can lead to that shard becoming a bottleneck because it will have to handle more read and write operations compared to other shards. Consequently, this shard may experience higher CPU, memory, and I/O usage, leading to performance degradation. + + - _Data skew_ refers to an uneven distribution of data across shards. If routing values are not evenly distributed, some shards may end up storing significantly more data than others. This can result in imbalanced storage usage, where certain nodes have a much higher disk utilization compared to others. + +## Concurrent shard request + +Hitting a large number of shards simultaneously during a search can significantly impact CPU and memory consumption. By default, OpenSearch does not reject these requests. However, there are a number of methods that you can use to mitigate this risk. 
The following sections describe these methods. + +### Limit the number of shards that can be queried concurrently + +You can use the `max_concurrent_shard_requests` parameter in the search request to limit the number of shards that can be queried concurrently. For example, the following request limits the number of concurrent shard requests to `12`: + +```json +GET /index1/_search?max_concurrent_shard_requests=12 +{ + "query": { + "match_all": {} + } +} +``` +{% include copy-curl.html %} + + +### Define a search shard count limit + +You can define the dynamic `action.search.shard_count.limit` setting either in your `opensearch.yml` file or by using the REST API. Any search request that exceeds this limit will be rejected and throw an error. This helps to prevent a single search request from consuming too many resources, which can degrade the performance of the entire cluster. The following example request updates this cluster setting using the API: + +```json +PUT /_cluster/settings +{ + "transient": { + "action.search.shard_count.limit": 1000 + } +} +``` +{% include copy-curl.html %} + +### Search thread pool + +OpenSearch uses thread pools to manage the execution of various tasks, including search operations. The search thread pool is specifically used for search requests. You can adjust the size and queue capacity of the search thread pool by adding the following settings to `opensearch.yml`: +``` +thread_pool.search.size: 100 +thread_pool.search.queue_size: 1000 +``` +This setting is static. For more information about how to configure dynamic and static settings, see [Configuring OpenSearch]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/). + +#### Thread pool states + +The following three states describe thread pool operations: + + - _Thread Assignment_: If there are available threads in the search thread pool, then the request is immediately assigned to a thread and begins processing. + + - _Queueing_: If all threads in the search thread pool are busy, then the request is placed in the queue. + + - _Rejection_: If the queue is full (for example, the number of queued requests reaches the queue size limit), then additional incoming search requests are rejected until there is space available in the queue. + +You can check the current configuration of the search thread pool by running the following request: + +```json +GET /_cat/thread_pool/search?v&h=id,name,active,rejected,completed,size,queue_size +``` +{% include copy-curl.html %} From d95a9bf051c5e1cb8cfacbb5fbc8a95e922fa51c Mon Sep 17 00:00:00 2001 From: Landon Lengyel Date: Thu, 15 Aug 2024 09:20:31 -0600 Subject: [PATCH 007/190] Correcting contradictions on SecurityAdmin.sh port (#7989) Signed-off-by: Landon Lengyel Co-authored-by: Landon Lengyel --- _security/configuration/security-admin.md | 2 +- _troubleshoot/security-admin.md | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/_security/configuration/security-admin.md b/_security/configuration/security-admin.md index 77d3711385..a03d30fd03 100755 --- a/_security/configuration/security-admin.md +++ b/_security/configuration/security-admin.md @@ -197,7 +197,7 @@ If you run a default OpenSearch installation, which listens on port 9200 and use Name | Description :--- | :--- `-h` | OpenSearch hostname. Default is `localhost`. -`-p` | OpenSearch port. Default is 9200 - not the HTTP port. +`-p` | OpenSearch port. Default is 9200 `-cn` | Cluster name. Default is `opensearch`. `-icl` | Ignore cluster name. 
`-sniff` | Sniff cluster nodes. Sniffing detects available nodes using the OpenSearch `_cluster/state` API. diff --git a/_troubleshoot/security-admin.md b/_troubleshoot/security-admin.md index f36f1e3b0b..f4770c1ddb 100644 --- a/_troubleshoot/security-admin.md +++ b/_troubleshoot/security-admin.md @@ -24,8 +24,8 @@ If `securityadmin.sh` can't reach the cluster, it outputs: ``` OpenSearch Security Admin v6 -Will connect to localhost:9300 -ERR: Seems there is no opensearch running on localhost:9300 - Will exit +Will connect to localhost:9200 +ERR: Seems there is no opensearch running on localhost:9200 - Will exit ``` @@ -36,9 +36,9 @@ By default, `securityadmin.sh` uses `localhost`. If your cluster runs on any oth ### Check the port -Check that you are running `securityadmin.sh` against the transport port, **not** the HTTP port. +Check that you are running `securityadmin.sh` against the HTTP port, **not** the transport port. -By default, `securityadmin.sh` uses `9300`. If your cluster runs on a different port, use the `-p` option to specify the port number. +By default, `securityadmin.sh` uses `9200`. If your cluster runs on a different port, use the `-p` option to specify the port number. ## None of the configured nodes are available From 3f3364a46cd990d087b448944633b995c14a4033 Mon Sep 17 00:00:00 2001 From: Qi Chen Date: Thu, 15 Aug 2024 10:37:04 -0500 Subject: [PATCH 008/190] [Data Prepper] MAINT: add HTML comment on obfuscate processor config table (#7651) * MAINT: add HTML comment Signed-off-by: George Chen * MNT: address comments Signed-off-by: George Chen * MAINT: period Signed-off-by: George Chen * Update _data-prepper/pipelines/configuration/processors/obfuscate.md Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Signed-off-by: Qi Chen --------- Signed-off-by: George Chen Signed-off-by: Qi Chen Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> --- .../pipelines/configuration/processors/obfuscate.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/_data-prepper/pipelines/configuration/processors/obfuscate.md b/_data-prepper/pipelines/configuration/processors/obfuscate.md index 8d6bf901da..96b03e7405 100644 --- a/_data-prepper/pipelines/configuration/processors/obfuscate.md +++ b/_data-prepper/pipelines/configuration/processors/obfuscate.md @@ -62,6 +62,13 @@ When run, the `obfuscate` processor parses the fields into the following output: Use the following configuration options with the `obfuscate` processor. + + | Parameter | Required | Description | | :--- | :--- | :--- | | `source` | Yes | The source field to obfuscate. 
| From de627129900046347ce968fbe16dd037c3ce8c9b Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Thu, 15 Aug 2024 14:07:11 -0600 Subject: [PATCH 009/190] Update README.md (#7990) Update points of contact Signed-off-by: Melissa Vagi --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index 7d8de14151..66beb1948c 100644 --- a/README.md +++ b/README.md @@ -21,7 +21,6 @@ The following resources provide important guidance regarding contributions to th If you encounter problems or have questions when contributing to the documentation, these people can help: -- [hdhalter](https://github.com/hdhalter) - [kolchfa-aws](https://github.com/kolchfa-aws) - [Naarcha-AWS](https://github.com/Naarcha-AWS) - [vagimeli](https://github.com/vagimeli) From e1fc06541e543ebc552fd4a6cd83dcc825ec1b42 Mon Sep 17 00:00:00 2001 From: Jay Deng Date: Thu, 15 Aug 2024 13:46:21 -0700 Subject: [PATCH 010/190] Remove composite agg limitations for concurrent search (#7904) Signed-off-by: Jay Deng Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> --- _search-plugins/concurrent-segment-search.md | 1 - 1 file changed, 1 deletion(-) diff --git a/_search-plugins/concurrent-segment-search.md b/_search-plugins/concurrent-segment-search.md index 9c0e2da7c6..cbbb993ac9 100644 --- a/_search-plugins/concurrent-segment-search.md +++ b/_search-plugins/concurrent-segment-search.md @@ -95,7 +95,6 @@ Concurrent segment search helps to improve the performance of search requests at The following aggregations do not support the concurrent search model. If a search request contains one of these aggregations, the request will be executed using the non-concurrent path even if concurrent segment search is enabled at the cluster level or index level. - Parent aggregations on [join]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/join/) fields. See [this GitHub issue](https://github.com/opensearch-project/OpenSearch/issues/9316) for more information. - `sampler` and `diversified_sampler` aggregations. See [this GitHub issue](https://github.com/opensearch-project/OpenSearch/issues/11075) for more information. -- Composite aggregations that use scripts. See [this GitHub issue](https://github.com/opensearch-project/OpenSearch/issues/12947) for more information. Composite aggregations without scripts do support concurrent segment search. ## Other considerations From 9b8c68de498c327ee180c25aefb5996e8baf15d3 Mon Sep 17 00:00:00 2001 From: Zelin Hao Date: Thu, 15 Aug 2024 14:29:24 -0700 Subject: [PATCH 011/190] Add new results landing page for website search (#7942) * Add new results landing page for website Signed-off-by: Zelin Hao * Update some features Signed-off-by: Zelin Hao * Update the margin between search results Signed-off-by: Zelin Hao * Update display when no checkbox selected Signed-off-by: Zelin Hao --------- Signed-off-by: Zelin Hao --- _layouts/search_layout.html | 195 ++++++++++++++++++++++++++++++++++++ _sass/custom/custom.scss | 65 ++++++++++++ assets/js/search.js | 81 ++++++++++++++- search.md | 12 +++ 4 files changed, 349 insertions(+), 4 deletions(-) create mode 100644 _layouts/search_layout.html create mode 100644 search.md diff --git a/_layouts/search_layout.html b/_layouts/search_layout.html new file mode 100644 index 0000000000..47b8f25d1c --- /dev/null +++ b/_layouts/search_layout.html @@ -0,0 +1,195 @@ +--- +layout: table_wrappers +--- + + + + +{% include head.html %} + + + + Expand + + + + + + +{% include header.html %} + +
+
+ + + + Results Page Head from layout + +
+ +
+
+ + + + + +
+
+

+
+ +
+
+
+
+ + + +
+
+ +{% include footer.html %} + + + + + + + + diff --git a/_sass/custom/custom.scss b/_sass/custom/custom.scss index 7d7a168fb4..3a9dcc5e6d 100755 --- a/_sass/custom/custom.scss +++ b/_sass/custom/custom.scss @@ -1035,6 +1035,71 @@ body { border-bottom: 1px solid #eeebee; } +.search-page { + display: flex; + align-items: flex-start; + justify-content: center; + gap: 20px; + margin: 0 auto; +} + +.search-page--sidebar { + flex: 1; + max-width: 200px; + flex: 0 0 200px; +} + +.search-page--sidebar--category-filter--checkbox-child { + padding-left: 20px; +} + +.search-page--results { + flex: 3; + display: flex; + flex-direction: column; + align-items: center; + max-width: 60%; +} + +.search-page--results--input { + width: 100%; + position: relative; +} + +.search-page--results--input-box { + width: 100%; + padding: 10px; + margin-bottom: 20px; + border: 1px solid #ccc; + border-radius: 4px; +} + +.search-page--results--input-icon { + position: absolute; + top: 35%; + right: 10px; + transform: translateY(-50%); + pointer-events: none; + color: #333; +} + +.search-page--results--diplay { + width: 100%; + position: relative; + flex-flow: column nowrap; +} + +.search-page--results--diplay--header { + text-align: center; + margin-bottom: 20px; + background-color: transparent; +} + +.search-page--results--diplay--container--item { + margin-bottom: 1%; + display: block; +} + @mixin body-text($color: #000) { color: $color; font-family: 'Open Sans'; diff --git a/assets/js/search.js b/assets/js/search.js index 37de270ebd..8d9cab2ec5 100644 --- a/assets/js/search.js +++ b/assets/js/search.js @@ -13,7 +13,11 @@ const CLASSNAME_HIGHLIGHTED = 'highlighted'; const canSmoothScroll = 'scrollBehavior' in document.documentElement.style; - const docsVersion = elInput.getAttribute('data-docs-version'); + + //Extract version from the URL path + const urlPath = window.location.pathname; + const versionMatch = urlPath.match(/(\d+\.\d+)/); + const docsVersion = versionMatch ? 
versionMatch[1] : elInput.getAttribute('data-docs-version'); let _showingResults = false, animationFrame, @@ -46,7 +50,7 @@ case 'Enter': e.preventDefault(); - navToHighlightedResult(); + navToResult(); break; } }); @@ -247,9 +251,19 @@ } }; - const navToHighlightedResult = () => { + const navToResultsPage = () => { + const query = encodeURIComponent(elInput.value); + window.location.href = `/docs/${docsVersion}/search.html?q=${query}`; + } + + const navToResult = () => { const searchResultClassName = 'top-banner-search--field-with-results--field--wrapper--search-component--search-results--result'; - elResults.querySelector(`.${searchResultClassName}.highlighted a[href]`)?.click?.(); + const element = elResults.querySelector(`.${searchResultClassName}.highlighted a[href]`); + if (element) { + element.click?.(); + } else { + navToResultsPage(); + } }; const recordEvent = (name, data) => { @@ -261,3 +275,62 @@ }; }); })(); + + +window.doResultsPageSearch = async (query, type, version) => { + console.log("Running results page search!"); + + const searchResultsContainer = document.getElementById('searchPageResultsContainer'); + + try { + const response = await fetch(`https://search-api.opensearch.org/search?q=${query}&v=${version}&t=${type}`); + const data = await response.json(); + // Clear any previous search results + searchResultsContainer.innerHTML = ''; + + if (data.results && data.results.length > 0) { + data.results.forEach(result => { + const resultElement = document.createElement('div'); + resultElement.classList.add('search-page--results--diplay--container--item'); + + const contentCite = document.createElement('cite'); + const crumbs = [...result.ancestors]; + if (result.type === 'DOCS') crumbs.unshift(`OpenSearch ${result.versionLabel || result.version}`); + else if (result.type) crumbs.unshift(result.type); + contentCite.textContent = crumbs.join(' › ')?.replace?.(/ Date: Fri, 16 Aug 2024 22:27:25 +0800 Subject: [PATCH 012/190] Add documentation for v2 neural sparse models (#7987) * update for v2 model Signed-off-by: zhichao-aws * exclude source Signed-off-by: zhichao-aws * Doc review Signed-off-by: Fanit Kolchina * Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> --------- Signed-off-by: zhichao-aws Signed-off-by: Fanit Kolchina Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Co-authored-by: Fanit Kolchina Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Co-authored-by: Nathan Bower --- .../agents-tools/tools/neural-sparse-tool.md | 6 +-- .../api/model-apis/register-model.md | 4 +- _ml-commons-plugin/pretrained-models.md | 11 ++-- .../neural-sparse-with-pipelines.md | 51 ++++++++++++++----- 4 files changed, 50 insertions(+), 22 deletions(-) diff --git a/_ml-commons-plugin/agents-tools/tools/neural-sparse-tool.md b/_ml-commons-plugin/agents-tools/tools/neural-sparse-tool.md index 9014c585c8..b78d3d641e 100644 --- a/_ml-commons-plugin/agents-tools/tools/neural-sparse-tool.md +++ b/_ml-commons-plugin/agents-tools/tools/neural-sparse-tool.md @@ -20,13 +20,13 @@ The `NeuralSparseSearchTool` performs sparse vector retrieval. For more informat OpenSearch supports several pretrained sparse encoding models. You can either use one of those models or your own custom model. For a list of supported pretrained models, see [Sparse encoding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models). 
For more information, see [OpenSearch-provided pretrained models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/) and [Custom local models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/). -In this example, you'll use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` pretrained model for both ingestion and search. To register and deploy the model to OpenSearch, send the following request: +In this example, you'll use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill` pretrained model for both ingestion and search. To register the model and deploy it to OpenSearch, send the following request: ```json POST /_plugins/_ml/models/_register?deploy=true { - "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1", - "version": "1.0.1", + "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill", + "version": "1.0.0", "model_format": "TORCH_SCRIPT" } ``` diff --git a/_ml-commons-plugin/api/model-apis/register-model.md b/_ml-commons-plugin/api/model-apis/register-model.md index 2a0e9706e9..7d8f6d8cc6 100644 --- a/_ml-commons-plugin/api/model-apis/register-model.md +++ b/_ml-commons-plugin/api/model-apis/register-model.md @@ -95,8 +95,8 @@ Field | Data type | Required/Optional | Description ```json POST /_plugins/_ml/models/_register { - "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1", - "version": "1.0.1", + "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill", + "version": "1.0.0", "model_group_id": "Z1eQf4oB5Vm0Tdw8EIP2", "model_format": "TORCH_SCRIPT" } diff --git a/_ml-commons-plugin/pretrained-models.md b/_ml-commons-plugin/pretrained-models.md index 154b8b530f..1b0c726c33 100644 --- a/_ml-commons-plugin/pretrained-models.md +++ b/_ml-commons-plugin/pretrained-models.md @@ -48,8 +48,8 @@ Sparse encoding models transfer text into a sparse vector and convert the vector We recommend the following combinations for optimal performance: -- Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` model during both ingestion and search. -- Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` model during ingestion and the +- Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill` model during both ingestion and search. +- Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill` model during ingestion and the `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` tokenizer during search. For more information about the preceding options for running neural sparse search, see [Generating sparse vector embeddings within OpenSearch]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-with-pipelines/). @@ -58,8 +58,11 @@ The following table provides a list of sparse encoding models and artifact links | Model name | Version | Auto-truncation | TorchScript artifact | Description | |:---|:---|:---|:---|:---| -| `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-v1-1.0.1-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `` pairs, where each entry corresponds to a non-zero element index. To experiment with this model using transformers and the PyTorch API, see the [HuggingFace documentation](https://huggingface.co/opensearch-project/opensearch-neural-sparse-encoding-v1). | -| `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-doc-v1-1.0.1-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `` pairs, where each entry corresponds to a non-zero element index. To experiment with this model using transformers and the PyTorch API, see the [HuggingFace documentation](https://huggingface.co/opensearch-project/opensearch-neural-sparse-encoding-doc-v1). | +| `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-v1-1.0.1-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indices of non-zero elements in the vector, and then converts the vector into `` pairs, where each entry corresponds to a non-zero element index. To experiment with this model using transformers and the PyTorch API, see the [Hugging Face documentation](https://huggingface.co/opensearch-project/opensearch-neural-sparse-encoding-v1). | +| `amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill` | 1.0.0 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill/1.0.0/torch_script/neural-sparse_opensearch-neural-sparse-encoding-v2-distill-1.0.0-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill/1.0.0/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indices of non-zero elements in the vector, and then converts the vector into `` pairs, where each entry corresponds to a non-zero element index. To experiment with this model using transformers and the PyTorch API, see the [Hugging Face documentation](https://huggingface.co/opensearch-project/opensearch-neural-sparse-encoding-v2-distill). | +| `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-doc-v1-1.0.1-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indices of non-zero elements in the vector, and then converts the vector into `` pairs, where each entry corresponds to a non-zero element index. To experiment with this model using transformers and the PyTorch API, see the [Hugging Face documentation](https://huggingface.co/opensearch-project/opensearch-neural-sparse-encoding-doc-v1). | +| `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill` | 1.0.0 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill/1.0.0/torch_script/neural-sparse_opensearch-neural-sparse-encoding-doc-v2-distill-1.0.0-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill/1.0.0/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indices of non-zero elements in the vector, and then converts the vector into `` pairs, where each entry corresponds to a non-zero element index. To experiment with this model using transformers and the PyTorch API, see the [Hugging Face documentation](https://huggingface.co/opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill). | +| `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-mini` | 1.0.0 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-mini/1.0.0/torch_script/neural-sparse_opensearch-neural-sparse-encoding-doc-v2-mini-1.0.0-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-mini/1.0.0/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indices of non-zero elements in the vector, and then converts the vector into `` pairs, where each entry corresponds to a non-zero element index. To experiment with this model using transformers and the PyTorch API, see the [Hugging Face documentation](https://huggingface.co/opensearch-project/opensearch-neural-sparse-encoding-doc-v2-mini). | | `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-tokenizer-v1-1.0.1-torch_script.zip)
- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/config.json) | A neural sparse tokenizer. The tokenizer splits text into tokens and assigns each token a predefined weight, which is the token's inverse document frequency (IDF). If the IDF file is not provided, the weight defaults to 1. For more information, see [Preparing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#preparing-a-model). | ### Cross-encoder models diff --git a/_search-plugins/neural-sparse-with-pipelines.md b/_search-plugins/neural-sparse-with-pipelines.md index fea2f0d795..ef7044494a 100644 --- a/_search-plugins/neural-sparse-with-pipelines.md +++ b/_search-plugins/neural-sparse-with-pipelines.md @@ -16,9 +16,9 @@ At ingestion time, neural sparse search uses a sparse encoding model to generate At query time, neural sparse search operates in one of two search modes: -- **Bi-encoder mode** (requires a sparse encoding model): A sparse encoding model generates sparse vector embeddings from query text. This approach provides better search relevance at the cost of a slight increase in latency. +- **Bi-encoder mode** (requires a sparse encoding model): A sparse encoding model generates sparse vector embeddings from both documents and query text. This approach provides better search relevance at the cost of an increase in latency. -- **Doc-only mode** (requires a sparse encoding model and a tokenizer): A sparse encoding model generates sparse vector embeddings from query text. In this mode, neural sparse search tokenizes query text using a tokenizer and obtains the token weights from a lookup table. This approach provides faster retrieval at the cost of a slight decrease in search relevance. The tokenizer is deployed and invoked using the [Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/) for a uniform neural sparse search experience. +- **Doc-only mode** (requires a sparse encoding model and a tokenizer): A sparse encoding model generates sparse vector embeddings from documents. In this mode, neural sparse search tokenizes query text using a tokenizer and obtains the token weights from a lookup table. This approach provides faster retrieval at the cost of a slight decrease in search relevance. The tokenizer is deployed and invoked using the [Model API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/) for a uniform neural sparse search experience. For more information about choosing the neural sparse search mode that best suits your workload, see [Choose the search mode](#step-1a-choose-the-search-mode). @@ -48,32 +48,35 @@ Both the bi-encoder and doc-only search modes require you to configure a sparse Choose the search mode and the appropriate model/tokenizer combination: -- **Bi-encoder**: Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` model during both ingestion and search. +- **Bi-encoder**: Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill` model during both ingestion and search. -- **Doc-only**: Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` model during ingestion and the `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` tokenizer during search. +- **Doc-only**: Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill` model during ingestion and the `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` tokenizer during search. 
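Regardless of the combination you choose, the `neural_sparse` query syntax at search time is the same; only the `model_id` that you reference changes (the sparse encoding model ID in bi-encoder mode or the tokenizer ID in doc-only mode). The following is a minimal sketch of such a query; the index name, field name, and model ID are placeholders that you would replace with your own values:

```json
GET /my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "Hi world",
        "model_id": "<sparse encoding model or tokenizer ID>"
      }
    }
  }
}
```
{% include copy-curl.html %}

Complete ingestion and search examples are provided later on this page.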
-The following table provides a search relevance comparison for the two search modes so that you can choose the best mode for your use case. +The following table provides a search relevance comparison for all available combinations of the two search modes so that you can choose the best combination for your use case. | Mode | Ingestion model | Search model | Avg search relevance on BEIR | Model parameters | |-----------|---------------------------------------------------------------|---------------------------------------------------------------|------------------------------|------------------| | Doc-only | `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` | `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 0.49 | 133M | +| Doc-only | `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill` | `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 0.504 | 67M | +| Doc-only | `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-mini` | `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 0.497 | 23M | | Bi-encoder| `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | 0.524 | 133M | +| Bi-encoder| `amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill` | `amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill` | 0.528 | 67M | -### Step 1(b): Register the model/tokenizer +### Step 1(b): Register the model/tokenizer When you register a model/tokenizer, OpenSearch creates a model group for the model/tokenizer. You can also explicitly create a model group before registering models. For more information, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/). #### Bi-encoder mode -When using bi-encoder mode, you only need to register the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` model. +When using bi-encoder mode, you only need to register the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill` model. Register the sparse encoding model: ```json POST /_plugins/_ml/models/_register?deploy=true { - "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1", - "version": "1.0.1", + "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v2-distill", + "version": "1.0.0", "model_format": "TORCH_SCRIPT" } ``` @@ -116,15 +119,15 @@ Note the `model_id` of the model you've created; you'll need it for the followin #### Doc-only mode -When using doc-only mode, you need to register the `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` model, which you'll use at ingestion time, and the `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` tokenizer, which you'll use at search time. +When using doc-only mode, you need to register the `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill` model, which you'll use at ingestion time, and the `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` tokenizer, which you'll use at search time. 
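Because doc-only mode registers two artifacts (the sparse encoding model and the tokenizer), you can optionally place them in a single model group so that both are governed by the same access controls. The following is a minimal sketch that assumes you want a dedicated group; the group name and description are illustrative. If you create a group, pass the returned `model_group_id` in both of the register requests that follow:

```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "neural_sparse_doc_only_group",
  "description": "Group for the doc-only sparse encoding model and tokenizer"
}
```
{% include copy-curl.html %}

If you skip this step, OpenSearch creates a model group for each model automatically, as described earlier.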
Register the sparse encoding model: ```json POST /_plugins/_ml/models/_register?deploy=true { - "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1", - "version": "1.0.1", + "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill", + "version": "1.0.0", "model_format": "TORCH_SCRIPT" } ``` @@ -276,7 +279,7 @@ PUT /my-nlp-index "default_pipeline": "nlp-ingest-pipeline-sparse" }, "mappings": { - "_source": { + "_source": { "excludes": [ "passage_embedding" ] @@ -421,6 +424,28 @@ The response contains the matching documents: } ``` +To minimize disk and network I/O latency related to sparse embedding sources, you can exclude the embedding vector source from the query as follows: + +```json +GET my-nlp-index/_search +{ + "_source": { + "excludes": [ + "passage_embedding" + ] + }, + "query": { + "neural_sparse": { + "passage_embedding": { + "query_text": "Hi world", + "model_id": "" + } + } + } +} +``` +{% include copy-curl.html %} + ## Accelerating neural sparse search To learn more about improving retrieval time for neural sparse search, see [Accelerating neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/#accelerating-neural-sparse-search). From c088e5741dd7eb4675756d71b1be7d7f13066718 Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Fri, 16 Aug 2024 10:17:41 -0600 Subject: [PATCH 013/190] Update CODEOWNERS (#8003) * Update CODEOWNERS Updated with current list of codeowners Signed-off-by: Melissa Vagi * Update .github/CODEOWNERS Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Signed-off-by: Melissa Vagi --------- Signed-off-by: Melissa Vagi Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --- .github/CODEOWNERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 0ec6c5e009..815687fa17 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -1 +1 @@ -* @hdhalter @kolchfa-aws @Naarcha-AWS @vagimeli @AMoo-Miki @natebower @dlvenable @scrawfor99 @epugh +* @kolchfa-aws @Naarcha-AWS @vagimeli @AMoo-Miki @natebower @dlvenable @stephen-crawford @epugh From 8b99242c541ff95e74450ba2adca5e55272b8135 Mon Sep 17 00:00:00 2001 From: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Date: Mon, 19 Aug 2024 08:26:37 -0400 Subject: [PATCH 014/190] Revert the header to the previous version to fix hamburger menu on mobile (#8044) Signed-off-by: Fanit Kolchina --- _includes/header.html | 474 +++++++++++++++++------------------------- 1 file changed, 196 insertions(+), 278 deletions(-) diff --git a/_includes/header.html b/_includes/header.html index b7dce4c317..20d82c451e 100644 --- a/_includes/header.html +++ b/_includes/header.html @@ -1,3 +1,71 @@ +{% assign url_parts = page.url | split: "/" %} +{% if url_parts.size > 0 %} + + {% assign last_url_part = url_parts | last %} + + {% comment %} Does the URL contain a filename, and is it an index.html or not? {% endcomment %} + {% if last_url_part contains ".html" %} + {% assign url_has_filename = true %} + {% if last_url_part == 'index.html' %} + {% assign url_filename_is_index = true %} + {% else %} + {% assign url_filename_is_index = false %} + {% endif %} + {% else %} + {% assign url_has_filename = false %} + {% endif %} + + {% comment %} + OpenSearchCon URLs require some special consideration, because it's a specialization + of the /events URL which is itself a child of Community; te OpenSearchCon menu is NOT + a child of Community. 
+ {% endcomment %} + {% if page.url contains "opensearchcon" %} + {% assign is_conference_page = true %} + {% else %} + {% assign is_conference_page = false %} + {% endif %} + + {% if is_conference_page %} + {% comment %} + If the page is a confernce page and it has a filename then its the penultimate + path component that has the child menu item of the OpenSearchCon that needs + to be marked as in-category. If there's no filename then reference the ultimate + path component. + Unless the filename is opensearchcon2023-cfp, because it's a one off that is not + within the /events/opensearchcon/... structure. + {% endcomment %} + {% if url_has_filename %} + {% unless page.url contains 'opensearchcon2023-cfp' %} + {% assign url_fragment_index = url_parts | size | minus: 2 %} + {% assign url_fragment = url_parts[url_fragment_index] %} + {% else %} + {% assign url_fragment = 'opensearchcon2023-cfp' %} + {% endunless %} + {% else %} + {% assign url_fragment = last_url_part %} + {% endif %} + {% else %} + {% comment %} + If the page is NOT a conference page, the URL has a filename, and the filename + is NOT index.html then refer to the filename without the .html extension. + If the filename is index.html then refer to the penultimate path component. + If there is not filename then refer to the ultimate path component. + {% endcomment %} + {% if url_has_filename %} + {% unless url_filename_is_index %} + {% assign url_fragment = last_url_part | replace: '.html', '' %} + {% else %} + {% assign url_fragment_index = url_parts | size | minus: 2 %} + {% assign url_fragment = url_parts[url_fragment_index] %} + {% endunless %} + {% else %} + {% assign url_fragment = last_url_part %} + {% endif %} + {% endif %} +{% else %} + {% assign url_fragment = '' %} +{% endif %} {% if page.alert %}