From b782565497e9b51341ef118fb6556cecebd53f22 Mon Sep 17 00:00:00 2001
From: Tejas Shah
Date: Wed, 17 Jul 2024 12:58:31 -0700
Subject: [PATCH 01/10] Adds documentation for dynamic query parameters

Signed-off-by: Tejas Shah
---
 _search-plugins/knn/approximate-knn.md | 46 +++++++++++++++++++++++++-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/_search-plugins/knn/approximate-knn.md b/_search-plugins/knn/approximate-knn.md
index c0a9557728..aa2b62ef90 100644
--- a/_search-plugins/knn/approximate-knn.md
+++ b/_search-plugins/knn/approximate-knn.md
@@ -142,7 +142,7 @@ The following table provides examples of the number of results returned by vario
 10 | 1 | 1 | 4 | 4 | 1
 10 | 10 | 1 | 4 | 10 | 10
 10 | 1 | 2 | 4 | 8 | 2
-
+
 The number of results returned by Faiss/NMSLIB differs from the number of results returned by Lucene only when `k` is smaller than `size`. If `k` and `size` are equal, all engines return the same number of results.

 Starting in OpenSearch 2.14, you can use `k`, `min_score`, or `max_distance` for [radial search]({{site.url}}{{site.baseurl}}/search-plugins/knn/radial-search-knn/).
@@ -256,6 +256,50 @@ POST _bulk

 After data is ingested, it can be search just like any other `knn_vector` field!

+### Additional query parameters
+Starting with version 2.16, k-NN plugin supports additional method parameters to tune search results. This can be done by providing the right values in `method_parameters` section of the query. Here is an example of the query
+
+```json
+GET my-knn-index-1/_search
+{
+  "size": 2,
+  "query": {
+    "knn": {
+      "my_vector2": {
+        "vector": [2, 3, 5, 6],
+        "k": 2,
+        "method_parameters" : {
+          "ef_search": 100
+        }
+      }
+    }
+  }
+}
+```
+These parameters are dependent on the combination of engine and method used to build the index. The tables will help with the valid set of parameters. Invalid combination of parameters will throw an error
+
+#### HNSW
+
+`ef_search`
+
+Explores the `ef_search` number of vectors to find top k nearest neighbors. Higher value will help improve recall at the cost of search latency. The value must be greater than 0
+
+Supported engines | Radial query support | Notes
+:--- | :--- | :---
+Nmslib | no | Overrides the value in index settings if present in search query
+Faiss | yes | Overrides the value in index settings if present in search query
+Lucene | no | Engine supports `k` or `ef_search`. The final result can be controlled by `size`. k-NN plugin will pick `max(k, ef_search)`
+
+#### IVF
+
+`nprobes`
+
+Explores `nprobes` number of clusters to find the top k nearest neighbors. Higher value will help improve recall at the cost of search latency. The value must be greater than 0
+
+Supported engines | Notes
+:--- | :---
+faiss | Overrides the value in index settings if present in search query
+
 ### Using approximate k-NN with filters

 To learn about using filters with k-NN search, see [k-NN search with filters]({{site.url}}{{site.baseurl}}/search-plugins/knn/filter-search-knn/).

From a4d87d2f8fe4b8b52ad48533a1367e9b72edb675 Mon Sep 17 00:00:00 2001
From: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Date: Fri, 19 Jul 2024 17:26:12 -0400
Subject: [PATCH 02/10] Update _search-plugins/knn/approximate-knn.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
---
 _search-plugins/knn/approximate-knn.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_search-plugins/knn/approximate-knn.md b/_search-plugins/knn/approximate-knn.md
index 1bb3d14693..8f39093a76 100644
--- a/_search-plugins/knn/approximate-knn.md
+++ b/_search-plugins/knn/approximate-knn.md
@@ -253,7 +253,7 @@ POST _bulk
 ...
 ```

-After data is ingested, it can be search just like any other `knn_vector` field!
+After data is ingested, it can be searched just like any other `knn_vector` field.

 ### Additional query parameters
 Starting with version 2.16, k-NN plugin supports additional method parameters to tune search results. This can be done by providing the right values in `method_parameters` section of the query. Here is an example of the query

From 6519ef3a8a5f2ab91f2db4205e2b3515e2804ef6 Mon Sep 17 00:00:00 2001
From: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Date: Fri, 19 Jul 2024 17:41:17 -0400
Subject: [PATCH 03/10] Update _search-plugins/knn/approximate-knn.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
---
 _search-plugins/knn/approximate-knn.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/_search-plugins/knn/approximate-knn.md b/_search-plugins/knn/approximate-knn.md
index 8f39093a76..4d2f2dba27 100644
--- a/_search-plugins/knn/approximate-knn.md
+++ b/_search-plugins/knn/approximate-knn.md
@@ -256,6 +256,7 @@ POST _bulk
 After data is ingested, it can be searched just like any other `knn_vector` field.

 ### Additional query parameters
+
 Starting with version 2.16, k-NN plugin supports additional method parameters to tune search results. This can be done by providing the right values in `method_parameters` section of the query. Here is an example of the query

 ```json

From 835259d00a35fc24c40aa2175802f37bb921c384 Mon Sep 17 00:00:00 2001
From: Tejas Shah
Date: Fri, 19 Jul 2024 15:29:00 -0700
Subject: [PATCH 04/10] Apply suggestions from code review

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Tejas Shah
---
 _search-plugins/knn/approximate-knn.md | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/_search-plugins/knn/approximate-knn.md b/_search-plugins/knn/approximate-knn.md
index 4d2f2dba27..a3cc8852fc 100644
--- a/_search-plugins/knn/approximate-knn.md
+++ b/_search-plugins/knn/approximate-knn.md
@@ -257,7 +257,7 @@ After data is ingested, it can be searched just like any other `knn_vector` fiel

 ### Additional query parameters

-Starting with version 2.16, k-NN plugin supports additional method parameters to tune search results. This can be done by providing the right values in `method_parameters` section of the query. Here is an example of the query
+Starting with version 2.16, you can provide `method_parameters` in a search request:

 ```json
 GET my-knn-index-1/_search
@@ -276,27 +276,29 @@ GET my-knn-index-1/_search
     }
   }
 }
 ```
-These parameters are dependent on the combination of engine and method used to build the index. The tables will help with the valid set of parameters. Invalid combination of parameters will throw an error
+These parameters are dependent on the combination of engine and method used to create the index. The following sections provide information about the supported `method_parameters`.

 #### HNSW

-`ef_search`

-Explores the `ef_search` number of vectors to find top k nearest neighbors. Higher value will help improve recall at the cost of search latency. The value must be greater than 0
+You can provide the `ef_search` parameter when searching an index created using the `hnsw` method. The `ef_search` parameter specifies to explore the `ef_search` number of vectors to find the top k nearest neighbors. Higher value of `ef_search` improves recall at the cost of increased search latency. The value must be positive.

-Supported engines | Radial query support | Notes
+The following table provides information about the `ef_search` parameter for the supported engines.
+
+Engine | Radial query support | Notes
 :--- | :--- | :---
-Nmslib | no | Overrides the value in index settings if present in search query
-Faiss | yes | Overrides the value in index settings if present in search query
+`nmslib` | No | If `ef_search` is present in a query, it overrides the `index.knn.algo_param.ef_search` index setting.
+`faiss` | Yes | If `ef_search` is present in a query, it overrides the `index.knn.algo_param.ef_search` index setting.
 Lucene | no | Engine supports `k` or `ef_search`. The final result can be controlled by `size`. k-NN plugin will pick `max(k, ef_search)`

-#### IVF
+#### `nprobes`
+

-`nprobes`
+You can provide the `nprobes` parameter when searching an index created using the `ivf` method. The `nprobes` parameter specifies to explore the `nprobes` number of clusters to find the top k nearest neighbors. Higher value of `nprobes` improves recall at the cost of increased search latency. The value must be positive.

-Explores `nprobes` number of clusters to find the top k nearest neighbors. Higher value will help improve recall at the cost of search latency. The value must be greater than 0
+The following table provides information about the `nprobes` parameter for the supported engines.

-Supported engines | Notes
+Engine | Notes
 :--- | :---
 faiss | Overrides the value in index settings if present in search query

From f8b82db3568723f932c4638bcd6ad77db56c3ec5 Mon Sep 17 00:00:00 2001
From: Tejas Shah
Date: Mon, 22 Jul 2024 10:07:24 -0700
Subject: [PATCH 05/10] Update _search-plugins/knn/approximate-knn.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Tejas Shah
---
 _search-plugins/knn/approximate-knn.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_search-plugins/knn/approximate-knn.md b/_search-plugins/knn/approximate-knn.md
index a3cc8852fc..0338d8cc55 100644
--- a/_search-plugins/knn/approximate-knn.md
+++ b/_search-plugins/knn/approximate-knn.md
@@ -289,7 +289,7 @@ Engine | Radial query support | Notes
 :--- | :--- | :---
 `nmslib` | No | If `ef_search` is present in a query, it overrides the `index.knn.algo_param.ef_search` index setting.
 `faiss` | Yes | If `ef_search` is present in a query, it overrides the `index.knn.algo_param.ef_search` index setting.
-Lucene | no | Engine supports `k` or `ef_search`. The final result can be controlled by `size`. k-NN plugin will pick `max(k, ef_search)`
+`lucene` | No | When creating a search query, you must specify `k`. If you provide both `k` and `ef_search`, then the larger value is passed to the engine. If `ef_search` is larger than `k`, you can provide the `size` parameter to limit the final number of results to `k`.

 #### `nprobes`


From 58d5697d720a129e2152098280ef1395ed866ccc Mon Sep 17 00:00:00 2001
From: Tejas Shah
Date: Mon, 22 Jul 2024 10:07:35 -0700
Subject: [PATCH 06/10] Update _search-plugins/knn/approximate-knn.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Tejas Shah
---
 _search-plugins/knn/approximate-knn.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/_search-plugins/knn/approximate-knn.md b/_search-plugins/knn/approximate-knn.md
index 0338d8cc55..3dc2b6403e 100644
--- a/_search-plugins/knn/approximate-knn.md
+++ b/_search-plugins/knn/approximate-knn.md
@@ -276,6 +276,8 @@ GET my-knn-index-1/_search
   }
 }
 ```
+{% include copy-curl.html %}
+
 These parameters are dependent on the combination of engine and method used to create the index. The following sections provide information about the supported `method_parameters`.

 #### HNSW

From d3d1020d80439c66716e85fd08d9b6a1784502e8 Mon Sep 17 00:00:00 2001
From: Tejas Shah
Date: Mon, 22 Jul 2024 10:08:04 -0700
Subject: [PATCH 07/10] Update _search-plugins/knn/approximate-knn.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Tejas Shah
---
 _search-plugins/knn/approximate-knn.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_search-plugins/knn/approximate-knn.md b/_search-plugins/knn/approximate-knn.md
index 3dc2b6403e..2416d9b7e2 100644
--- a/_search-plugins/knn/approximate-knn.md
+++ b/_search-plugins/knn/approximate-knn.md
@@ -302,7 +302,7 @@ The following table provides information about the `nprobes` parameter for the s

 Engine | Notes
 :--- | :---
-faiss | Overrides the value in index settings if present in search query
+`faiss` | If `nprobes` is present in a query, it overrides the value provided when creating the index.

 ### Using approximate k-NN with filters


From 5dc7bbfc95dcd4490ca9b394f0f49b5aecad8fe5 Mon Sep 17 00:00:00 2001
From: Tejas Shah
Date: Mon, 22 Jul 2024 10:08:39 -0700
Subject: [PATCH 08/10] Update _search-plugins/knn/approximate-knn.md

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Tejas Shah
---
 _search-plugins/knn/approximate-knn.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_search-plugins/knn/approximate-knn.md b/_search-plugins/knn/approximate-knn.md
index 2416d9b7e2..932166093c 100644
--- a/_search-plugins/knn/approximate-knn.md
+++ b/_search-plugins/knn/approximate-knn.md
@@ -280,7 +280,7 @@ GET my-knn-index-1/_search

 These parameters are dependent on the combination of engine and method used to create the index. The following sections provide information about the supported `method_parameters`.

-#### HNSW
+#### `ef_search`


 You can provide the `ef_search` parameter when searching an index created using the `hnsw` method. The `ef_search` parameter specifies to explore the `ef_search` number of vectors to find the top k nearest neighbors. Higher value of `ef_search` improves recall at the cost of increased search latency. The value must be positive.

From 00e9280b1c5d1c917d34a600c60d547b45410409 Mon Sep 17 00:00:00 2001
From: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Date: Mon, 22 Jul 2024 13:47:12 -0400
Subject: [PATCH 09/10] Apply suggestions from code review

Co-authored-by: Nathan Bower
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
---
 _search-plugins/knn/approximate-knn.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/_search-plugins/knn/approximate-knn.md b/_search-plugins/knn/approximate-knn.md
index 932166093c..32832f7195 100644
--- a/_search-plugins/knn/approximate-knn.md
+++ b/_search-plugins/knn/approximate-knn.md
@@ -253,7 +253,7 @@ POST _bulk
 ...
 ```

-After data is ingested, it can be searched just like any other `knn_vector` field.
+After data is ingested, it can be searched in the same way as any other `knn_vector` field.

 ### Additional query parameters

@@ -283,7 +283,7 @@ These parameters are dependent on the combination of engine and method used to c
 #### `ef_search`


-You can provide the `ef_search` parameter when searching an index created using the `hnsw` method. The `ef_search` parameter specifies to explore the `ef_search` number of vectors to find the top k nearest neighbors. Higher value of `ef_search` improves recall at the cost of increased search latency. The value must be positive.
+You can provide the `ef_search` parameter when searching an index created using the `hnsw` method. The `ef_search` parameter specifies the number of vectors to examine in order to find the top k nearest neighbors. Higher `ef_search` values improve recall at the cost of increased search latency. The value must be positive.

 The following table provides information about the `ef_search` parameter for the supported engines.

@@ -296,7 +296,7 @@ Engine | Radial query support | Notes
 #### `nprobes`


-You can provide the `nprobes` parameter when searching an index created using the `ivf` method. The `nprobes` parameter specifies to explore the `nprobes` number of clusters to find the top k nearest neighbors. Higher value of `nprobes` improves recall at the cost of increased search latency. The value must be positive.
+You can provide the `nprobes` parameter when searching an index created using the `ivf` method. The `nprobes` parameter specifies the number of `nprobes` clusters to examine in order to find the top k nearest neighbors. Higher `nprobes` values improve recall at the cost of increased search latency. The value must be positive.

 The following table provides information about the `nprobes` parameter for the supported engines.


From 501bb5ab35965c5e77cd1c7ebc6bc32ec3f1ae75 Mon Sep 17 00:00:00 2001
From: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Date: Mon, 22 Jul 2024 13:48:05 -0400
Subject: [PATCH 10/10] Apply suggestions from code review

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
---
 _search-plugins/knn/approximate-knn.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/_search-plugins/knn/approximate-knn.md b/_search-plugins/knn/approximate-knn.md
index 32832f7195..fa1b4096c7 100644
--- a/_search-plugins/knn/approximate-knn.md
+++ b/_search-plugins/knn/approximate-knn.md
@@ -282,7 +282,6 @@ These parameters are dependent on the combination of engine and method used to c

 #### `ef_search`

-
 You can provide the `ef_search` parameter when searching an index created using the `hnsw` method. The `ef_search` parameter specifies the number of vectors to examine in order to find the top k nearest neighbors. Higher `ef_search` values improve recall at the cost of increased search latency. The value must be positive.

 The following table provides information about the `ef_search` parameter for the supported engines.
@@ -295,7 +294,6 @@ Engine | Radial query support | Notes

 #### `nprobes`

-
 You can provide the `nprobes` parameter when searching an index created using the `ivf` method. The `nprobes` parameter specifies the number of `nprobes` clusters to examine in order to find the top k nearest neighbors. Higher `nprobes` values improve recall at the cost of increased search latency. The value must be positive.

 The following table provides information about the `nprobes` parameter for the supported engines.
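
Reviewer note (not part of the patch series): the request shape documented above can be sketched as a small body builder. This is only an illustration under stated assumptions. The helper name `build_knn_query` is hypothetical, the index field `my_vector2` and the vector are taken from the example in the patches, the `nprobes` value of 10 is arbitrary, and the integer check is an assumption (the documentation only says the value must be positive). The sketch builds the JSON body and sends nothing to a cluster.

```python
import json


def build_knn_query(field, vector, k, method_parameters=None, size=None):
    """Build a k-NN search body, optionally attaching method_parameters.

    Hypothetical helper for illustration; it mirrors the JSON shape shown
    in the documentation example and is not part of any OpenSearch client.
    """
    knn_clause = {"vector": list(vector), "k": k}
    if method_parameters:
        for name, value in method_parameters.items():
            # The docs state that ef_search and nprobes must be positive.
            # Requiring an int is an assumption made for this sketch.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")
        knn_clause["method_parameters"] = dict(method_parameters)
    body = {"query": {"knn": {field: knn_clause}}}
    if size is not None:
        body["size"] = size
    return body


# ef_search applies to indexes created using the hnsw method.
hnsw_body = build_knn_query(
    "my_vector2", [2, 3, 5, 6], k=2, method_parameters={"ef_search": 100}, size=2
)

# nprobes applies to indexes created using the ivf method (faiss engine).
ivf_body = build_knn_query(
    "my_vector2", [2, 3, 5, 6], k=2, method_parameters={"nprobes": 10}, size=2
)

print(json.dumps(hnsw_body, indent=2))
```

Validating on the client side is a design choice of the sketch only; whether a parameter is accepted still depends on the engine and method used to create the index.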