Commit

Add documentation about setting a default model for neural search (#5121)

* Add documentation about setting a default model for neural search

Signed-off-by: Fanit Kolchina <[email protected]>

* Add new processor to the processor list

Signed-off-by: Fanit Kolchina <[email protected]>

* More tweaks

Signed-off-by: Fanit Kolchina <[email protected]>

* Refactor search pipeline documentation

Signed-off-by: Fanit Kolchina <[email protected]>

* Refactor retrieving search pipelines

Signed-off-by: Fanit Kolchina <[email protected]>

* Add working examples

Signed-off-by: Fanit Kolchina <[email protected]>

* Implement tech review comments

Signed-off-by: Fanit Kolchina <[email protected]>

* Add responses to documentation

Signed-off-by: Fanit Kolchina <[email protected]>

* Update _search-plugins/search-pipelines/neural-query-enricher.md

Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>

---------

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
3 people authored Oct 4, 2023
1 parent b149493 commit 06527a2
Showing 12 changed files with 526 additions and 270 deletions.
340 changes: 248 additions & 92 deletions _search-plugins/neural-search.md

Large diffs are not rendered by default.

156 changes: 156 additions & 0 deletions _search-plugins/search-pipelines/creating-search-pipeline.md
@@ -0,0 +1,156 @@
---
layout: default
title: Creating a search pipeline
nav_order: 10
has_children: false
parent: Search pipelines
grand_parent: Search
---

# Creating a search pipeline

Search pipelines are stored in the cluster state. To create a search pipeline, you must configure an ordered list of processors in your OpenSearch cluster. You can have more than one processor of the same type in the pipeline. Each processor has a `tag` identifier that distinguishes it from the others. Tagging a specific processor can be helpful when debugging error messages, especially if you add multiple processors of the same type.

#### Example request

The following request creates a search pipeline with a `filter_query` request processor that uses a term query to return only public messages and a response processor that renames the field `message` to `notification`:

```json
PUT /_search/pipeline/my_pipeline
{
"request_processors": [
{
"filter_query" : {
"tag" : "tag1",
"description" : "This processor restricts results to publicly visible documents",
"query" : {
"term": {
"visibility": "public"
}
}
}
}
],
"response_processors": [
{
"rename_field": {
"field": "message",
"target_field": "notification"
}
}
]
}
```
{% include copy-curl.html %}
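
After the pipeline is created, you can apply it to a search by passing its name in the `search_pipeline` query parameter (`my_index` is an example index name):

```json
GET /my_index/_search?search_pipeline=my_pipeline
```
{% include copy-curl.html %}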

## Ignoring processor failures

By default, a search pipeline stops if one of its processors fails. If you want the pipeline to continue running when a processor fails, you can set the `ignore_failure` parameter for that processor to `true` when creating the pipeline:

```json
"filter_query" : {
"tag" : "tag1",
"description" : "This processor restricts results to publicly visible documents",
"ignore_failure": true,
"query" : {
"term": {
"visibility": "public"
}
}
}
```

If the processor fails, OpenSearch logs the failure and continues to run all remaining processors in the search pipeline. To check whether there were any failures, you can use [search pipeline metrics]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/search-pipeline-metrics/).
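
For example, the following request returns search pipeline statistics, including per-processor failure counts, from all nodes (this uses the Nodes Stats API endpoint described on the metrics page):

```json
GET /_nodes/stats/search_pipeline
```
{% include copy-curl.html %}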

## Updating a search pipeline

To update a search pipeline dynamically, replace the search pipeline using the Search Pipeline API.

#### Example request

The following example request upserts `my_pipeline` by adding a `filter_query` request processor and a `rename_field` response processor:

```json
PUT /_search/pipeline/my_pipeline
{
"request_processors": [
{
"filter_query": {
"tag": "tag1",
"description": "This processor returns only publicly visible documents",
"query": {
"term": {
"visibility": "public"
}
}
}
}
],
"response_processors": [
{
"rename_field": {
"field": "message",
"target_field": "notification"
}
}
]
}
```
{% include copy-curl.html %}

## Search pipeline versions

When creating your pipeline, you can specify a version for it in the `version` parameter:

```json
PUT _search/pipeline/my_pipeline
{
"version": 1234,
"request_processors": [
{
"script": {
"source": """
if (ctx._source['size'] > 100) {
ctx._source['explain'] = false;
}
"""
}
}
]
}
```
{% include copy-curl.html %}

The version is provided in all subsequent responses to `get pipeline` requests:

```json
GET _search/pipeline/my_pipeline
```

The response contains the pipeline version:

<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}

```json
{
"my_pipeline": {
"version": 1234,
"request_processors": [
{
"script": {
"source": """
if (ctx._source['size'] > 100) {
ctx._source['explain'] = false;
}
"""
}
}
]
}
}
```
</details>
2 changes: 1 addition & 1 deletion _search-plugins/search-pipelines/filter-query-processor.md
@@ -20,7 +20,7 @@ Field | Data type | Description
`query` | Object | A query in query domain-specific language (DSL). For a list of OpenSearch query types, see [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/). Required.
`tag` | String | The processor's identifier. Optional.
`description` | String | A description of the processor. Optional.
`ignore_failure` | Boolean | If `true`, OpenSearch [ignores a failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.
`ignore_failure` | Boolean | If `true`, OpenSearch [ignores any failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.

## Example

173 changes: 4 additions & 169 deletions _search-plugins/search-pipelines/index.md
@@ -29,13 +29,10 @@ Both request and response processing for the pipeline are performed on the coordinating node.

To learn more about available search processors, see [Search processors]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/search-processors/).

## Creating a search pipeline

Search pipelines are stored in the cluster state. To create a search pipeline, you must configure an ordered list of processors in your OpenSearch cluster. You can have more than one processor of the same type in the pipeline. Each processor has a `tag` identifier that distinguishes it from the others. Tagging a specific processor can be helpful for debugging error messages, especially if you add multiple processors of the same type.
## Example

#### Example request

The following request creates a search pipeline with a `filter_query` request processor that uses a term query to return only public messages and a response processor that renames the field `message` to `notification`:
To create a search pipeline, send a request to the search pipeline endpoint specifying an ordered list of processors, which are applied sequentially:

```json
PUT /_search/pipeline/my_pipeline
@@ -65,26 +62,7 @@ PUT /_search/pipeline/my_pipeline
```
{% include copy-curl.html %}

### Ignoring processor failures

By default, a search pipeline stops if one of its processors fails. If you want the pipeline to continue running when a processor fails, you can set the `ignore_failure` parameter for that processor to `true` when creating the pipeline:

```json
"filter_query" : {
"tag" : "tag1",
"description" : "This processor restricts results to publicly visible documents",
"ignore_failure": true,
"query" : {
"term": {
"visibility": "public"
}
}
}
```

If the processor fails, OpenSearch logs the failure and continues to run all remaining processors in the search pipeline. To check whether there were any failures, you can use [search pipeline metrics](#search-pipeline-metrics).

## Using search pipelines
For more information about creating and updating a search pipeline, see [Creating a search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/).

To use a pipeline with a query, specify the pipeline name in the `search_pipeline` query parameter:

@@ -95,151 +73,8 @@ GET /my_index/_search?search_pipeline=my_pipeline

Alternatively, you can use a temporary pipeline with a request or set a default pipeline for an index. To learn more, see [Using a search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/using-search-pipeline/).
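
For example, one way to set a default pipeline for the example index `my_index` is through the `index.search.default_pipeline` index setting:

```json
PUT /my_index/_settings
{
  "index.search.default_pipeline": "my_pipeline"
}
```
{% include copy-curl.html %}

Searches against `my_index` then run through `my_pipeline` without specifying the `search_pipeline` query parameter.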

## Retrieving search pipelines

To retrieve the details of an existing search pipeline, use the Search Pipeline API.

To view all search pipelines, use the following request:

```json
GET /_search/pipeline
```
{% include copy-curl.html %}

The response contains the pipeline that you set up in the previous section:
<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}

```json
{
"my_pipeline" : {
"request_processors" : [
{
"filter_query" : {
"tag" : "tag1",
"description" : "This processor restricts results to publicly visible documents",
"query" : {
"term" : {
"visibility" : "public"
}
}
}
}
]
}
}
```
</details>

To view a particular pipeline, specify the pipeline name as a path parameter:

```json
GET /_search/pipeline/my_pipeline
```
{% include copy-curl.html %}

You can also use wildcard patterns to view a subset of pipelines, for example:

```json
GET /_search/pipeline/my*
```
{% include copy-curl.html %}

## Updating a search pipeline

To update a search pipeline dynamically, replace the search pipeline using the Search Pipeline API.

#### Example request

The following request upserts `my_pipeline` by adding a `filter_query` request processor and a `rename_field` response processor:

```json
PUT /_search/pipeline/my_pipeline
{
"request_processors": [
{
"filter_query": {
"tag": "tag1",
"description": "This processor returns only publicly visible documents",
"query": {
"term": {
"visibility": "public"
}
}
}
}
],
"response_processors": [
{
"rename_field": {
"field": "message",
"target_field": "notification"
}
}
]
}
```
{% include copy-curl.html %}

## Search pipeline versions

When creating your pipeline, you can specify a version for it in the `version` parameter:
To learn about retrieving details for an existing search pipeline, see [Retrieving search pipelines]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/retrieving-search-pipeline/).

```json
PUT _search/pipeline/my_pipeline
{
"version": 1234,
"request_processors": [
{
"script": {
"source": """
if (ctx._source['size'] > 100) {
ctx._source['explain'] = false;
}
"""
}
}
]
}
```
{% include copy-curl.html %}

The version is provided in all subsequent responses to `get pipeline` requests:

```json
GET _search/pipeline/my_pipeline
```

The response contains the pipeline version:

<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}

```json
{
"my_pipeline": {
"version": 1234,
"request_processors": [
{
"script": {
"source": """
if (ctx._source['size'] > 100) {
ctx._source['explain'] = false;
}
"""
}
}
]
}
}
```
</details>

## Search pipeline metrics

