Add documentation about setting a default model for neural search (#5121)

* Add documentation about setting a default model for neural search

Signed-off-by: Fanit Kolchina <[email protected]>

* Add new processor to the processor list

Signed-off-by: Fanit Kolchina <[email protected]>

* More tweaks

Signed-off-by: Fanit Kolchina <[email protected]>

* Refactor search pipeline documentation

Signed-off-by: Fanit Kolchina <[email protected]>

* Refactor retrieving search pipelines

Signed-off-by: Fanit Kolchina <[email protected]>

* Add working examples

Signed-off-by: Fanit Kolchina <[email protected]>

* Implement tech review comments

Signed-off-by: Fanit Kolchina <[email protected]>

* Add responses to documentation

Signed-off-by: Fanit Kolchina <[email protected]>

* Update _search-plugins/search-pipelines/neural-query-enricher.md

Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>

---------

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
3 people committed Dec 20, 2023
1 parent 5bbb9e1 commit 169e397
Showing 12 changed files with 526 additions and 270 deletions.
340 changes: 248 additions & 92 deletions _search-plugins/neural-search.md

Large diffs are not rendered by default.

156 changes: 156 additions & 0 deletions _search-plugins/search-pipelines/creating-search-pipeline.md
@@ -0,0 +1,156 @@
---
layout: default
title: Creating a search pipeline
nav_order: 10
has_children: false
parent: Search pipelines
grand_parent: Search
---

# Creating a search pipeline

Search pipelines are stored in the cluster state. To create a search pipeline, you must configure an ordered list of processors in your OpenSearch cluster. You can have more than one processor of the same type in the pipeline. Each processor can have an optional `tag` identifier that distinguishes it from the others. Tagging a specific processor can be helpful when debugging error messages, especially if you add multiple processors of the same type.

#### Example request

The following request creates a search pipeline with a `filter_query` request processor that uses a term query to return only public messages and a `rename_field` response processor that renames the field `message` to `notification`:

```json
PUT /_search/pipeline/my_pipeline
{
  "request_processors": [
    {
      "filter_query" : {
        "tag" : "tag1",
        "description" : "This processor returns only publicly visible documents",
        "query" : {
          "term": {
            "visibility": "public"
          }
        }
      }
    }
  ],
  "response_processors": [
    {
      "rename_field": {
        "field": "message",
        "target_field": "notification"
      }
    }
  ]
}
```
{% include copy-curl.html %}
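
To illustrate how tags distinguish processors of the same type, the following sketch defines two `filter_query` request processors in one pipeline. The pipeline name `my_tagged_pipeline`, the tag values, and the `status` field are hypothetical and are used for illustration only:

```json
PUT /_search/pipeline/my_tagged_pipeline
{
  "request_processors": [
    {
      "filter_query": {
        "tag": "public_only",
        "description": "Hypothetical example: returns only publicly visible documents",
        "query": {
          "term": {
            "visibility": "public"
          }
        }
      }
    },
    {
      "filter_query": {
        "tag": "published_only",
        "description": "Hypothetical example: returns only documents whose status is published",
        "query": {
          "term": {
            "status": "published"
          }
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

Giving each processor a distinct tag makes it easier to tell which of the two `filter_query` processors an error message refers to.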

## Ignoring processor failures

By default, a search pipeline stops if one of its processors fails. If you want the pipeline to continue running when a processor fails, you can set the `ignore_failure` parameter for that processor to `true` when creating the pipeline:

```json
"filter_query" : {
  "tag" : "tag1",
  "description" : "This processor returns only publicly visible documents",
  "ignore_failure": true,
  "query" : {
    "term": {
      "visibility": "public"
    }
  }
}
```

If the processor fails, OpenSearch logs the failure and continues to run all remaining processors in the search pipeline. To check whether there were any failures, you can use [search pipeline metrics]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/search-pipeline-metrics/).
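
For context, the preceding fragment could be placed in a complete request as follows. This is a minimal sketch that reuses the `my_pipeline` name and the `visibility` field from the earlier example:

```json
PUT /_search/pipeline/my_pipeline
{
  "request_processors": [
    {
      "filter_query": {
        "tag": "tag1",
        "description": "This processor returns only publicly visible documents",
        "ignore_failure": true,
        "query": {
          "term": {
            "visibility": "public"
          }
        }
      }
    }
  ],
  "response_processors": [
    {
      "rename_field": {
        "field": "message",
        "target_field": "notification"
      }
    }
  ]
}
```
{% include copy-curl.html %}

Because `ignore_failure` is set per processor, only a failure of the `filter_query` processor is ignored in this sketch. The `rename_field` processor keeps the default behavior.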

## Updating a search pipeline

To update a search pipeline dynamically, replace the search pipeline using the Search Pipeline API.

#### Example request

The following example request upserts `my_pipeline` by adding a `filter_query` request processor and a `rename_field` response processor:

```json
PUT /_search/pipeline/my_pipeline
{
  "request_processors": [
    {
      "filter_query": {
        "tag": "tag1",
        "description": "This processor returns only publicly visible documents",
        "query": {
          "term": {
            "visibility": "public"
          }
        }
      }
    }
  ],
  "response_processors": [
    {
      "rename_field": {
        "field": "message",
        "target_field": "notification"
      }
    }
  ]
}
```
{% include copy-curl.html %}
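
Because an update replaces the stored definition rather than merging into it, a processor that is omitted from the request body is presumably dropped. As a sketch under that assumption, the following request would leave `my_pipeline` with only the `filter_query` request processor and no response processors:

```json
PUT /_search/pipeline/my_pipeline
{
  "request_processors": [
    {
      "filter_query": {
        "tag": "tag1",
        "description": "Illustrative only: this request body omits the response processors",
        "query": {
          "term": {
            "visibility": "public"
          }
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

To keep existing processors when changing a pipeline, include the full list of processors in the update request.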

## Search pipeline versions

When creating your pipeline, you can specify a version for it in the `version` parameter:

```json
PUT _search/pipeline/my_pipeline
{
  "version": 1234,
  "request_processors": [
    {
      "script": {
        "source": """
           if (ctx._source['size'] > 100) {
             ctx._source['explain'] = false;
           }
         """
      }
    }
  ]
}
```
{% include copy-curl.html %}

The version is provided in all subsequent responses to `get pipeline` requests:

```json
GET _search/pipeline/my_pipeline
```

The response contains the pipeline version:

<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}

```json
{
  "my_pipeline": {
    "version": 1234,
    "request_processors": [
      {
        "script": {
          "source": """
           if (ctx._source['size'] > 100) {
             ctx._source['explain'] = false;
           }
         """
        }
      }
    ]
  }
}
```
</details>
2 changes: 1 addition & 1 deletion _search-plugins/search-pipelines/filter-query-processor.md
@@ -20,7 +20,7 @@ Field | Data type | Description
`query` | Object | A query in query domain-specific language (DSL). For a list of OpenSearch query types, see [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/). Required.
`tag` | String | The processor's identifier. Optional.
`description` | String | A description of the processor. Optional.
-`ignore_failure` | Boolean | If `true`, OpenSearch [ignores a failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.
+`ignore_failure` | Boolean | If `true`, OpenSearch [ignores any failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.

## Example

173 changes: 4 additions & 169 deletions _search-plugins/search-pipelines/index.md
@@ -29,13 +29,10 @@ Both request and response processing for the pipeline are performed on the coord

To learn more about available search processors, see [Search processors]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/search-processors/).

## Creating a search pipeline

Search pipelines are stored in the cluster state. To create a search pipeline, you must configure an ordered list of processors in your OpenSearch cluster. You can have more than one processor of the same type in the pipeline. Each processor has a `tag` identifier that distinguishes it from the others. Tagging a specific processor can be helpful for debugging error messages, especially if you add multiple processors of the same type.
## Example

#### Example request

The following request creates a search pipeline with a `filter_query` request processor that uses a term query to return only public messages and a response processor that renames the field `message` to `notification`:
To create a search pipeline, send a request to the search pipeline endpoint specifying an ordered list of processors, which will be applied sequentially:

```json
PUT /_search/pipeline/my_pipeline
@@ -65,26 +62,7 @@ PUT /_search/pipeline/my_pipeline
```
{% include copy-curl.html %}

### Ignoring processor failures

By default, a search pipeline stops if one of its processors fails. If you want the pipeline to continue running when a processor fails, you can set the `ignore_failure` parameter for that processor to `true` when creating the pipeline:

```json
"filter_query" : {
  "tag" : "tag1",
  "description" : "This processor is going to restrict to publicly visible documents",
  "ignore_failure": true,
  "query" : {
    "term": {
      "visibility": "public"
    }
  }
}
```

If the processor fails, OpenSearch logs the failure and continues to run all remaining processors in the search pipeline. To check whether there were any failures, you can use [search pipeline metrics](#search-pipeline-metrics).

## Using search pipelines
For more information about creating and updating a search pipeline, see [Creating a search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/).

To use a pipeline with a query, specify the pipeline name in the `search_pipeline` query parameter:

@@ -95,151 +73,8 @@ GET /my_index/_search?search_pipeline=my_pipeline

Alternatively, you can use a temporary pipeline with a request or set a default pipeline for an index. To learn more, see [Using a search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/using-search-pipeline/).

## Retrieving search pipelines

To retrieve the details of an existing search pipeline, use the Search Pipeline API.

To view all search pipelines, use the following request:

```json
GET /_search/pipeline
```
{% include copy-curl.html %}

The response contains the pipeline that you set up in the previous section:
<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}

```json
{
  "my_pipeline" : {
    "request_processors" : [
      {
        "filter_query" : {
          "tag" : "tag1",
          "description" : "This processor is going to restrict to publicly visible documents",
          "query" : {
            "term" : {
              "visibility" : "public"
            }
          }
        }
      }
    ]
  }
}
```
</details>

To view a particular pipeline, specify the pipeline name as a path parameter:

```json
GET /_search/pipeline/my_pipeline
```
{% include copy-curl.html %}

You can also use wildcard patterns to view a subset of pipelines, for example:

```json
GET /_search/pipeline/my*
```
{% include copy-curl.html %}

## Updating a search pipeline

To update a search pipeline dynamically, replace the search pipeline using the Search Pipeline API.

#### Example request

The following request upserts `my_pipeline` by adding a `filter_query` request processor and a `rename_field` response processor:

```json
PUT /_search/pipeline/my_pipeline
{
  "request_processors": [
    {
      "filter_query": {
        "tag": "tag1",
        "description": "This processor returns only publicly visible documents",
        "query": {
          "term": {
            "visibility": "public"
          }
        }
      }
    }
  ],
  "response_processors": [
    {
      "rename_field": {
        "field": "message",
        "target_field": "notification"
      }
    }
  ]
}
```
{% include copy-curl.html %}

## Search pipeline versions

When creating your pipeline, you can specify a version for it in the `version` parameter:
To learn about retrieving details for an existing search pipeline, see [Retrieving search pipelines]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/retrieving-search-pipeline/).

```json
PUT _search/pipeline/my_pipeline
{
  "version": 1234,
  "request_processors": [
    {
      "script": {
        "source": """
           if (ctx._source['size'] > 100) {
             ctx._source['explain'] = false;
           }
         """
      }
    }
  ]
}
```
{% include copy-curl.html %}

The version is provided in all subsequent responses to `get pipeline` requests:

```json
GET _search/pipeline/my_pipeline
```

The response contains the pipeline version:

<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}

```json
{
  "my_pipeline": {
    "version": 1234,
    "request_processors": [
      {
        "script": {
          "source": """
           if (ctx._source['size'] > 100) {
             ctx._source['explain'] = false;
           }
         """
        }
      }
    ]
  }
}
```
</details>

## Search pipeline metrics
