[DOC] Add drop processor (#5767) (#6627)
* Add drop processor doc to address content gap

* Address tech review feedback

* Address tech review changes

* Delete _ingest-pipelines/processors/date.md

* Revert "Delete _ingest-pipelines/processors/date.md"

This reverts commit 73296f5.

* Update _ingest-pipelines/processors/drop.md

* Update _ingest-pipelines/processors/drop.md

* Update _ingest-pipelines/processors/drop.md

* Update _ingest-pipelines/processors/drop.md

* Update _ingest-pipelines/processors/drop.md

* Update drop.md

---------

(cherry picked from commit b38ba75)

Signed-off-by: Melissa Vagi <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Nathan Bower <[email protected]>
3 people authored Mar 7, 2024
1 parent a666c5f commit 8d1e284
Showing 2 changed files with 124 additions and 1 deletion.
2 changes: 1 addition & 1 deletion _ingest-pipelines/processors/date.md
@@ -11,7 +11,7 @@ redirect_from:

The `date` processor is used to parse dates from document fields and to add the parsed data to a new field. By default, the parsed data is stored in the `@timestamp` field.

-## Syntax
+## Syntax example

The following is the syntax for the `date` processor:

123 changes: 123 additions & 0 deletions _ingest-pipelines/processors/drop.md
@@ -0,0 +1,123 @@
---
layout: default
title: Drop
parent: Ingest processors
nav_order: 70
---

# Drop processor

The `drop` processor is used to discard documents without indexing them. This can be useful for preventing documents from being indexed based on certain conditions. For example, you might use a `drop` processor to prevent documents that are missing important fields or contain sensitive information from being indexed.

The `drop` processor does not raise any errors when it discards documents, making it useful for preventing indexing problems without cluttering your OpenSearch logs with error messages.

## Syntax example

The following is the syntax for the `drop` processor:

```json
{
  "drop": {
    "if": "ctx.foo == 'bar'"
  }
}
```
{% include copy-curl.html %}

## Configuration parameters

The following table lists the required and optional parameters for the `drop` processor.

Parameter | Required/Optional | Description |
|-----------|-----------|-----------|
`description` | Optional | A brief description of the processor. |
`if` | Optional | A condition for running the processor. |
`ignore_failure` | Optional | If set to `true`, failures are ignored. Default is `false`. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/) for more information. |
`on_failure` | Optional | A list of processors to run if the processor fails. See [Handling pipeline failures]({{site.url}}{{site.baseurl}}/ingest-pipelines/pipeline-failures/) for more information. |
`tag` | Optional | An identifier tag for the processor. Useful for distinguishing between processors of the same type when debugging. |
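
As a minimal sketch of how the optional parameters in the preceding table can be combined, the following processor definition sets `description`, `tag`, and `if` together. The field name `env` and the values shown are hypothetical and are used only for illustration:

```json
{
  "drop": {
    "description": "Drop documents that are flagged as test data",
    "tag": "drop-test-docs",
    "if": "ctx.env == 'test'"
  }
}
```
{% include copy-curl.html %}

Setting a `tag` is most helpful when a pipeline contains more than one `drop` processor, because the tag identifies which processor is being referred to when debugging.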

## Using the processor

Follow these steps to use the processor in a pipeline.

**Step 1: Create a pipeline**

The following query creates a pipeline, named `drop-pii`, that uses the `drop` processor to prevent a document containing personally identifiable information (PII) from being indexed:

```json
PUT /_ingest/pipeline/drop-pii
{
  "description": "Pipeline that prevents PII from being indexed",
  "processors": [
    {
      "drop": {
        "if": "ctx.user_info.contains('password') || ctx.user_info.contains('credit card')"
      }
    }
  ]
}
```
{% include copy-curl.html %}
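
The condition in the preceding pipeline assumes that every incoming document contains a `user_info` string field. If some documents might not contain that field, calling `contains` on a missing value would likely cause a script error. The following variant is a sketch only (the pipeline name `drop-pii-null-safe` is hypothetical) that adds a null check before the `contains` calls so that documents without `user_info` pass through the processor unchanged:

```json
PUT /_ingest/pipeline/drop-pii-null-safe
{
  "description": "Pipeline that prevents PII from being indexed and tolerates a missing user_info field",
  "processors": [
    {
      "drop": {
        "if": "ctx.user_info != null && (ctx.user_info.contains('password') || ctx.user_info.contains('credit card'))"
      }
    }
  ]
}
```
{% include copy-curl.html %}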

**Step 2 (Optional): Test the pipeline**

It is recommended that you test your pipeline before ingesting documents.
{: .tip}

To test the pipeline, run the following query:

```json
POST _ingest/pipeline/drop-pii/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "user_info": "Sensitive information including credit card"
      }
    }
  ]
}
```
{% include copy-curl.html %}

#### Response

The following example response confirms that the pipeline is working as expected (the document has been dropped):

```json
{
  "docs": [
    null
  ]
}
```
{% include copy-curl.html %}
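
It can also be useful to confirm that the pipeline does not drop documents that fall outside the condition. The following request is a sketch that simulates the `drop-pii` pipeline against a document whose `user_info` value (a hypothetical example) contains neither `password` nor `credit card`:

```json
POST _ingest/pipeline/drop-pii/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "2",
      "_source": {
        "user_info": "Public profile information"
      }
    }
  ]
}
```
{% include copy-curl.html %}

In this case, the `docs` array in the response should contain the document itself rather than `null`, indicating that the document would be indexed.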

**Step 3: Ingest a document**

The following query attempts to ingest a document into an index named `testindex1` through the `drop-pii` pipeline:

```json
PUT testindex1/_doc/1?pipeline=drop-pii
{
  "user_info": "Sensitive information including credit card"
}
```
{% include copy-curl.html %}

The following response confirms that the document with the ID of `1` was not indexed:

```json
{
  "_index": "testindex1",
  "_id": "1",
  "_version": -3,
  "result": "noop",
  "_shards": {
    "total": 0,
    "successful": 0,
    "failed": 0
  }
}
```
{% include copy-curl.html %}
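
As an optional follow-up check, not part of the steps above, you can try to retrieve the document by its ID. This sketch assumes the same index name and document ID used in the ingest request:

```json
GET testindex1/_doc/1
```
{% include copy-curl.html %}

Because the document was dropped, the request should not return it. Depending on whether `testindex1` already exists, the response is expected to be either a not-found result for the document or an error indicating that the index does not exist.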
