fluent: docs: update grep for style
Signed-off-by: Lynette Miles <[email protected]>
esmerel committed Sep 12, 2024
1 parent 59b0598 commit 1cd9857
Showing 1 changed file with 66 additions and 34 deletions.
100 changes: 66 additions & 34 deletions pipeline/filters/grep.md
@@ -1,28 +1,33 @@
---
description: Select or exclude records per patterns
description: Select or exclude records using patterns
---

# Grep

The _Grep Filter_ plugin allows you to match or exclude specific records based on regular expression patterns for values or nested values.
The _Grep Filter_ plugin lets you match or exclude specific records based on
regular expression patterns for values or nested values.

## Configuration Parameters
## Configuration parameters

The plugin supports the following configuration parameters:

| Key | Value Format | Description |
| :--- | :--- | :--- |
| Regex | KEY REGEX | Keep records in which the content of KEY matches the regular expression. |
| Exclude | KEY REGEX | Exclude records in which the content of KEY matches the regular expression. |
| Logical_Op| Operation | Specify which logical operator to use. `AND` , `OR` and `legacy` are allowed as an Operation. Default is `legacy` for backward compatibility. In `legacy` mode the behaviour is either AND or OR depending whether the `grep` is including (uses AND) or excluding (uses OR). Only available from 2.1+. |
| Key | Value Format | Description |
| ------------ | ------------ | ----------- |
| `Regex` | KEY REGEX | Keep records where the content of KEY matches the regular expression. |
| `Exclude` | KEY REGEX | Exclude records where the content of KEY matches the regular expression. |
| `Logical_Op` | Operation | Specify a logical operator: `AND`, `OR`, or `legacy` (default). In `legacy` mode, the behavior is `AND` when the filter includes records (`Regex`) and `OR` when it excludes records (`Exclude`). Available in version 2.1 and later. |
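
The way the three `Logical_Op` modes combine rules can be sketched in a few lines of Python. This is an illustration of the semantics described in the table, not Fluent Bit's implementation: the function names, the rule tuples, and the sample record are all hypothetical, and a missing key is assumed to count as "no match".

```python
import re


def rule_matches(record, key, pattern):
    # Assumption: a missing key is treated as "no match".
    value = record.get(key)
    return value is not None and re.search(pattern, str(value)) is not None


def keep_record(record, rules, logical_op="legacy"):
    """Return True if the record passes the filter.

    rules is a list of (kind, key, pattern) tuples, where kind is
    "regex" or "exclude". Hypothetical model of the documented behavior.
    """
    if not rules:
        return True

    op = logical_op.lower()
    if op == "legacy":
        # Rules are checked in order: every Regex rule must match (AND),
        # and any matching Exclude rule drops the record (OR).
        for kind, key, pattern in rules:
            matched = rule_matches(record, key, pattern)
            if kind == "regex" and not matched:
                return False
            if kind == "exclude" and matched:
                return False
        return True

    if op not in ("and", "or"):
        raise ValueError(f"unknown Logical_Op: {logical_op}")

    kinds = {kind for kind, _, _ in rules}
    if len(kinds) > 1:
        # Mirrors the documented restriction on mixing rule types.
        raise ValueError("Regex and Exclude cannot be mixed when Logical_Op is set")

    results = [rule_matches(record, key, pattern) for _, key, pattern in rules]
    combined = all(results) if op == "and" else any(results)
    # Regex rules keep the record on a combined match; Exclude rules drop it.
    return combined if kinds == {"regex"} else not combined


# Hypothetical record and rules, for illustration only.
record = {"endpoint": "localhost", "value": "something"}
rules = [("regex", "value", "something"), ("regex", "endpoint", "nomatch")]
print(keep_record(record, rules, "OR"))   # one rule matches, so the record is kept
print(keep_record(record, rules, "AND"))  # not every rule matches, so it is dropped
```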

#### Record Accessor Enabled
### Record Accessor Enabled

This plugin enables the [Record Accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md) feature to specify the KEY. Using the _record accessor_ is suggested if you want to match values against nested values.
Enable the [Record Accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md)
feature to specify the KEY. Use the record accessor to match values against nested
values.

## Getting Started
## Filter records

In order to start filtering records, you can run the filter from the command line or through the configuration file. The following example assumes that you have a file called `lines.txt` with the following content:
To start filtering records, run the filter from the command line or through the
configuration file. The following example assumes that you have a file named
`lines.txt` with the following content:

```text
{"log": "aaa"}
@@ -35,20 +40,25 @@ In order to start filtering records, you can run the filter from the command lin
{"log": "ggg"}
```

### Command Line
### Command line

> Note: using the command line mode need special attention to quote the regular expressions properly. It's suggested to use a configuration file.
When using the command line, take care to quote the regular expressions properly.
Using a configuration file might be easier.

The following command will load the _tail_ plugin and read the content of `lines.txt` file. Then the _grep_ filter will apply a regular expression rule over the _log_ field \(created by tail plugin\) and only _pass_ the records which field value starts with _aa_:
The following command loads the [tail](../../pipeline/inputs/tail) plugin and
reads the content of the `lines.txt` file. Then the `grep` filter applies a regular
expression rule over the `log` field created by the `tail` plugin and only passes
records with a field value starting with `aa`:

```text
$ bin/fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o stdout
```
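
The effect of a `Regex` rule such as `log aa` can be approximated in Python. This is a rough sketch, not the plugin's code: it assumes each line is already available as a record with a `log` key, and it assumes the pattern is applied as an unanchored search, so `aa` would also match in the middle of a value. If only values that start with `aa` should pass, anchoring the pattern as `^aa` makes that explicit. The `baa` record is hypothetical and added only to show the difference.

```python
import re


def grep_regex(records, key, pattern):
    """Keep only the records whose `key` value matches `pattern`."""
    compiled = re.compile(pattern)
    kept = []
    for record in records:
        value = record.get(key)
        # Assumption: unanchored search, and missing keys never match.
        if isinstance(value, str) and compiled.search(value):
            kept.append(record)
    return kept


# The first two records come from lines.txt; "baa" is a hypothetical extra.
records = [{"log": "aaa"}, {"log": "ggg"}, {"log": "baa"}]

print(grep_regex(records, "log", "aa"))   # keeps "aaa" and "baa"
print(grep_regex(records, "log", "^aa"))  # keeps only "aaa"
```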

### Configuration File
### Configuration file

{% tabs %}
{% tab title="fluent-bit.conf" %}

```python
[SERVICE]
parsers_file /path/to/parsers.conf
@@ -67,9 +77,11 @@ $ bin/fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o
name stdout
match *
```

{% endtab %}

{% tab title="fluent-bit.yaml" %}

```yaml
service:
parsers_file: /path/to/parsers.conf
@@ -87,14 +99,21 @@ pipeline:
match: '*'

```

{% endtab %}
{% endtabs %}

The filter allows to use multiple rules which are applied in order, you can have many _Regex_ and _Exclude_ entries as required.
The filter allows you to use multiple rules which are applied in order. You can
have as many `Regex` and `Exclude` entries as required.

### Nested fields example

If you want to match or exclude records based on nested values, you can use a [Record Accessor ](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md)format as the KEY name. Consider the following record example:
To match or exclude records based on nested values, you can use
[Record
Accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md)
format as the `KEY` name.

Consider the following record example:

```javascript
{
@@ -113,40 +132,45 @@ If you want to match or exclude records based on nested values, you can use a [R
}
```

if you want to exclude records that match given nested field \(for example `kubernetes.labels.app`\), you can use the following rule:
For example, to exclude records that match the nested field `kubernetes.labels.app`,
use the following rule:

{% tabs %}
{% tab title="fluent-bit.conf" %}

```python
[FILTER]
Name grep
Match *
Exclude $kubernetes['labels']['app'] myapp
```
{% endtab %}

{% endtab %}
{% tab title="fluent-bit.yaml" %}

```yaml
filters:
- name: grep
match: '*'
exclude: $kubernetes['labels']['app'] myapp
```
{% endtab %}
{% endtabs %}
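
To make the nested lookup concrete, here is a small Python sketch of how a key such as `$kubernetes['labels']['app']` could be resolved against a record and used for an exclude decision. It is a simplified model under stated assumptions: `parse_accessor` is a hypothetical helper that handles only this `$name['key']['key']` form, not the full Record Accessor syntax, and the sample record is a trimmed, hypothetical stand-in.

```python
import re


def parse_accessor(expr):
    """Split "$a['b']['c']" into ["a", "b", "c"] (simplified, hypothetical parser)."""
    head = re.match(r"\$([A-Za-z0-9_]+)", expr)
    if head is None:
        raise ValueError(f"unsupported accessor: {expr}")
    return [head.group(1)] + re.findall(r"\['([^']+)'\]", expr[head.end():])


def resolve(record, path):
    """Walk nested dictionaries; return None if any key along the path is missing."""
    current = record
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def is_excluded(record, accessor, pattern):
    value = resolve(record, parse_accessor(accessor))
    # Only an existing string value that matches the pattern excludes the record.
    return isinstance(value, str) and re.search(pattern, value) is not None


# Hypothetical, trimmed record for illustration.
record = {"kubernetes": {"labels": {"app": "myapp"}}}
print(is_excluded(record, "$kubernetes['labels']['app']", "myapp"))
```
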
### Excluding records missing/invalid fields
It may be that in your processing pipeline you want to drop records that are missing certain keys.
### Excluding records with missing or invalid fields
A simple way to do this is just to `exclude` with a regex that matches anything, a missing key will fail this check.
You might want to drop records that are missing certain keys.
Here is an example that checks for a specific valid value for the key as well:
One way to do this is to use a `Regex` rule with a pattern that matches any valid
value. A record with a missing key fails this check and is excluded.

The following example checks for a specific valid value for the key:

{% tabs %}
{% tab title="fluent-bit.conf" %}
```

```text
# Use Grep to verify the contents of the iot_timestamp value.
# If the iot_timestamp key does not exist, this will fail
# and exclude the row.
@@ -156,30 +180,34 @@ Here is an example that checks for a specific valid value for the key as well:
Match iots_thread.*
Regex iot_timestamp ^\d{4}-\d{2}-\d{2}
```
{% endtab %}

{% endtab %}
{% tab title="fluent-bit.yaml" %}

```yaml
filters:
- name: grep
alias: filter-iots-grep
match: iots_thread.*
regex: iot_timestamp ^\d{4}-\d{2}-\d{2}
```

{% endtab %}
{% endtabs %}

The specified key `iot_timestamp` must match the expected expression - if it does not or is missing/empty then it will be excluded.
The specified key `iot_timestamp` must match the expected expression. If it doesn't,
or if it's missing or empty, the record is excluded.
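
The same check can be modeled in Python to show which records survive. This is only a sketch of the behavior described above, not the plugin's code: the sample records are hypothetical, and Python's `re` module stands in for the regular expression engine that Fluent Bit uses.

```python
import re

# Same pattern as the Regex rule above: the value must start with YYYY-MM-DD.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}")


def passes(record):
    """Return True only if iot_timestamp exists and matches the pattern."""
    value = record.get("iot_timestamp")
    return isinstance(value, str) and TIMESTAMP.search(value) is not None


# Hypothetical records covering the valid, empty, missing, and invalid cases.
samples = [
    {"iot_timestamp": "2023-01-22T09:46:49Z"},
    {"iot_timestamp": ""},
    {"device": "sensor-1"},
    {"iot_timestamp": "not-a-date"},
]

for sample in samples:
    print(sample, "kept" if passes(sample) else "excluded")
```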

### Multiple conditions

If you want to set multiple `Regex` or `Exclude`, you can use `Logical_Op` property to use logical conjunction or disjunction.

Note: If `Logical_Op` is set, setting both 'Regex' and `Exclude` results in an error.
To combine multiple `Regex` or `Exclude` rules, use the `Logical_Op` property
to apply a logical conjunction (`AND`) or disjunction (`OR`).

If `Logical_Op` is set, setting both `Regex` and `Exclude` results in an error.

{% tabs %}
{% tab title="fluent-bit.conf" %}

```python
[INPUT]
Name dummy
@@ -196,9 +224,11 @@ Note: If `Logical_Op` is set, setting both 'Regex' and `Exclude` results in an e
[OUTPUT]
Name stdout
```

{% endtab %}

{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:
inputs:
@@ -215,11 +245,13 @@ pipeline:
outputs:
- name: stdout
```

{% endtab %}
{% endtabs %}

Output will be
```
The output looks similar to:

```text
Fluent Bit v2.0.9
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
@@ -236,4 +268,4 @@ Fluent Bit v2.0.9
[2023/01/22 09:46:49] [ info] [output:stdout:stdout.0] worker #0 started
[0] dummy: [1674348410.558341857, {"endpoint"=>"localhost", "value"=>"something"}]
[0] dummy: [1674348411.546425499, {"endpoint"=>"localhost", "value"=>"something"}]
```
```
