Merge pull request #1456 from fluent/lynettemiles/sc-108045/update-ma…
…nual-pipeline-parsers-decoders-fluent

docs: decoders: updating decoders page for grammar and style
esmerel authored Sep 11, 2024
2 parents b83439e + cd90884 commit d00a9dd
Showing 1 changed file with 40 additions and 30 deletions.
70 changes: 40 additions & 30 deletions pipeline/parsers/decoders.md
@@ -1,29 +1,35 @@
# Decoders

There are certain cases where the log messages being parsed contains encoded data, a typical use case can be found in containerized environments with Docker: application logs it data in JSON format but becomes an escaped string, Consider the following example
There are cases where the log messages being parsed contain encoded data. A typical
use case can be found in containerized environments with Docker. The application
logs its data in JSON format, but Docker stores it as an escaped string.

Original message generated by the application:
Consider the following message generated by the application:

```text
{"status": "up and running"}
```

Then the Docker log message become encapsulated as follows:
Docker encapsulates the message in a log record like this:

```text
{"log":"{\"status\": \"up and running\"}\r\n","stream":"stdout","time":"2018-03-09T01:01:44.851160855Z"}
```

as you can see the original message is handled as an escaped string. Ideally in Fluent Bit we would like to keep having the original structured message and not a string.
The original message is handled as an escaped string. Fluent Bit wants to use the
original structured message and not a string.
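
As an illustration of the problem (plain Python, not Fluent Bit code), the following sketch parses the Docker line from the example above. It shows that after one JSON parse the `log` value is still a string, and that a second decode is needed to recover the structured message.

```python
import json

# The Docker-wrapped line from the example above, as it appears in the log file.
raw = r'{"log":"{\"status\": \"up and running\"}\r\n","stream":"stdout","time":"2018-03-09T01:01:44.851160855Z"}'

record = json.loads(raw)
# After the first parse, "log" is still a plain string, not a map.
print(type(record["log"]).__name__)

# Decoding that string a second time recovers the application's message.
inner = json.loads(record["log"])
print(inner["status"])
```

This second decoding step is the job a decoder performs on a field.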

## Getting Started

Decoders are a built-in feature available through the Parsers file, each Parser definition can optionally set one or multiple decoders. There are two type of decoders type:
Decoders are a built-in feature available through the Parsers file. Each parser
definition can optionally set one or more decoders. There are two types of decoders:

* Decode\_Field: if the content can be decoded in a structured message, append that structure message \(keys and values\) to the original log message.
* Decode\_Field\_As: any content decoded \(unstructured or structured\) will be replaced in the same key/value, no extra keys are added.
- `Decode_Field`: If the content can be decoded into a structured message, append
the structured message (keys and values) to the original log message.
- `Decode_Field_As`: Any decoded content (unstructured or structured) replaces
the value of the same key, and no extra keys are added.
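
The difference between the two types can be modeled with a short Python sketch. This is not Fluent Bit's implementation: `decode_field` and `decode_field_as` are hypothetical helpers, both assume the `json` decoder, and the sketch assumes the original key is kept when decoded keys are appended, which the text above doesn't state.

```python
import json

def decode_field(record, key):
    # Hypothetical model of Decode_Field: append the decoded keys and values
    # to the record. Keeping the original key is an assumption of this sketch.
    try:
        decoded = json.loads(record[key])
    except (KeyError, TypeError, ValueError):
        return dict(record)  # decoding failed: record unchanged
    if not isinstance(decoded, dict):
        return dict(record)  # not a structured message: nothing to append
    merged = dict(record)
    merged.update(decoded)
    return merged

def decode_field_as(record, key):
    # Hypothetical model of Decode_Field_As: replace the value under the
    # same key and add no extra keys.
    out = dict(record)
    try:
        out[key] = json.loads(record[key])
    except (KeyError, TypeError, ValueError):
        pass  # decoding failed: value left as-is
    return out

record = {"log": '{"status": "up and running"}', "stream": "stdout"}
appended = decode_field(record, "log")
replaced = decode_field_as(record, "log")
print(sorted(appended))
print(replaced["log"])
```

In the first result the record gains a `status` key. In the second, the record keeps the same keys and `log` holds a map instead of a string.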

Our pre-defined Docker Parser have the following definition:
Our pre-defined Docker parser has the following definition:

```text
[PARSER]
@@ -37,54 +43,59 @@ Our pre-defined Docker Parser have the following definition:
Decode_Field_As escaped log
```

Each line in the parser with a key _Decode\_Field_ instruct the parser to apply a specific decoder on a given field, optionally it offer the option to take an extra action if the decoder cannot succeed.
Each line in the parser with a key `Decode_Field` instructs the parser to apply
a specific decoder on a given field. Optionally, it offers the option to take an
extra action if the decoder doesn't succeed.

### Decoders
### Decoder options

| Name | Description |
| :--- | :--- |
| json | handle the field content as a JSON map. If it find a JSON map it will replace the content with a structured map. |
| escaped | decode an escaped string. |
| escaped\_utf8 | decode a UTF8 escaped string. |
| Name | Description |
| -------------- | ----------- |
| `json` | Handle the field content as a JSON map. If it finds a JSON map, it replaces the content with a structured map. |
| `escaped` | Decode an escaped string. |
| `escaped_utf8` | Decode a UTF8 escaped string. |

### Optional Actions

By default if a decoder fails to decode the field or want to try a next decoder, is possible to define an optional action. Available actions are:
If a decoder fails to decode the field, or if you want to try another decoder, you can
define an optional action. Available actions are:

| Name | Description |
| :--- | :--- |
| try\_next | if the decoder failed, apply the next Decoder in the list for the same field. |
| do\_next | if the decoder succeeded or failed, apply the next Decoder in the list for the same field. |
| ---- | ----------- |
| `try_next` | If the decoder failed, apply the next decoder in the list for the same field. |
| `do_next` | If the decoder succeeded or failed, apply the next decoder in the list for the same field. |

Note that actions are affected by some restrictions:
Actions are affected by some restrictions:

* on Decode\_Field\_As, if succeeded, another decoder of the same type in the same field can be applied only if the data continues being an unstructured message \(raw text\).
* on Decode\_Field, if succeeded, can only be applied once for the same field. By nature Decode\_Field aims to decode a structured message.
- `Decode_Field_As`: If successful, another decoder of the same type in the same
field can be applied only if the data continues being an unstructured message (raw text).
- `Decode_Field`: If successful, it can only be applied once for the same field.
`Decode_Field` is intended to decode a structured message.
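
The way `try_next` and `do_next` chain decoders on one field can be sketched in Python. This is a simplified model under stated assumptions, not Fluent Bit's code: `apply_decoders`, `decode_json`, and `decode_escaped` are hypothetical names, the toy decoders handle only the cases shown, and a line with no action is assumed to end the chain.

```python
import json

def decode_json(value):
    # Toy "json" decoder: succeeds only when the text is a JSON map.
    decoded = json.loads(value)  # raises ValueError on invalid JSON
    if not isinstance(decoded, dict):
        raise ValueError("not a JSON map")
    return decoded

def decode_escaped(value):
    # Toy "escaped" decoder: it only turns \" into " for this illustration.
    return value.replace('\\"', '"')

def apply_decoders(value, chain):
    # chain: list of (decoder, action); action is None, "try_next" or "do_next".
    for decoder, action in chain:
        if not isinstance(value, str):
            break  # already structured: no further decoder applies
        try:
            value = decoder(value)
            succeeded = True
        except ValueError:
            succeeded = False
        # do_next always continues; try_next continues only after a failure.
        if action == "do_next" or (action == "try_next" and not succeeded):
            continue
        break
    return value

chain_try = [(decode_json, "try_next"), (decode_escaped, None)]
chain_do = [(decode_escaped, "do_next"), (decode_json, None)]

print(apply_decoders('{"a": 1}', chain_try))
print(apply_decoders('say \\"hi\\"', chain_try))
print(apply_decoders('{\\"status\\": \\"up\\"}', chain_do))
```

With `chain_try`, the second decoder runs only when the JSON decoder fails. With `chain_do`, the escaped text is first un-escaped and then parsed as JSON regardless of the first result.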

### Examples

### escaped\_utf8
#### `escaped_utf8`

Example input \(from `/path/to/log.log` in configuration below\)
Example input from `/path/to/log.log`:

```text
{"log":"\u0009Checking indexes...\n","stream":"stdout","time":"2018-02-19T23:25:29.1845444Z"}
{"log":"\u0009\u0009Validated: _audit _internal _introspection _telemetry _thefishbucket history main snmp_data summary\n","stream":"stdout","time":"2018-02-19T23:25:29.1845536Z"}
{"log":"\u0009Done\n","stream":"stdout","time":"2018-02-19T23:25:29.1845622Z"}
```

Example output
Example output:

```text
[24] tail.0: [1519082729.184544400, {"log"=>" Checking indexes...
[24] tail.0: [1519082729.184544400, {"log"=>" Checking indexes...
", "stream"=>"stdout", "time"=>"2018-02-19T23:25:29.1845444Z"}]
[25] tail.0: [1519082729.184553600, {"log"=>" Validated: _audit _internal _introspection _telemetry _thefishbucket history main snmp_data summary
", "stream"=>"stdout", "time"=>"2018-02-19T23:25:29.1845536Z"}]
[26] tail.0: [1519082729.184562200, {"log"=>" Done
[26] tail.0: [1519082729.184562200, {"log"=>" Done
", "stream"=>"stdout", "time"=>"2018-02-19T23:25:29.1845622Z"}]
```
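
To see which characters the `\u0009` and `\n` escapes in the example input stand for, the first input line can be parsed with Python's `json` module. This only illustrates the expected result (a leading tab and a trailing newline in `log`). It doesn't reproduce how Fluent Bit's `escaped_utf8` decoder works internally.

```python
import json

# First line of the example input, copied verbatim.
line = r'{"log":"\u0009Checking indexes...\n","stream":"stdout","time":"2018-02-19T23:25:29.1845444Z"}'

record = json.loads(line)
# repr() makes the decoded tab and newline visible.
print(repr(record["log"]))
```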

Configuration file
Decoder configuration file:

```text
[SERVICE]
@@ -100,7 +111,7 @@ Configuration file
Match *
```

The `fluent-bit-parsers.conf` file,
The `fluent-bit-parsers.conf` file:

```text
[PARSER]
@@ -110,4 +121,3 @@ The `fluent-bit-parsers.conf` file,
Time_Format %Y-%m-%dT%H:%M:%S %z
Decode_Field_as escaped_utf8 log
```
