administration: config: YAML: document all sections #1513

Merged
merged 2 commits into from
Nov 11, 2024
11 changes: 9 additions & 2 deletions SUMMARY.md
@@ -49,15 +49,22 @@
## Administration

* [Configuring Fluent Bit](administration/configuring-fluent-bit/README.md)
  * [YAML Configuration](administration/configuring-fluent-bit/yaml/README.md)
    * [Service](administration/configuring-fluent-bit/yaml/service-section.md)
    * [Parsers](administration/configuring-fluent-bit/yaml/parsers-section.md)
    * [Multiline Parsers](administration/configuring-fluent-bit/yaml/multiline-parsers-section.md)
    * [Pipeline](administration/configuring-fluent-bit/yaml/pipeline-section.md)
    * [Plugins](administration/configuring-fluent-bit/yaml/plugins-section.md)
    * [Upstream Servers](administration/configuring-fluent-bit/yaml/upstream-servers-section.md)
    * [Environment Variables](administration/configuring-fluent-bit/yaml/environment-variables-section.md)
    * [Includes](administration/configuring-fluent-bit/yaml/includes-section.md)
  * [Classic mode](administration/configuring-fluent-bit/classic-mode/README.md)
    * [Format and Schema](administration/configuring-fluent-bit/classic-mode/format-schema.md)
    * [Configuration File](administration/configuring-fluent-bit/classic-mode/configuration-file.md)
    * [Variables](administration/configuring-fluent-bit/classic-mode/variables.md)
    * [Commands](administration/configuring-fluent-bit/classic-mode/commands.md)
    * [Upstream Servers](administration/configuring-fluent-bit/classic-mode/upstream-servers.md)
    * [Record Accessor](administration/configuring-fluent-bit/classic-mode/record-accessor.md)
  * [YAML Configuration](administration/configuring-fluent-bit/yaml/README.md)
    * [Configuration File](administration/configuring-fluent-bit/yaml/configuration-file.md)
  * [Unit Sizes](administration/configuring-fluent-bit/unit-sizes.md)
  * [Multiline Parsing](administration/configuring-fluent-bit/multiline-parsing.md)
* [Transport Security](administration/transport-security.md)
11 changes: 5 additions & 6 deletions administration/configuring-fluent-bit/README.md
@@ -1,14 +1,13 @@
# Configuring Fluent Bit

Fluent Bit supports these configuration formats:
Currently, Fluent Bit supports two configuration formats:

- [Classic mode](classic-mode/README.md)
- [YAML](yaml/README.md) (Fluent Bit 2.0 or greater)
* [YAML](yaml/README.md): standard configuration format as of v3.2.
* [Classic mode](classic-mode/README.md): to be deprecated at the end of 2025.

## CLI flags
## Command line interface

Fluent Bit also supports a CLI with various flags for the available configuration
options.
Fluent Bit exposes most of its features through the command line interface. Run it with the `-h` option to get a list of the available options:

```shell
$ docker run --rm -it fluent/fluent-bit --help
45 changes: 43 additions & 2 deletions administration/configuring-fluent-bit/yaml/README.md
@@ -1,3 +1,44 @@
# Fluent Bit YAML configuration
# Fluent Bit YAML Configuration

The YAML configuration feature was introduced as experimental in Fluent Bit 1.9 and has been production ready since Fluent Bit 2.0.
## Before You Get Started

Fluent Bit traditionally offered a `classic` configuration mode, a custom configuration format that we are gradually phasing out. While `classic` mode has served well for many years, it has several limitations. Its basic design only supports grouping sections with key-value pairs and lacks the ability to handle sub-sections or complex data structures like lists.

YAML, now a mainstream configuration format, has become essential in a cloud ecosystem where everything is configured this way. To minimize friction and provide a more intuitive experience for creating data pipelines, we strongly encourage users to transition to YAML. The YAML format enables features, such as processors, that are not possible to configure in `classic` mode.

As of Fluent Bit v3.2, you can configure everything in YAML.

## List of Available Sections

Configuring Fluent Bit with YAML introduces the following root-level sections:

| Section Name |Description |
|----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
| `service` | Describes the global configuration for the Fluent Bit service. This section is optional; if not set, default values will apply. Only one `service` section can be defined. |
| `parsers` | Lists parsers to be used by components like inputs, processors, filters, or output plugins. You can define multiple `parsers` sections, which can also be loaded from external files included in the main YAML configuration. |
| `multiline_parsers` | Lists multiline parsers, functioning similarly to `parsers`. Multiple definitions can exist either in the root or in included files. |
| `pipeline` | Defines a pipeline composed of inputs, processors, filters, and output plugins. You can define multiple `pipeline` sections, but they will not operate independently. Instead, all components will be merged into a single pipeline internally. |
| `plugins` | Specifies the path to external plugins (.so files) to be loaded by Fluent Bit at runtime. |
| `upstream_servers` | Refers to a group of node endpoints that can be referenced by output plugins that support this feature. |
| `env` | Sets a list of environment variables for Fluent Bit. Note that system environment variables are available, while the ones defined in the configuration apply only to Fluent Bit. |
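
As a rough illustration of how several of these root-level sections can sit side by side in one file, here is a minimal sketch. It only reuses plugins and keys that appear elsewhere in this documentation (`tail`, `stdout`, the `json` parser); the log path is a placeholder, not a required location.

```yaml
# Sketch: one file combining several root-level sections.
env:
  FLUSH_INTERVAL: 1              # referenced below as ${FLUSH_INTERVAL}

service:
  flush: ${FLUSH_INTERVAL}
  log_level: info

parsers:
  - name: json
    format: json

pipeline:
  inputs:
    - name: tail
      path: /var/log/example.log # placeholder path
      parser: json               # refers to the parser defined above

  outputs:
    - name: stdout
      match: '*'
```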

## Section Documentation

To access detailed configuration guides for each section, use the following links:

- [Service Section documentation](service-section.md)
  - Overview of global settings, configuration options, and examples.
- [Parsers Section documentation](parsers-section.md)
  - Detailed guide on defining parsers and supported formats.
- [Multiline Parsers Section documentation](multiline-parsers-section.md)
  - Explanation of multiline parsing configuration.
- [Pipeline Section documentation](pipeline-section.md)
  - Details on setting up pipelines and using processors.
- [Plugins Section documentation](plugins-section.md)
  - How to load external plugins.
- [Upstream Servers Section documentation](upstream-servers-section.md)
  - Guide on setting up and using upstream nodes with supported plugins.
- [Environment Variables Section documentation](environment-variables-section.md)
  - Information on setting environment variables and their scope within Fluent Bit.
- [Includes Section documentation](includes-section.md)
  - Description of how to include external YAML files.
@@ -0,0 +1,61 @@
# Environment Variables Section

The `env` section allows you to define environment variables directly within the configuration file. These variables can then be used to dynamically replace values throughout your configuration using the `${VARIABLE_NAME}` syntax.

Values set in the `env` section are case-sensitive. However, as a best practice, we recommend using uppercase names for environment variables. The example below defines two variables, `FLUSH_INTERVAL` and `STDOUT_FMT`, which can be accessed in the configuration using `${FLUSH_INTERVAL}` and `${STDOUT_FMT}`:

```yaml
env:
  FLUSH_INTERVAL: 1
  STDOUT_FMT: 'json_lines'

service:
  flush: ${FLUSH_INTERVAL}
  log_level: info

pipeline:
  inputs:
    - name: random

  outputs:
    - name: stdout
      match: '*'
      format: ${STDOUT_FMT}
```
```

## Predefined Variables

Fluent Bit provides a set of predefined environment variables that can be used in your configuration:

| Name | Description |
|--|--|
| `${HOSTNAME}` | The system’s hostname. |
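
A predefined variable is referenced the same way as one defined under `env`. The sketch below assumes that substitution also applies inside plugin property values, and uses the `modify` filter's `add` option (shown in the pipeline documentation) to attach the hostname to every record; the key name `hostname` is only an illustration.

```yaml
# Sketch: tagging each record with the predefined ${HOSTNAME} variable.
pipeline:
  inputs:
    - name: random

  filters:
    - name: modify
      match: '*'
      add: hostname ${HOSTNAME}  # adds a "hostname" key with the system hostname

  outputs:
    - name: stdout
      match: '*'
```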

## External Variables

In addition to variables defined in the configuration file or the predefined ones, Fluent Bit can access system environment variables set in the user space. These external variables can be referenced in the configuration using the same `${VARIABLE_NAME}` pattern.

For example, to set the `FLUSH_INTERVAL` system environment variable to `2` and use it in your configuration:

```bash
export FLUSH_INTERVAL=2
```

In the configuration file, you can then access this value as follows:

```yaml
service:
  flush: ${FLUSH_INTERVAL}
  log_level: info

pipeline:
  inputs:
    - name: random

  outputs:
    - name: stdout
      match: '*'
      format: json_lines
```

This approach allows you to easily manage and override configuration values using environment variables, providing flexibility in various deployment environments.
32 changes: 32 additions & 0 deletions administration/configuring-fluent-bit/yaml/includes-section.md
@@ -0,0 +1,32 @@
# Includes Section

The `includes` section allows you to specify additional YAML configuration files to be merged into the current configuration. These files are given as a list of filenames, which can be relative or absolute paths. A path that is not absolute is resolved relative to the directory of the file that references it.

This feature is useful for organizing complex configurations into smaller, manageable files and including them as needed.

## Usage

Below is an example demonstrating how to include additional YAML files using relative path references. The files are laid out as follows:

```
├── fluent-bit.yaml
├── inclusion-1.yaml
└── subdir
    └── inclusion-2.yaml
```

The content of `fluent-bit.yaml`:

```yaml
includes:
  - inclusion-1.yaml
  - subdir/inclusion-2.yaml
```
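
The included files themselves are ordinary YAML configuration files. Their contents are not specified here, so the sketch below is purely hypothetical: it assumes `inclusion-1.yaml` holds a parser definition and `subdir/inclusion-2.yaml` holds a small pipeline. The two files are shown in one block, separated by `---`, only for brevity; each would live in its own file.

```yaml
# inclusion-1.yaml (hypothetical content): a parser definition
parsers:
  - name: json
    format: json
---
# subdir/inclusion-2.yaml (hypothetical content): a small pipeline
pipeline:
  inputs:
    - name: random

  outputs:
    - name: stdout
      match: '*'
```

Splitting by concern like this keeps the main file short; the sections from the included files are merged into the main configuration.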

## Key Points

- Relative Paths: If a path is not specified as absolute, it will be treated as relative to the file that includes it.

- Organized Configurations: Using the includes section helps keep your configuration modular and easier to maintain.

> note: Ensure that the included files are formatted correctly and contain valid YAML configurations for seamless integration.
@@ -0,0 +1,26 @@
# Multiline Parsers

Multiline parsers are used to combine logs that span multiple events into a single, cohesive message. This is particularly useful for handling stack traces, error logs, or any log entry that contains multiple lines of information.

In YAML configuration, the syntax for defining multiline parsers differs slightly from the classic configuration format, introducing minor breaking changes, specifically in how the rules are defined.

The following example defines a multiline parser directly in the main configuration file. Additional definitions can also be placed in external files and loaded through the `includes` section:

```yaml
multiline_parsers:
  - name: multiline-regex-test
    type: regex
    flush_timeout: 1000
    rules:
      - state: start_state
        regex: '/([a-zA-Z]+ \d+ \d+:\d+:\d+)(.*)/'
        next_state: cont
      - state: cont
        regex: '/^\s+at.*/'
        next_state: cont
```

The example above defines a multiline parser named `multiline-regex-test` that uses regular expressions to handle multi-event logs. The parser contains two rules: the first rule transitions from `start_state` to `cont` when a matching log entry is detected, and the second rule continues to match subsequent lines.

For more detailed information on configuring multiline parsers, including advanced options and use cases, please refer to the Configuring Multiline Parsers section.
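
To show how such a parser might be put to use, the sketch below attaches it to a `tail` input through that plugin's `multiline.parser` option. The log path and the sample lines in the comments are assumptions made for illustration.

```yaml
# Sketch: applying the multiline parser to a tail input.
#
# Sample input the rules are meant to handle (illustrative only):
#   Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException
#       at com.example.Main.run(Main.java:10)
#       at com.example.Main.main(Main.java:4)
# The first line matches the start_state rule; the indented "at ..." lines
# match the cont rule and are appended to the same record.
multiline_parsers:
  - name: multiline-regex-test
    type: regex
    flush_timeout: 1000
    rules:
      - state: start_state
        regex: '/([a-zA-Z]+ \d+ \d+:\d+:\d+)(.*)/'
        next_state: cont
      - state: cont
        regex: '/^\s+at.*/'
        next_state: cont

pipeline:
  inputs:
    - name: tail
      path: /var/log/example.log          # placeholder path
      multiline.parser: multiline-regex-test

  outputs:
    - name: stdout
      match: '*'
```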

23 changes: 23 additions & 0 deletions administration/configuring-fluent-bit/yaml/parsers-section.md
@@ -0,0 +1,23 @@
# Parsers Section

Parsers enable Fluent Bit components to transform unstructured data into a structured internal representation. You can define parsers either directly in the main configuration file or in separate external files for better organization.

This page provides a general overview of how to declare parsers.

The main section name is `parsers`, and it allows you to define a list of parser configurations. The following example demonstrates how to set up two simple parsers:

```yaml
parsers:
  - name: json
    format: json

  - name: docker
    format: json
    time_key: time
    time_format: "%Y-%m-%dT%H:%M:%S.%L"
    time_keep: true
```

You can define multiple parsers sections, either within the main configuration file or distributed across included files.

For more detailed information on parser options and advanced configurations, please refer to the [Configuring Parsers]() section.
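
Parsers are not limited to JSON. As a hedged sketch, the configuration below declares a parser with `format: regex` and references it by name from a `tail` input. The parser name `simple_level`, the expression, and the log path are all made up for this example; the named capture groups become keys in the structured record.

```yaml
# Sketch: a regex parser and an input that refers to it by name.
parsers:
  - name: simple_level                         # hypothetical parser name
    format: regex
    # Splits a line such as "ERROR disk is full" into level and message.
    regex: '^(?<level>[A-Z]+) (?<message>.*)$'

pipeline:
  inputs:
    - name: tail
      path: /var/log/example.log               # placeholder path
      parser: simple_level

  outputs:
    - name: stdout
      match: '*'
```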
149 changes: 149 additions & 0 deletions administration/configuring-fluent-bit/yaml/pipeline-section.md
@@ -0,0 +1,149 @@
# Pipeline Section

The `pipeline` section defines the flow of how data is collected, processed, and sent to its final destination. It encompasses the following core concepts:

| Name | Description |
|---|---|
| `inputs` | Specifies the name of the plugin responsible for collecting or receiving data. This component serves as the data source in the pipeline. Examples of input plugins include `tail`, `http`, and `random`. |
| `processors` | **Unique to YAML configuration**, processors are specialized plugins that handle data processing directly attached to input plugins. Unlike filters, processors are not dependent on tag or matching rules. Instead, they work closely with the input to modify or enrich the data before it reaches the filtering or output stages. Processors are defined within an input plugin section. |
| `filters` | Filters are used to transform, enrich, or discard events based on specific criteria. They allow matching tags using strings or regular expressions, providing a more flexible way to manipulate data. Filters run as part of the main event loop and can be applied across multiple inputs and filters. Examples of filters include `modify`, `grep`, and `nest`. |
| `outputs` | Defines the destination for processed data. Outputs specify where the data will be sent, such as to a remote server, a file, or another service. Each output plugin is configured with matching rules to determine which events are sent to that destination. Common output plugins include `stdout`, `elasticsearch`, and `kafka`. |

## Example Configuration

Here’s a simple example of a pipeline configuration:

```yaml
pipeline:
  inputs:
    - name: tail
      path: /var/log/example.log
      parser: json

      processors:
        logs:
          - name: record_modifier
  filters:
    - name: grep
      match: '*'
      regex: key pattern

  outputs:
    - name: stdout
      match: '*'
```

## Pipeline Processors

Processors operate on specific signals such as logs, metrics, and traces. They are attached to an input plugin and must specify the signal type they will process.

### Example of a Processor

In the example below, the `content_modifier` processor inserts or updates (upserts) the key `my_new_key` with the value `123` for all log records generated by the `tail` plugin. This processor is only applied to log signals:

```yaml
parsers:
  - name: json
    format: json

pipeline:
  inputs:
    - name: tail
      path: /var/log/example.log
      parser: json

      processors:
        logs:
          - name: content_modifier
            action: upsert
            key: my_new_key
            value: 123
  filters:
    - name: grep
      match: '*'
      regex: key pattern

  outputs:
    - name: stdout
      match: '*'
```

Here is a more complete example with multiple processors:

```yaml
service:
  log_level: info
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2021

pipeline:
  inputs:
    - name: random
      tag: test-tag
      interval_sec: 1
      processors:
        logs:
          - name: modify
            add: hostname monox
          - name: lua
            call: append_tag
            code: |
              function append_tag(tag, timestamp, record)
                new_record = record
                new_record["tag"] = tag
                return 1, timestamp, new_record
              end

  outputs:
    - name: stdout
      match: '*'
      processors:
        logs:
          - name: lua
            call: add_field
            code: |
              function add_field(tag, timestamp, record)
                new_record = record
                new_record["output"] = "new data"
                return 1, timestamp, new_record
              end
```

You might have noticed that processors can be attached not only to an input but also to an output.

### How Are Processors Different from Filters?

While processors and filters are similar in that they can transform, enrich, or drop data from the pipeline, there is a significant difference in how they operate:

- Processors: Run in the same thread as the input plugin when the input plugin is configured to be threaded (`threaded: true`). This design provides better performance, especially in multi-threaded setups.

- Filters: Run in the main event loop. When multiple filters are used, they can introduce performance overhead, particularly under heavy workloads.
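
The difference can be sketched with a threaded input. In the configuration below, which reuses the `content_modifier` example from earlier on this page, the input sets `threaded: true`, so the attached processor would run in the input's own thread rather than in the main event loop. The log path is a placeholder.

```yaml
# Sketch: a processor attached to a threaded input.
pipeline:
  inputs:
    - name: tail
      path: /var/log/example.log   # placeholder path
      threaded: true               # the input runs in its own thread

      processors:
        logs:
          # Runs in the same thread as the tail input above.
          - name: content_modifier
            action: upsert
            key: my_new_key
            value: 123

  outputs:
    - name: stdout
      match: '*'
```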

## Running Filters as Processors

You can configure existing [Filters](https://docs.fluentbit.io/manual/pipeline/filters) to run as processors. There are no specific changes needed; you simply use the filter name as if it were a native processor.

### Example of a Filter Running as a Processor

In the example below, the `grep` filter is used as a processor to filter log events based on a pattern:

```yaml
parsers:
  - name: json
    format: json

pipeline:
  inputs:
    - name: tail
      path: /var/log/example.log
      parser: json

      processors:
        logs:
          - name: grep
            regex: log aa
  outputs:
    - name: stdout
      match: '*'
```