diff --git a/.gitbook/assets/logo_documentation_0.10.png b/.gitbook/assets/logo_documentation_0.10.png new file mode 100644 index 000000000..7a072e651 Binary files /dev/null and b/.gitbook/assets/logo_documentation_0.10.png differ diff --git a/.gitbook/assets/logo_documentation_0.11.png b/.gitbook/assets/logo_documentation_0.11.png new file mode 100644 index 000000000..c252e9cb0 Binary files /dev/null and b/.gitbook/assets/logo_documentation_0.11.png differ diff --git a/.gitbook/assets/logo_documentation_0.12.png b/.gitbook/assets/logo_documentation_0.12.png new file mode 100644 index 000000000..d135b3507 Binary files /dev/null and b/.gitbook/assets/logo_documentation_0.12.png differ diff --git a/.gitbook/assets/logo_documentation_0.13.png b/.gitbook/assets/logo_documentation_0.13.png new file mode 100644 index 000000000..722360683 Binary files /dev/null and b/.gitbook/assets/logo_documentation_0.13.png differ diff --git a/.gitbook/assets/logo_documentation_0.14.png b/.gitbook/assets/logo_documentation_0.14.png new file mode 100644 index 000000000..66663951d Binary files /dev/null and b/.gitbook/assets/logo_documentation_0.14.png differ diff --git a/.gitbook/assets/logo_documentation_1.0.png b/.gitbook/assets/logo_documentation_1.0.png new file mode 100644 index 000000000..f5477bce3 Binary files /dev/null and b/.gitbook/assets/logo_documentation_1.0.png differ diff --git a/.gitbook/assets/logo_documentation_1.2 (1).png b/.gitbook/assets/logo_documentation_1.2 (1).png new file mode 100644 index 000000000..52ebf0a87 Binary files /dev/null and b/.gitbook/assets/logo_documentation_1.2 (1).png differ diff --git a/.gitbook/assets/logo_documentation_1.2.png b/.gitbook/assets/logo_documentation_1.2.png new file mode 100644 index 000000000..52ebf0a87 Binary files /dev/null and b/.gitbook/assets/logo_documentation_1.2.png differ diff --git a/.gitbook/assets/windows_installer (1).png b/.gitbook/assets/windows_installer (1).png new file mode 100644 index 000000000..37ac237d1 Binary files /dev/null and b/.gitbook/assets/windows_installer (1).png differ diff --git a/.gitbook/assets/windows_installer.png b/.gitbook/assets/windows_installer.png new file mode 100644 index 000000000..37ac237d1 Binary files /dev/null and b/.gitbook/assets/windows_installer.png differ diff --git a/README.md b/README.md index fd1dc1d4e..026e2d6c4 100644 --- a/README.md +++ b/README.md @@ -8,7 +8,7 @@ description: Next generation Log and Stream processor [Fluent Bit](http://fluentbit.io) is a Fast and Lightweight Log Processor, Stream Processor and Forwarder for Linux, OSX, Windows and BSD family operating systems. It has been made with a strong focus on performance to allow the collection of events from different sources without complexity. 
-### Features +## Features * High Performance * Data Parsing @@ -25,17 +25,15 @@ description: Next generation Log and Stream processor * Write any input, filter or output plugin in C language * Bonus: write [Filters in Lua](pipeline/filters/lua.md) or [Output plugins in Golang](development/golang-output-plugins.md) * [Monitoring](administration/monitoring.md): expose internal metrics over HTTP in JSON and [Prometheus](https://prometheus.io/) format -* [Stream Processing](stream-processing/untitled.md): Perform data selection and transformation using simple SQL queries +* [Stream Processing](): Perform data selection and transformation using simple SQL queries * Create new streams of data using query results * Aggregation Windows * Data analysis and prediction: Timeseries forecasting * Portable: runs on Linux, MacOS, Windows and BSD systems -### Fluent Bit, Fluentd and CNCF +## Fluent Bit, Fluentd and CNCF [Fluent Bit](http://fluentbit.io) is a sub-component of the [Fluentd](http://fluentd.org) project ecosystem, it's licensed under the terms of the [Apache License v2.0](http://www.apache.org/licenses/LICENSE-2.0). This project was created by [Treasure Data](https://www.treasuredata.com) and is its current primary sponsor. Nowadays Fluent Bit get contributions from several companies and individuals and same as [Fluentd](https://www.fluentd.org), it's hosted as a [CNCF](https://cncf.io) subproject. - - diff --git a/SUMMARY.md b/SUMMARY.md index d2b18bcde..9480f73e5 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -44,7 +44,7 @@ ## Administration * [Configuring Fluent Bit](administration/configuring-fluent-bit/README.md) - * [Files Schema / Structure](administration/configuring-fluent-bit/files-schema-structure/README.md) + * [Files Schema / Structure](administration/configuring-fluent-bit/files-schema-structure.md) * [Variables](administration/configuring-fluent-bit/variables.md) * [Commands](administration/configuring-fluent-bit/commands.md) * [Configuration File](administration/configuring-fluent-bit/configuration-file.md) @@ -65,7 +65,7 @@ * [Disk I/O Metrics](pipeline/inputs/disk-io-metrics.md) * [Dummy](pipeline/inputs/dummy.md) * [Exec](pipeline/inputs/exec.md) - * [Forward](pipeline/inputs/forward.md) + * [Forward](pipeline/outputs/forward.md) * [Head](pipeline/inputs/head.md) * [Health](pipeline/inputs/health.md) * [Kernel Logs](pipeline/inputs/kernel-logs.md) @@ -96,7 +96,7 @@ * [Parser](pipeline/filters/parser.md) * [Record Modifier](pipeline/filters/record-modifier.md) * [Rewrite Tag](pipeline/filters/rewrite-tag.md) - * [Standard Output](pipeline/filters/standard-output.md) + * [Standard Output](pipeline/outputs/standard-output.md) * [Throttle](pipeline/filters/throttle.md) * [Nest](pipeline/filters/nest.md) * [Modify](pipeline/filters/modify.md) @@ -124,7 +124,7 @@ ## Stream Processing -* [Introduction](stream-processing/README.md) +* [Introduction](stream-processing/stream-processing.md) * [Overview](stream-processing/overview.md) * [Changelog](stream-processing/changelog.md) * [Getting Started](stream-processing/getting-started/README.md) @@ -137,3 +137,4 @@ * [C Library API](development/library_api.md) * [Ingest Records Manually](development/ingest-records-manually.md) * [Golang Output Plugins](development/golang-output-plugins.md) + diff --git a/about/history.md b/about/history.md index 25474c28b..0b8630a26 100644 --- a/about/history.md +++ b/about/history.md @@ -4,7 +4,7 @@ description: Every project has a story # A Brief History of Fluent Bit -On 2014, the 
[Fluentd](https://fluentd.org) team at [Treasure Data](https://www.treasuredata.com) forecasted the need of a lightweight log processor for constraint environments like Embedded Linux and Gateways, the project aimed to be part of the Fluentd Ecosystem and we called it [Fluent Bit](https://fluentbit.io), fully open source and available under the terms of the [Apache License v2.0](http://www.apache.org/licenses/LICENSE-2.0). +On 2014, the [Fluentd](https://fluentd.org) team at [Treasure Data](https://www.treasuredata.com) forecasted the need of a lightweight log processor for constraint environments like Embedded Linux and Gateways, the project aimed to be part of the Fluentd Ecosystem and we called it [Fluent Bit](https://fluentbit.io), fully open source and available under the terms of the [Apache License v2.0](http://www.apache.org/licenses/LICENSE-2.0). After the project was around for some time, it got some traction in the Embedded market but we also started getting requests for several features from the Cloud community like more inputs, filters, and outputs. Not so long after that, Fluent Bit becomes one of the preferred solutions to solve the logging challenges in Cloud environments. diff --git a/about/what-is-fluent-bit.md b/about/what-is-fluent-bit.md index 0bfcbc7a9..0671b260b 100644 --- a/about/what-is-fluent-bit.md +++ b/about/what-is-fluent-bit.md @@ -1,6 +1,6 @@ # What is Fluent Bit ? -​[Fluent Bit](http://fluentbit.io/) is an open source and multi-platform log processor tool which aims to be a generic Swiss knife for log processing and distribution. +​[Fluent Bit](http://fluentbit.io/) is an open source and multi-platform log processor tool which aims to be a generic Swiss knife for log processing and distribution. Nowadays the number of sources of information in our environments is ever increasing. Handling data collection at scale is complex, and collecting and aggregating diverse data requires a specialized tool that can deal with: @@ -10,7 +10,5 @@ Nowadays the number of sources of information in our environments is ever increa * Security * Multiple destinations -[Fluent Bit](https://fluentbit.io) has been designed with performance and low resources consumption in mind. - - +[Fluent Bit](https://fluentbit.io) has been designed with performance and low resources consumption in mind. diff --git a/administration/backpressure.md b/administration/backpressure.md index 98bc7df40..0a3cf1d08 100644 --- a/administration/backpressure.md +++ b/administration/backpressure.md @@ -34,5 +34,5 @@ After some seconds if the scheduler was able to flush the initial 700KB of data Each plugin is independent and not all of them implements the **pause** and **resume** callbacks. As said, these callbacks are just a notification mechanism for the plugin. -The plugin who implements and keep a good state is the [Tail Input](../input/tail.md) plugin. When the **pause** callback is triggered, it stop their collectors and stop appending data. Upon **resume**, it re-enable the collectors. +The plugin who implements and keep a good state is the [Tail Input](https://github.com/fluent/fluent-bit-docs/tree/b78cfe98123e74e165f2b6669229da009258f34e/input/tail.md) plugin. When the **pause** callback is triggered, it stop their collectors and stop appending data. Upon **resume**, it re-enable the collectors. 
diff --git a/administration/buffering-and-storage.md b/administration/buffering-and-storage.md index d4caed71b..a22792f68 100644 --- a/administration/buffering-and-storage.md +++ b/administration/buffering-and-storage.md @@ -1,8 +1,8 @@ -# Fluent Bit and Buffering +# Buffering & Storage The end-goal of [Fluent Bit](https://fluentbit.io) is to collect, parse, filter and ship logs to a central place. In this workflow there are many phases and one of the critical pieces is the ability to do _buffering_ : a mechanism to place processed data into a temporal location until is ready to be shipped. -By default when Fluent Bit process data, it uses Memory as a primary and temporal place to store the record logs, but there are certain scenarios where would be ideal to have a persistent buffering mechanism based in the filesystem to provide aggregation and data safety capabilities. +By default when Fluent Bit process data, it uses Memory as a primary and temporal place to store the record logs, but there are certain scenarios where would be ideal to have a persistent buffering mechanism based in the filesystem to provide aggregation and data safety capabilities. Starting with Fluent Bit v1.0, we introduced a new _storage layer_ that can either work in memory or in the file system. Input plugins can be configured to use one or the other upon demand at start time. @@ -10,23 +10,23 @@ Starting with Fluent Bit v1.0, we introduced a new _storage layer_ that can eith The storage layer configuration takes place in two areas: -- Service Section -- Input Section +* Service Section +* Input Section The known Service section configure a global environment for the storage layer, and then in the Input sections defines which mechanism to use. ### Service Section Configuration -| Key | Description | Default | -| ------------------------- | ------------------------------------------------------------ | ------- | -| storage.path | Set an optional location in the file system to store streams and chunks of data. If this parameter is not set, Input plugins can only use in-memory buffering. | | -| storage.sync | Configure the synchronization mode used to store the data into the file system. It can take the values _normal_ or _full_. | normal | -| storage.checksum | Enable the data integrity check when writing and reading data from the filesystem. The storage layer uses the CRC32 algorithm. | Off | -| storage.backlog.mem_limit | If _storage.path_ is set, Fluent Bit will look for data chunks that were not delivered and are still in the storage layer, these are called _backlog_ data. This option configure a hint of maximum value of memory to use when processing these records. | 5M | +| Key | Description | Default | +| :--- | :--- | :--- | +| storage.path | Set an optional location in the file system to store streams and chunks of data. If this parameter is not set, Input plugins can only use in-memory buffering. | | +| storage.sync | Configure the synchronization mode used to store the data into the file system. It can take the values _normal_ or _full_. | normal | +| storage.checksum | Enable the data integrity check when writing and reading data from the filesystem. The storage layer uses the CRC32 algorithm. | Off | +| storage.backlog.mem\_limit | If _storage.path_ is set, Fluent Bit will look for data chunks that were not delivered and are still in the storage layer, these are called _backlog_ data. This option configure a hint of maximum value of memory to use when processing these records. 
| 5M | a Service section will look like this: -``` +```text [SERVICE] flush 1 log_Level info @@ -36,19 +36,19 @@ a Service section will look like this: storage.backlog.mem_limit 5M ``` -that configuration configure an optional buffering mechanism where it root for data is _/var/log/flb-storage/_, it will use _normal_ synchronization mode, without checksum and up to a maximum of 5MB of memory when processing backlog data. +that configuration configure an optional buffering mechanism where it root for data is _/var/log/flb-storage/_, it will use _normal_ synchronization mode, without checksum and up to a maximum of 5MB of memory when processing backlog data. ### Input Section Configuration Optionally, any Input plugin can configure their storage preference, the following table describe the options available: -| Key | Description | Default | -| ------------ | ------------------------------------------------------------ | ------- | -| storage.type | Specify the buffering mechanism to use. It can be _memory_ or _filesystem_. | memory | +| Key | Description | Default | +| :--- | :--- | :--- | +| storage.type | Specify the buffering mechanism to use. It can be _memory_ or _filesystem_. | memory | -The following example configure a service that offers filesystem buffering capabilities and two Input plugins being the first based in memory and the second with the filesystem: +The following example configure a service that offers filesystem buffering capabilities and two Input plugins being the first based in memory and the second with the filesystem: -``` +```text [SERVICE] flush 1 log_Level info @@ -56,11 +56,11 @@ The following example configure a service that offers filesystem buffering capab storage.sync normal storage.checksum off storage.backlog.mem_limit 5M - + [INPUT] name cpu storage.type filesystem - + [INPUT] name mem storage.type memory diff --git a/administration/configuring-fluent-bit/commands.md b/administration/configuring-fluent-bit/commands.md index ebf550e46..66c8c6b17 100644 --- a/administration/configuring-fluent-bit/commands.md +++ b/administration/configuring-fluent-bit/commands.md @@ -1,4 +1,4 @@ -# Configuration Commands +# Commands Configuration files must be flexible enough for any deployment need, but they must keep a clean and readable format. @@ -9,7 +9,7 @@ Fluent Bit _Commands_ extends a configuration file with specific built-in featur | [@INCLUDE](commands.md#cmd_include) | @INCLUDE FILE | Include a configuration file | | [@SET](commands.md#cmd_set) | @SET KEY=VAL | Set a configuration variable | -## @INCLUDE Command {#cmd_include} +## @INCLUDE Command Configuring a logging pipeline might lead to an extensive configuration file. In order to maintain a human-readable configuration, it's suggested to split the configuration in multiple files. @@ -60,7 +60,7 @@ Note that despites the order of inclusion, Fluent Bit will **ALWAYS** respect th * Filters * Outputs -## @SET Command {#cmd_set} +## @SET Command Fluent Bit supports [configuration variables](variables.md), one way to expose this variables to Fluent Bit is through setting a Shell environment variable, the other is through the _@SET_ command. 
diff --git a/administration/configuring-fluent-bit/configuration-file.md b/administration/configuring-fluent-bit/configuration-file.md index 1f65a34e2..9ba1a3ba9 100644 --- a/administration/configuring-fluent-bit/configuration-file.md +++ b/administration/configuring-fluent-bit/configuration-file.md @@ -6,16 +6,16 @@ Fluent Bit allows to use one configuration file which works at a global scope an The configuration file supports four types of sections: -* [Service](file.md#config_section) -* [Input](file.md#config_input) -* [Filter](file.md#config_filter) -* [Output](file.md#config_output) +* [Service](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/configuring-fluent-bit/file.md#config_section) +* [Input](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/configuring-fluent-bit/file.md#config_input) +* [Filter](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/configuring-fluent-bit/file.md#config_filter) +* [Output](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/configuring-fluent-bit/file.md#config_output) In addition there is an additional feature to include external files: -* [Include File](file.md#config_include_file) +* [Include File](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/configuring-fluent-bit/file.md#config_include_file) -## Service {#config_section} +## Service The _Service_ section defines global properties of the service, the keys available as of this version are described in the following table: @@ -26,12 +26,12 @@ The _Service_ section defines global properties of the service, the keys availab | Log\_File | Absolute path for an optional log file. | | | Log\_Level | Set the logging verbosity level. Allowed values are: error, info, debug and trace. Values are accumulative, e.g: if 'debug' is set, it will include error, info and debug. Note that _trace_ mode is only available if Fluent Bit was built with the _WITH\_TRACE_ option enabled. | info | | Parsers\_File | Path for a _parsers_ configuration file. Multiple Parsers\_File entries can be used. | | -| Plugins\_File | Path for a _plugins_ configuration file. A _plugins_ configuration file allows to define paths for external plugins, for an example [see here](https://github.com/fluent/fluent-bit/blob/master/conf/plugins.conf). || -| Streams\_File | Path for the Stream Processor configuration file. For details about the format of SP configuration file [see here](stream_processor.md). || +| Plugins\_File | Path for a _plugins_ configuration file. A _plugins_ configuration file allows to define paths for external plugins, for an example [see here](https://github.com/fluent/fluent-bit/blob/master/conf/plugins.conf). | | +| Streams\_File | Path for the Stream Processor configuration file. For details about the format of SP configuration file [see here](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/configuring-fluent-bit/stream_processor.md). | | | HTTP\_Server | Enable built-in HTTP Server | Off | | HTTP\_Listen | Set listening interface for HTTP Server when it's enabled | 0.0.0.0 | | HTTP\_Port | Set TCP Port for the HTTP Server | 2020 | -| Coro_Stack_Size | Set the coroutines stack size in bytes. The value must be greater than the page size of the running system. 
Don't set too small value (say 4096), or coroutine threads can overrun the stack buffer. | 24576 | +| Coro\_Stack\_Size | Set the coroutines stack size in bytes. The value must be greater than the page size of the running system. Don't set too small value \(say 4096\), or coroutine threads can overrun the stack buffer. | 24576 | ### Example @@ -44,7 +44,7 @@ The following is an example of a _SERVICE_ section: Log_Level debug ``` -## Input {#config_input} +## Input An _INPUT_ section defines a source \(related to an input plugin\), here we will describe the base configuration for each _INPUT_ section. Note that each input plugin may add it own configuration keys: @@ -65,7 +65,7 @@ The following is an example of an _INPUT_ section: Tag my_cpu ``` -## Filter {#config_filter} +## Filter A _FILTER_ section defines a filter \(related to an filter plugin\), here we will describe the base configuration for each _FILTER_ section. Note that each filter plugin may add it own configuration keys: @@ -73,9 +73,9 @@ A _FILTER_ section defines a filter \(related to an filter plugin\), here we wil | :--- | :--- | :--- | | Name | Name of the filter plugin. | | | Match | A pattern to match against the tags of incoming records. It's case sensitive and support the star \(\*\) character as a wildcard. | | -| Match_Regex | A regular expression to match against the tags of incoming records. Use this option if you want to use the full regex syntax. | | +| Match\_Regex | A regular expression to match against the tags of incoming records. Use this option if you want to use the full regex syntax. | | -The _Name_ is mandatory and it let Fluent Bit know which filter plugin should be loaded. The _Match_ or _Match_Regex_ is mandatory for all plugins. If both are specified, _Match_Regex_ takes precedence. +The _Name_ is mandatory and it let Fluent Bit know which filter plugin should be loaded. The _Match_ or _Match\_Regex_ is mandatory for all plugins. If both are specified, _Match\_Regex_ takes precedence. ### Example @@ -87,15 +87,15 @@ The following is an example of an _FILTER_ section: Match * ``` -## Output {#config_output} +## Output The _OUTPUT_ section specify a destination that certain records should follow after a Tag match. The configuration support the following keys: -| Key | Description | -| :--- | :--- | -| Name | Name of the output plugin. | +| Key | Description | | +| :--- | :--- | :--- | +| Name | Name of the output plugin. | | | Match | A pattern to match against the tags of incoming records. It's case sensitive and support the star \(\*\) character as a wildcard. | | -| Match_Regex | A regular expression to match against the tags of incoming records. Use this option if you want to use the full regex syntax. | | +| Match\_Regex | A regular expression to match against the tags of incoming records. Use this option if you want to use the full regex syntax. | | ### Example @@ -126,7 +126,7 @@ The following configuration file example demonstrates how to collect CPU metrics Match my*cpu ``` -## Include File {#config_include_file} +## Include File To avoid complicated long configuration files is better to split specific parts in different files and call them \(include\) from one main file. 
@@ -149,3 +149,4 @@ Wildcard character \(\*\) is supported to include multiple files, e.g: ```text @INCLUDE input_*.conf ``` + diff --git a/administration/configuring-fluent-bit/files-schema-structure.md b/administration/configuring-fluent-bit/files-schema-structure.md index 1fe7f3d96..7cfd367cf 100644 --- a/administration/configuring-fluent-bit/files-schema-structure.md +++ b/administration/configuring-fluent-bit/files-schema-structure.md @@ -1,55 +1,2 @@ -# Configuration Schema - -Fluent Bit may optionally use a configuration file to define how the service will behave, and before proceeding we need to understand how the configuration schema works. The schema is defined by three concepts: - -* [Sections](schema.md#sections) -* [Entries: Key/Value](schema.md#entries_kv) -* [Indented Configuration Mode](schema.md#indented_mode) - -A simple example of a configuration file is as follows: - -```python -[SERVICE] - # This is a commented line - Daemon off - log_level debug -``` - -## Sections {#sections} - -A section is defined by a name or title inside brackets. Looking at the example above, a Service section has been set using **\[SERVICE\]** definition. Section rules: - -* All section content must be indented \(4 spaces ideally\). -* Multiple sections can exist on the same file. -* A section is expected to have comments and entries, it cannot be empty. -* Any commented line under a section, must be indented too. - -## Entries: Key/Value {#entries_kv} - -A section may contain **Entries**, an entry is defined by a line of text that contains a **Key** and a **Value**, using the above example, the **\[SERVICE\]** section contains two entries, one is the key **Daemon** with value **off** and the other is the key **Log\_Level** with the value **debug**. -Entries rules: - -* An entry is defined by a key and a value. -* A key must be indented. -* A key must contain a value which ends in the breakline. -* Multiple keys with the same name can exist. - -Also commented lines are set prefixing the **\#** character, those lines are not processed but they must be indented too. - -## Indented Configuration Mode {#indented_mode} - -Fluent Bit configuration files are based in a strict **Indented Mode**, that means that each configuration file must follow the same pattern of alignment from left to right when writing text. By default an indentation level of four spaces from left to right is suggested. Example: - -```python -[FIRST_SECTION] - # This is a commented line - Key1 some value - Key2 another value - # more comments - -[SECOND_SECTION] - KeyN 3.14 -``` - -As you can see there are two sections with multiple entries and comments, note also that empty lines are allowed and they do not need to be indented. +# Files Schema / Structure diff --git a/administration/configuring-fluent-bit/unit-sizes.md b/administration/configuring-fluent-bit/unit-sizes.md index f16a2c670..391c53a06 100644 --- a/administration/configuring-fluent-bit/unit-sizes.md +++ b/administration/configuring-fluent-bit/unit-sizes.md @@ -1,6 +1,6 @@ # Unit Sizes -Certain configuration directives in Fluent Bit refer to unit sizes such as when defining the size of a buffer or specific limits, we can find these in plugins like [Tail Input](../input/tail.md), [Forward Input](../input/forward.md) or in generic properties like [Mem\_Buf\_Limit](backpressure.md). 
+Certain configuration directives in Fluent Bit refer to unit sizes such as when defining the size of a buffer or specific limits, we can find these in plugins like [Tail Input](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/input/tail.md), [Forward Input](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/input/forward.md) or in generic properties like [Mem\_Buf\_Limit](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/configuring-fluent-bit/backpressure.md). Starting from [Fluent Bit](http://fluentbit.io) v0.11.10, all unit sizes have been standardized across the core and plugins, the following table describes the options that can be used and what they mean: diff --git a/administration/configuring-fluent-bit/upstream-servers.md b/administration/configuring-fluent-bit/upstream-servers.md index 520619e92..ad405d0c4 100644 --- a/administration/configuring-fluent-bit/upstream-servers.md +++ b/administration/configuring-fluent-bit/upstream-servers.md @@ -1,41 +1,41 @@ # Upstream Servers -It's common that Fluent Bit [output plugins](../output/) aims to connect to external services to deliver the logs over the network, this is the case of [HTTP](../output/http.md), [Elasticsearch](../output/elasticsearch.md) and [Forward](../output/forward.md) within others. Being able to connect to one node (host) is normal and enough for more of the use cases, but there are other scenarios where balancing across different nodes is required. The _Upstream_ feature provides such capability. +It's common that Fluent Bit [output plugins](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/output/README.md) aims to connect to external services to deliver the logs over the network, this is the case of [HTTP](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/output/http.md), [Elasticsearch](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/output/elasticsearch.md) and [Forward](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/output/forward.md) within others. Being able to connect to one node \(host\) is normal and enough for more of the use cases, but there are other scenarios where balancing across different nodes is required. The _Upstream_ feature provides such capability. -An _Upstream_ defines a set of nodes that will be targeted by an output plugin, by the nature of the implementation an output plugin __must__ support the _Upstream_ feature. The following plugin(s) have _Upstream_ support: +An _Upstream_ defines a set of nodes that will be targeted by an output plugin, by the nature of the implementation an output plugin **must** support the _Upstream_ feature. The following plugin\(s\) have _Upstream_ support: -- [Forward](../output/forward.md) +* [Forward](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/output/forward.md) -The current balancing mode implemented is _round-robin_. +The current balancing mode implemented is _round-robin_. ## Configuration To define an _Upstream_ it's required to create an specific configuration file that contains an UPSTREAM and one or multiple NODE sections. The following table describe the properties associated to each section. 
Note that all of them are mandatory: -| Section | Key | Description | -| -------- | ---- | ---------------------------------------------- | +| Section | Key | Description | +| :--- | :--- | :--- | | UPSTREAM | name | Defines a name for the _Upstream_ in question. | -| NODE | name | Defines a name for the _Node_ in question. | -| | host | IP address or hostname of the target host. | -| | port | TCP port of the target service. | +| NODE | name | Defines a name for the _Node_ in question. | +| | host | IP address or hostname of the target host. | +| | port | TCP port of the target service. | ### Nodes and specific plugin configuration -A _Node_ might contain additional configuration keys required by the plugin, on that way we provide enough flexibility for the output plugin, a common use case is Forward output where if TLS is enabled, it requires a shared key (more details in the example below). +A _Node_ might contain additional configuration keys required by the plugin, on that way we provide enough flexibility for the output plugin, a common use case is Forward output where if TLS is enabled, it requires a shared key \(more details in the example below\). -### Nodes and TLS (Transport Layer Security) +### Nodes and TLS \(Transport Layer Security\) In addition to the properties defined in the table above, the network operations against a defined node can optionally be done through the use of TLS for further encryption and certificates use. -The TLS options available are described in the [TLS/SSL](tls_ssl.md) section and can be added to the any _Node_ section. +The TLS options available are described in the [TLS/SSL](https://github.com/fluent/fluent-bit-docs/tree/5f926fd1330690179b8c1edab90d672699599ec7/administration/configuring-fluent-bit/tls_ssl.md) section and can be added to the any _Node_ section. ### Configuration File Example The following example defines an _Upstream_ called forward-balancing which aims to be used by Forward output plugin, it register three _Nodes_: -- node-1: connects to 127.0.0.1:43000 -- node-2: connects to 127.0.0.1:44000 -- node-3: connects to 127.0.0.1:45000 using TLS without verification. It also defines a specific configuration option required by Forward output called _shared_key_. +* node-1: connects to 127.0.0.1:43000 +* node-2: connects to 127.0.0.1:44000 +* node-3: connects to 127.0.0.1:45000 using TLS without verification. It also defines a specific configuration option required by Forward output called _shared\_key_. ```python [UPSTREAM] @@ -58,7 +58,7 @@ The following example defines an _Upstream_ called forward-balancing which aims tls on tls.verify off shared_key secret - ``` -Note that every _Upstream_ definition __must__ exists on it own configuration file in the file system. Adding multiple _Upstreams_ in the same file or different files is not allowed. \ No newline at end of file +Note that every _Upstream_ definition **must** exists on it own configuration file in the file system. Adding multiple _Upstreams_ in the same file or different files is not allowed. + diff --git a/administration/configuring-fluent-bit/variables.md b/administration/configuring-fluent-bit/variables.md index 85155e02c..b69a9418e 100644 --- a/administration/configuring-fluent-bit/variables.md +++ b/administration/configuring-fluent-bit/variables.md @@ -1,4 +1,4 @@ -# Configuration Variables +# Variables Fluent Bit supports the usage of environment variables in any value associated to a key when using a configuration file. 
@@ -12,7 +12,7 @@ When Fluent Bit starts, the configuration reader will detect any request for `${ ## Example -Create the following configuration file (`fluent-bit.conf`): +Create the following configuration file \(`fluent-bit.conf`\): ```text [SERVICE] diff --git a/administration/memory-management.md b/administration/memory-management.md index 813e4eeb0..6a5f150aa 100644 --- a/administration/memory-management.md +++ b/administration/memory-management.md @@ -1,4 +1,4 @@ -# Memory Usage +# Memory Management In certain scenarios would be ideal to estimate how much memory Fluent Bit could be using, this is very useful for containerized environments where memory limits are a must. @@ -8,7 +8,7 @@ In order to estimate we will assume that the input plugins have set the **Mem\_B Input plugins append data independently, so in order to do an estimation a limit should be imposed through the **Mem\_Buf\_Limit** option. If the limit was set to _10MB_ we need to estimate that in the worse case, the output plugin likely could use _20MB_. -Fluent Bit has an internal binary representation for the data being processed, but when this data reach an output plugin, this one will likely create their own representation in a new memory buffer for processing. The best example are the [InfluxDB](../output/influxdb.md) and [Elasticsearch](../output/elasticsearch.md) output plugins, both needs to convert the binary representation to their respective-custom JSON formats before to talk to their backend servers. +Fluent Bit has an internal binary representation for the data being processed, but when this data reach an output plugin, this one will likely create their own representation in a new memory buffer for processing. The best example are the [InfluxDB](https://github.com/fluent/fluent-bit-docs/tree/b78cfe98123e74e165f2b6669229da009258f34e/output/influxdb.md) and [Elasticsearch](https://github.com/fluent/fluent-bit-docs/tree/b78cfe98123e74e165f2b6669229da009258f34e/output/elasticsearch.md) output plugins, both needs to convert the binary representation to their respective-custom JSON formats before to talk to their backend servers. So, if we impose a limit of _10MB_ for the input plugins and considering the worse case scenario of the output plugin consuming _20MB_ extra, as a minimum we need \(_30MB_ x 1.2\) = **36MB**. diff --git a/administration/monitoring.md b/administration/monitoring.md index 00be4462a..37640e269 100644 --- a/administration/monitoring.md +++ b/administration/monitoring.md @@ -2,7 +2,7 @@ Fluent Bit comes with a built-in HTTP Server that can be used to query internal information and monitor metrics of each running plugin. -## Getting Started {#getting_started} +## Getting Started To get started, the first step is to enable the HTTP Server from the configuration file: @@ -62,7 +62,7 @@ $ curl -s http://127.0.0.1:2020 | jq Note that we are sending the _curl_ command output to the _jq_ program which helps to make the JSON data easy to read from the terminal. Fluent Bit don't aim to do JSON pretty-printing. -## REST API Interface {#rest_api} +## REST API Interface Fluent Bit aims to expose useful interfaces for monitoring, as of Fluent Bit v0.14 the following end points are available: @@ -77,18 +77,17 @@ Fluent Bit aims to expose useful interfaces for monitoring, as of Fluent Bit v0. 
Query the service uptime with the following command: -``` +```text $ curl -s http://127.0.0.1:2020/api/v1/uptime | jq ``` it should print a similar output like this: -```json +```javascript { "uptime_sec": 8950000, "uptime_hr": "Fluent Bit has been running: 103 days, 14 hours, 6 minutes and 40 seconds" } - ``` ## Metrics Examples @@ -101,7 +100,7 @@ $ curl -s http://127.0.0.1:2020/api/v1/metrics | jq it should print a similar output like this: -```json +```javascript { "input": { "cpu.0": { @@ -131,7 +130,7 @@ $ curl -s http://127.0.0.1:2020/api/v1/metrics/prometheus this time the same metrics will be in Prometheus format instead of JSON: -``` +```text fluentbit_input_records_total{name="cpu.0"} 57 1509150350542 fluentbit_input_bytes_total{name="cpu.0"} 18069 1509150350542 fluentbit_output_proc_records_total{name="stdout.0"} 54 1509150350542 @@ -141,15 +140,13 @@ fluentbit_output_retries_total{name="stdout.0"} 0 1509150350542 fluentbit_output_retries_failed_total{name="stdout.0"} 0 1509150350542 ``` - - ### Configuring Aliases -By default configured plugins on runtime get an internal name in the format _plugin_name.ID_. For monitoring purposes this can be confusing if many plugins of the same type were configured. To make a distinction each configured input or output section can get an _alias_ that will be used as the parent name for the metric. +By default configured plugins on runtime get an internal name in the format _plugin\_name.ID_. For monitoring purposes this can be confusing if many plugins of the same type were configured. To make a distinction each configured input or output section can get an _alias_ that will be used as the parent name for the metric. -The following example set an alias to the INPUT section which is using the [CPU](../input/cpu.md) input plugin: +The following example set an alias to the INPUT section which is using the [CPU](https://github.com/fluent/fluent-bit-docs/tree/b78cfe98123e74e165f2b6669229da009258f34e/input/cpu.md) input plugin: -``` +```text [SERVICE] HTTP_Server On HTTP_Listen 0.0.0.0 @@ -158,7 +155,7 @@ The following example set an alias to the INPUT section which is using the [CPU] [INPUT] Name cpu Alias server1_cpu - + [OUTPUT] Name stdout Alias raw_output @@ -167,7 +164,7 @@ The following example set an alias to the INPUT section which is using the [CPU] Now when querying the metrics we get the aliases in place instead of the plugin name: -```json +```javascript { "input": { "server1_cpu": { @@ -187,5 +184,3 @@ Now when querying the metrics we get the aliases in place instead of the plugin } ``` - - diff --git a/administration/scheduling-and-retries.md b/administration/scheduling-and-retries.md index 01c0d2479..7603f601f 100644 --- a/administration/scheduling-and-retries.md +++ b/administration/scheduling-and-retries.md @@ -1,31 +1,29 @@ -# Scheduler +# Scheduling and Retries -[Fluent Bit](https://fluentbit.io) has an Engine that helps to coordinate the data ingestion from input plugins and call the _Scheduler_ to decide when is time to flush the data through one or multiple output plugins. The Scheduler flush new data every a fixed time of seconds and Schedule retries when asked. +[Fluent Bit](https://fluentbit.io) has an Engine that helps to coordinate the data ingestion from input plugins and call the _Scheduler_ to decide when is time to flush the data through one or multiple output plugins. The Scheduler flush new data every a fixed time of seconds and Schedule retries when asked. 
Once an output plugin gets call to flush some data, after processing that data it can notify the Engine three possible return statuses: -- OK -- Retry -- Error +* OK +* Retry +* Error -If the return status was __OK__, it means it was successfully able to process and flush the data, if it returned an __Error__ status, means that an unrecoverable error happened and the engine should not try to flush that data again. If a __Retry__ was requested, the _Engine_ will ask the _Scheduler_ to retry to flush that data, the Scheduler will decide how many seconds to wait before that happen. +If the return status was **OK**, it means it was successfully able to process and flush the data, if it returned an **Error** status, means that an unrecoverable error happened and the engine should not try to flush that data again. If a **Retry** was requested, the _Engine_ will ask the _Scheduler_ to retry to flush that data, the Scheduler will decide how many seconds to wait before that happen. ## Configuring Retries -The Scheduler provides a simple configuration option called __Retry_Limit__ which can be set independently on each output section. This option allows to disable retries or impose a limit to try N times and then discard the data after reaching that limit: - -| | Value | Description | -| ----------- | ----- | ------------------------------------------------------------ | -| Retry_Limit | N | Integer value to set the maximum number of retries allowed. N must be >= 1 (default: 2) | -| Retry_Limit | False | When Retry_Limit is set to False, means that there is not limit for the number of retries that the Scheduler can do. | - +The Scheduler provides a simple configuration option called **Retry\_Limit** which can be set independently on each output section. This option allows to disable retries or impose a limit to try N times and then discard the data after reaching that limit: +| | Value | Description | +| :--- | :--- | :--- | +| Retry\_Limit | N | Integer value to set the maximum number of retries allowed. N must be >= 1 \(default: 2\) | +| Retry\_Limit | False | When Retry\_Limit is set to False, means that there is not limit for the number of retries that the Scheduler can do. | ### Example The following example configure two outputs where the HTTP plugin have an unlimited number of retries and the Elasticsearch plugin have a limit of 5 times: -``` +```text [OUTPUT] Name http Host 192.168.5.6 diff --git a/administration/security.md b/administration/security.md index 65e53c9ab..e79e4b8b7 100644 --- a/administration/security.md +++ b/administration/security.md @@ -1,4 +1,4 @@ -# TLS / SSL +# Security Fluent Bit provides integrated support for _Transport Layer Security_ \(TLS\) and it predecessor _Secure Sockets Layer_ \(SSL\) respectively. In this section we will refer as TLS only for both implementations. @@ -10,19 +10,19 @@ Each output plugin that requires to perform Network I/O can optionally enable TL | tls.verify | force certificate validation | On | | tls.debug | Set TLS debug verbosity level. 
It accept the following values: 0 \(No debug\), 1 \(Error\), 2 \(State change\), 3 \(Informational\) and 4 Verbose | 1 | | tls.ca\_file | absolute path to CA certificate file | | -| tls.ca\_path | absolute path to scan for certificate files | | +| tls.ca\_path | absolute path to scan for certificate files | | | tls.crt\_file | absolute path to Certificate file | | | tls.key\_file | absolute path to private Key file | | | tls.key\_passwd | optional password for tls.key\_file file | | -| tls.vhost | hostname to be used for TLS SNI extension | | +| tls.vhost | hostname to be used for TLS SNI extension | | The listed properties can be enabled in the configuration file, specifically on each output plugin section or directly through the command line. The following **output** plugins can take advantage of the TLS feature: -* [Elasticsearch](../output/elasticsearch.md) -* [Forward](../output/forward.md) -* [GELF](../output/gelf.md) -* [HTTP](../output/http.md) -* [Splunk](../output/splunk.md) +* [Elasticsearch](https://github.com/fluent/fluent-bit-docs/tree/b78cfe98123e74e165f2b6669229da009258f34e/output/elasticsearch.md) +* [Forward](https://github.com/fluent/fluent-bit-docs/tree/b78cfe98123e74e165f2b6669229da009258f34e/output/forward.md) +* [GELF](https://github.com/fluent/fluent-bit-docs/tree/b78cfe98123e74e165f2b6669229da009258f34e/output/gelf.md) +* [HTTP](https://github.com/fluent/fluent-bit-docs/tree/b78cfe98123e74e165f2b6669229da009258f34e/output/http.md) +* [Splunk](https://github.com/fluent/fluent-bit-docs/tree/b78cfe98123e74e165f2b6669229da009258f34e/output/splunk.md) ## Example: enable TLS on HTTP output @@ -58,9 +58,9 @@ The same behavior can be accomplished using a configuration file: ### Connect to virtual servers using TLS -Fluent Bit supports [TLS server name indication](https://en.wikipedia.org/wiki/Server_Name_Indication). If you are serving multiple hostnames on a single IP address (a.k.a. virtual hosting), you can make use of `tls.vhost` to connect to a specific hostname. +Fluent Bit supports [TLS server name indication](https://en.wikipedia.org/wiki/Server_Name_Indication). If you are serving multiple hostnames on a single IP address \(a.k.a. virtual hosting\), you can make use of `tls.vhost` to connect to a specific hostname. -``` +```text [INPUT] Name cpu Tag cpu @@ -75,3 +75,4 @@ Fluent Bit supports [TLS server name indication](https://en.wikipedia.org/wiki/S tls.ca_file /etc/certs/fluent.crt tls.vhost fluent.example.com ``` + diff --git a/concepts/buffering.md b/concepts/buffering.md index adeeeb15b..5a31b79d7 100644 --- a/concepts/buffering.md +++ b/concepts/buffering.md @@ -1,9 +1,16 @@ +--- +description: Performance and data safety +--- + # Buffering - -The end-goal of [Fluent Bit](https://fluentbit.io/) is to collect, parse, filter and ship logs to a central place. In this workflow there are many phases and one of the critical pieces is the ability to do _buffering_ : a mechanism to place processed data into a temporal location until is ready to be shipped. +When [Fluent Bit](https://fluentbit.io) process data, it uses the system memory \(heap\) as a primary and temporal place to store the record logs before they get delivered, on this private memory area the records are processed. + +Buffering refers to the ability to store the records somewhere, and while they are processed and delivered, still be able to store more. 
Buffering in memory is the fastest mechanism, but there are certain scenarios where the mechanism requires special strategies to deal with [backpressure](../administration/backpressure.md), provide data safety, or reduce memory consumption by the service in constrained environments.
+
+As buffering strategies, Fluent Bit offers a primary buffering mechanism in **memory** and an optional secondary one using the **file system**. With this hybrid solution you can safely adjust to any use case and keep high performance while processing your data.
 
-By default when Fluent Bit process data, it uses Memory as a primary and temporal place to store the record logs, but there are certain scenarios where would be ideal to have a persistent buffering mechanism based in the filesystem to provide aggregation and data safety capabilities. 
+Both mechanisms are not exclusive: when the data is ready to be processed or delivered it will always be **in memory**, while other data in the queue might sit in the file system until it is ready to be processed and moved up to memory.
 
-Starting with Fluent Bit v1.0, we introduced a new _storage layer_ that can either work in memory or in the file system. Input plugins can be configured to use one or the other upon demand at start time. 
+To learn more about the buffering configuration in Fluent Bit, please jump to the [Buffering & Storage](../administration/buffering-and-storage.md) section.
 
diff --git a/concepts/data-pipeline/filter.md b/concepts/data-pipeline/filter.md
index 898fb935c..644dc5935 100644
--- a/concepts/data-pipeline/filter.md
+++ b/concepts/data-pipeline/filter.md
@@ -8,5 +8,5 @@ Filtering is implemented through plugins, so each filter available could be used
 
 Very similar to the input plugins, Filters run in an instance context, which has its own independent configuration. Configuration keys are often called **properties**.
 
-For more details about the Filters available and their usage, please refer to the [Filters](../filter/) section.
+For more details about the Filters available and their usage, please refer to the [Filters](https://github.com/fluent/fluent-bit-docs/tree/31ef18ea4f94004badcc169d0e12e60967d50ef9/concepts/filter/README.md) section.
 
diff --git a/concepts/data-pipeline/input.md b/concepts/data-pipeline/input.md
index eace881fd..15ddf0408 100644
--- a/concepts/data-pipeline/input.md
+++ b/concepts/data-pipeline/input.md
@@ -8,5 +8,5 @@ When an input plugin is loaded, an internal _instance_ is created. Every instanc
 
 Every input plugin has its own documentation section where it's specified how it can be used and what properties are available.
 
-For more details, please refer to the [Input Plugins](../input/) section.
+For more details, please refer to the [Input Plugins](https://github.com/fluent/fluent-bit-docs/tree/31ef18ea4f94004badcc169d0e12e60967d50ef9/concepts/input/README.md) section.
 
diff --git a/concepts/data-pipeline/output.md b/concepts/data-pipeline/output.md
index cf9298324..1a9febe42 100644
--- a/concepts/data-pipeline/output.md
+++ b/concepts/data-pipeline/output.md
@@ -8,5 +8,5 @@ When an output plugin is loaded, an internal _instance_ is created. Every instan
 
 Every output plugin has its own documentation section specifying how it can be used and what properties are available.
 
-For more details, please refer to the [Output Plugins](../output/) section.
+For more details, please refer to the [Output Plugins](https://github.com/fluent/fluent-bit-docs/tree/31ef18ea4f94004badcc169d0e12e60967d50ef9/concepts/output/README.md) section.
diff --git a/concepts/data-pipeline/parser.md b/concepts/data-pipeline/parser.md index ffd8bdf99..8bfb686c7 100644 --- a/concepts/data-pipeline/parser.md +++ b/concepts/data-pipeline/parser.md @@ -25,5 +25,5 @@ The above log line is a raw string without format, ideally we would like to give } ``` -Parsers are fully configurable and are independently and optionally handled by each input plugin, for more details please refer to the [Parsers](../parser/) section. +Parsers are fully configurable and are independently and optionally handled by each input plugin, for more details please refer to the [Parsers](https://github.com/fluent/fluent-bit-docs/tree/b6af8ea32759e6e8250d853b3d21e44b4a5427f8/concepts/parser/README.md) section. diff --git a/concepts/data-pipeline/router.md b/concepts/data-pipeline/router.md index e0d20a7a3..45fd7557e 100644 --- a/concepts/data-pipeline/router.md +++ b/concepts/data-pipeline/router.md @@ -1,4 +1,4 @@ -# Routing +# Router Routing is a core feature that allows to **route** your data through Filters and finally to one or multiple destinations. diff --git a/concepts/key-concepts.md b/concepts/key-concepts.md index cb2dcde49..d1fe79374 100644 --- a/concepts/key-concepts.md +++ b/concepts/key-concepts.md @@ -15,9 +15,9 @@ Before diving into [Fluent Bit](https://fluentbit.io) it’s good to get acquain * Match * Structured Message -### Event or Record +## Event or Record -Every incoming piece of data that belongs to a log or a metric that is retrieved by Fluent Bit is considered an Event or a Record. +Every incoming piece of data that belongs to a log or a metric that is retrieved by Fluent Bit is considered an Event or a Record. As an example consider the following content of a Syslog file: @@ -28,7 +28,7 @@ Jan 18 12:52:16 flb systemd[2222]: Started GNOME Terminal Server. Jan 18 12:52:16 flb gsd-media-keys[2640]: # watch_fast: "/org/gnome/terminal/legacy/" (establishing: 0, active: 0) ``` -It contains four lines and all of them represents **four** independent Events. +It contains four lines and all of them represents **four** independent Events. Internally, an Event always has two components \(in an array form\): @@ -36,9 +36,9 @@ Internally, an Event always has two components \(in an array form\): [TIMESTAMP, MESSAGE] ``` -### Filtering +## Filtering -In some cases is required to perform modifications on the Events content, the process to alter, enrich or drop Events is called Filtering. +In some cases is required to perform modifications on the Events content, the process to alter, enrich or drop Events is called Filtering. There are many use cases when Filtering is required like: @@ -46,7 +46,7 @@ There are many use cases when Filtering is required like: * Select a specific piece of the Event content. * Drop Events that matches certain pattern. -### Tag +## Tag Every Event that gets into Fluent Bit gets assigned a Tag. This tag is an internal string that is used in a later stage by the Router to decide which Filter or Output phase it must go through. @@ -56,7 +56,7 @@ Most of the tags are assigned manually in the configuration. If a tag is not spe The only input plugin that **don't** assign Tags is Forward input. This plugin speaks the Fluentd wire protocol called Forward where every Event already comes with a Tag associated. Fluent Bit will always use the incoming Tag set by the client. {% endhint %} -### Timestamp +## Timestamp The Timestamp represents the _time_ when an Event was created. Every Event contains a Timestamp associated. 
The Timestamp is a numeric fractional integer in the format: @@ -64,11 +64,11 @@ The Timestamp represents the _time_ when an Event was created. Every Event conta SECONDS.NANOSECONDS ``` -#### Seconds +### Seconds It is the number of seconds that have elapsed since the _Unix epoch._ -#### Nanoseconds +### Nanoseconds Fractional second or one thousand-millionth of a second. @@ -76,23 +76,23 @@ Fractional second or one thousand-millionth of a second. A timestamp always exists, either set by the Input plugin or discovered through a data parsing process. {% endhint %} -### Match +## Match Fluent Bit allows to deliver your collected and processed Events to one or multiple destinations, this is done through a routing phase. A Match represent a simple rule to select Events where it Tags maches a defined rule. FIXME: More about Tag and Matchs in the Routing section. -### Structured Message +## Structured Message Events can have or not have a structure. A structure defines a set of _keys_ and _values_ inside the Event message. As an example consider the following two messages: -#### No structured message +### No structured message ```javascript "Project Fluent Bit created on 1398289291" ``` -#### Structured Message +### Structured Message ```javascript {"project": "Fluent Bit", "created": 1398289291} @@ -101,8 +101,8 @@ Events can have or not have a structure. A structure defines a set of _keys_ and At a low level both are just an array of bytes, but the Structured message defines _keys_ and _values_, having a structure helps to implement faster operations on data modifications. {% hint style="info" %} -Fluent Bit **always** handle every Event message as a structured message. For performance reasons, we use a binary serialization data format called [MessagePack](https://msgpack.org/). +Fluent Bit **always** handle every Event message as a structured message. For performance reasons, we use a binary serialization data format called [MessagePack](https://msgpack.org/). -Consider [MessagePack](https://msgpack.org/) as a binary version of JSON on steroids. +Consider [MessagePack](https://msgpack.org/) as a binary version of JSON on steroids. {% endhint %} diff --git a/development/golang-output-plugins.md b/development/golang-output-plugins.md index 2b15b66d4..764dc2d57 100644 --- a/development/golang-output-plugins.md +++ b/development/golang-output-plugins.md @@ -1,44 +1,40 @@ -# Fluent Bit and Golang Plugins +# Golang Output Plugins -Fluent Bit currently supports integration of Golang plugins built as shared -objects for output plugins only. The interface for the Golang plugins is -currently under development but is functional. +Fluent Bit currently supports integration of Golang plugins built as shared objects for output plugins only. The interface for the Golang plugins is currently under development but is functional. 
## Getting Started Compile Fluent Bit with Golang support, e.g: -``` +```text $ cd build/ $ cmake -DFLB_DEBUG=On -DFLB_PROXY_GO=On ../ $ make ``` -Once compiled, we can see a new option in the binary `-e` which stands for -_external plugin_, e.g: +Once compiled, we can see a new option in the binary `-e` which stands for _external plugin_, e.g: -``` +```text $ bin/fluent-bit -h Usage: fluent-bit [OPTION] Available Options - -c --config=FILE specify an optional configuration file - -d, --daemon run Fluent Bit in background mode - -f, --flush=SECONDS flush timeout in seconds (default: 5) - -i, --input=INPUT set an input - -m, --match=MATCH set plugin match, same as '-p match=abc' - -o, --output=OUTPUT set an output - -p, --prop="A=B" set plugin configuration property - -e, --plugin=FILE load an external plugin (shared lib) + -c --config=FILE specify an optional configuration file + -d, --daemon run Fluent Bit in background mode + -f, --flush=SECONDS flush timeout in seconds (default: 5) + -i, --input=INPUT set an input + -m, --match=MATCH set plugin match, same as '-p match=abc' + -o, --output=OUTPUT set an output + -p, --prop="A=B" set plugin configuration property + -e, --plugin=FILE load an external plugin (shared lib) ... ``` ## Build a Go Plugin -The _fluent-bit-go_ package is available to assist developers in creating Go -plugins. +The _fluent-bit-go_ package is available to assist developers in creating Go plugins. -https://github.com/fluent/fluent-bit-go +[https://github.com/fluent/fluent-bit-go](https://github.com/fluent/fluent-bit-go) At a minimum, a Go plugin looks like this: @@ -50,7 +46,7 @@ import "github.com/fluent/fluent-bit-go/output" //export FLBPluginRegister func FLBPluginRegister(def unsafe.Pointer) int { // Gets called only once when the plugin.so is loaded - return output.FLBPluginRegister(ctx, "gstdout", "Stdout GO!") + return output.FLBPluginRegister(ctx, "gstdout", "Stdout GO!") } //export FLBPluginInit @@ -67,16 +63,14 @@ func FLBPluginFlushCtx(ctx, data unsafe.Pointer, length C.int, tag *C.char) int //export FLBPluginExit func FLBPluginExit() int { - return output.FLB_OK + return output.FLB_OK } func main() { } ``` -the code above is a template to write an output plugin, it's really important -to keep the package name as `main` and add an explicit `main()` function. -This is a requirement as the code will be build as a shared library. +the code above is a template to write an output plugin, it's really important to keep the package name as `main` and add an explicit `main()` function. This is a requirement as the code will be build as a shared library. To build the code above, use the following line: @@ -84,16 +78,14 @@ To build the code above, use the following line: $ go build -buildmode=c-shared -o out_gstdout.so out_gstdout.go ``` -Once built, a shared library called `out\_gstdout.so` will be available. It's -really important to double check the final .so file is what we expect. Doing a -`ldd` over the library we should see something similar to this: +Once built, a shared library called `out\_gstdout.so` will be available. It's really important to double check the final .so file is what we expect. 
Doing a `ldd` over the library we should see something similar to this: -``` +```text $ ldd out_gstdout.so - linux-vdso.so.1 => (0x00007fff561dd000) - libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fc4aeef0000) - libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc4aeb27000) - /lib64/ld-linux-x86-64.so.2 (0x000055751a4fd000) + linux-vdso.so.1 => (0x00007fff561dd000) + libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fc4aeef0000) + libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc4aeb27000) + /lib64/ld-linux-x86-64.so.2 (0x000055751a4fd000) ``` ## Run Fluent Bit with the new plugin @@ -101,3 +93,4 @@ $ ldd out_gstdout.so ```bash $ bin/fluent-bit -e /path/to/out_gstdout.so -i cpu -o gstdout ``` + diff --git a/development/library_api.md b/development/library_api.md index 4562e0cc1..1e95bbb2e 100644 --- a/development/library_api.md +++ b/development/library_api.md @@ -1,4 +1,4 @@ -# Library API +# C Library API [Fluent Bit](http://fluentbit.io) library is written in C language and can be used from any C or C++ application. Before digging into the specification it is recommended to understand the workflow involved in the runtime. @@ -283,5 +283,5 @@ On success, it returns the number of bytes written; on error it returns -1. **Usage** -For more details and an example about how to use this function properly please refer to the next section [Ingest Records Manually](ingest_records_manually.md). +For more details and an example about how to use this function properly please refer to the next section [Ingest Records Manually](https://github.com/fluent/fluent-bit-docs/tree/f3e2d207298adf4011a0a7f4a82331362f9d3197/development/ingest_records_manually.md). diff --git a/imgs/logo_documentation_1.3.png b/imgs/logo_documentation_1.3.png new file mode 100644 index 000000000..9ae6297e5 Binary files /dev/null and b/imgs/logo_documentation_1.3.png differ diff --git a/installation/docker.md b/installation/docker.md index 592a7dec7..3e707520e 100644 --- a/installation/docker.md +++ b/installation/docker.md @@ -1,4 +1,4 @@ -# Docker Images +# Docker Fluent Bit container images are available on Docker Hub ready for production usage. Current available images can be deployed in multiple architectures. @@ -6,27 +6,27 @@ Fluent Bit container images are available on Docker Hub ready for production usa The following table describe the tags are available on Docker Hub [fluent/fluent-bit](https://hub.docker.com/r/fluent/fluent-bit/) repository: -| Tag\(s\) | Manifest Architectures | Description | -| :--------------------- | ------------------------ | ----------------------------------------------------------- | -| 1.3 | x86_64, arm64v8, arm32v7 | Latest release of 1.3.x series. | -| 1.3.0 | x86_64, arm64v8, arm32v7 | Release [v1.3.0](https://fluentbit.io/announcements/v1.3.0) | -| 1.3-debug, 1.3.0-debug | x86_64 | v1.3.x releases + Busybox | +| Tag\(s\) | Manifest Architectures | Description | +| :--- | :--- | :--- | +| 1.3 | x86\_64, arm64v8, arm32v7 | Latest release of 1.3.x series. | +| 1.3.0 | x86\_64, arm64v8, arm32v7 | Release [v1.3.0](https://fluentbit.io/announcements/v1.3.0) | +| 1.3-debug, 1.3.0-debug | x86\_64 | v1.3.x releases + Busybox | It's strongly suggested that you always use the latest image of Fluent Bit. ## Multi Architecture Images -Our x86_64 stable image is based in [Distroless](https://github.com/GoogleContainerTools/distroless) focusing on security containing just the Fluent Bit binary and minimal system libraries and basic configuration. 
Optionally, we provide _debug_ images for x86_64 which contains Busybox that can be used to troubleshoot or testing purposes. +Our x86_64 stable image is based in_ [_Distroless_](https://github.com/GoogleContainerTools/distroless) _focusing on security containing just the Fluent Bit binary and minimal system libraries and basic configuration. Optionally, we provide \_debug_ images for x86\_64 which contains Busybox that can be used to troubleshoot or testing purposes. -In addition, the main manifest provides images for arm64v8 and arm32v7 architctectures. From a deployment perspective there is no need to specify an architecture, the container client tool that pulls the image gets the proper layer for the running architecture. +In addition, the main manifest provides images for arm64v8 and arm32v7 architctectures. From a deployment perspective there is no need to specify an architecture, the container client tool that pulls the image gets the proper layer for the running architecture. For every architecture we build the layers using the following base images: -| Architecture | Base Image | -| ------------ | ------------------------------------------------------------ | -| x86_64 | [Distroless](https://github.com/GoogleContainerTools/distroless) | -| arm64v8 | arm64v8/debian:buster-slim | -| arm32v7 | arm32v7/debian:buster-slim | +| Architecture | Base Image | +| :--- | :--- | +| x86\_64 | [Distroless](https://github.com/GoogleContainerTools/distroless) | +| arm64v8 | arm64v8/debian:buster-slim | +| arm32v7 | arm32v7/debian:buster-slim | ## Getting Started @@ -58,14 +58,14 @@ Copyright (C) Treasure Data Alpine Linux uses Musl C library instead of Glibc. Musl is not fully compatible with Glibc which generated many issues in the following areas when used with Fluent Bit: -- Memory Allocator: to run Fluent Bit properly in high-load environments, we use Jemalloc as a default memory allocator which reduce fragmentation and provides better performance for our needs. Jemalloc cannot run smoothly with Musl and requires extra work. -- Alpine Linux Musl functions bootstrap have a compatibility issue when loading Golang shared libraries, this generate problems when trying to load Golang output plugins in Fluent Bit. -- Alpine Linux Musl Time format parser does not support Glibc extensions -- Maintainers preference in terms of base image due to security and maintenance reasons are Distroless and Debian. +* Memory Allocator: to run Fluent Bit properly in high-load environments, we use Jemalloc as a default memory allocator which reduce fragmentation and provides better performance for our needs. Jemalloc cannot run smoothly with Musl and requires extra work. +* Alpine Linux Musl functions bootstrap have a compatibility issue when loading Golang shared libraries, this generate problems when trying to load Golang output plugins in Fluent Bit. +* Alpine Linux Musl Time format parser does not support Glibc extensions +* Maintainers preference in terms of base image due to security and maintenance reasons are Distroless and Debian. ### Where 'latest' Tag points to ? -Our Docker containers images are deployed thousands of times per day, we take security and stability very seriously. +Our Docker containers images are deployed thousands of times per day, we take security and stability very seriously. -The _latest_ tag _most of the time_ points to the latest stable image. 
When we release a major update to Fluent Bit like for example from v1.2.x to v1.3.0, we don't move _latest_ tag until 2 weeks after the release. That give us extra time to verify with our community that everything works as expected. +The _latest_ tag _most of the time_ points to the latest stable image. When we release a major update to Fluent Bit like for example from v1.2.x to v1.3.0, we don't move _latest_ tag until 2 weeks after the release. That give us extra time to verify with our community that everything works as expected. diff --git a/installation/kubernetes.md b/installation/kubernetes.md index 45aab1c4f..413098b8d 100644 --- a/installation/kubernetes.md +++ b/installation/kubernetes.md @@ -13,7 +13,7 @@ Content: * [Concepts](kubernetes.md#concepts) * [Installation Steps](kubernetes.md#installation) -## Concepts {#concepts} +## Concepts Before geting started it is important to understand how Fluent Bit will be deployed. Kubernetes manages a cluster of _nodes_, so our log agent tool will need to run on every node to collect logs from every _POD_, hence Fluent Bit is deployed as a DaemonSet \(a POD that runs on every _node_ of the cluster\). @@ -30,7 +30,7 @@ To obtain these information, a built-in filter plugin called _kubernetes_ talks > Our Kubernetes Filter plugin is fully inspired on the [Fluentd Kubernetes Metadata Filter](https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter) written by [Jimmi Dyson](https://github.com/jimmidyson). -## Installation {#installation} +## Installation [Fluent Bit](http://fluentbit.io) must be deployed as a DaemonSet, so on that way it will be available on every node of your Kubernetes cluster. To get started run the following commands to create the namespace, service account and role setup: @@ -49,9 +49,9 @@ $ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernet ### Note for Kubernetes v1.16 -Starting from Kubernetes v1.16, DaemonSet resources are not longer served from ```extensions/v1beta``` . Our current Daemonset Yaml files uses the old ```apiVersion```. +Starting from Kubernetes v1.16, DaemonSet resources are not longer served from `extensions/v1beta` . Our current Daemonset Yaml files uses the old `apiVersion`. -If you are using Kubernetes v1.16, grab manually a copy of your Daemonset Yaml file and replace the value of ```apiVersion``` from: +If you are using Kubernetes v1.16, grab manually a copy of your Daemonset Yaml file and replace the value of `apiVersion` from: ```yaml apiVersion: extensions/v1beta1 @@ -65,7 +65,7 @@ apiVersion: apps/v1 You can read more about this deprecation on Kubernetes v1.14 Changelog here: -https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#deprecations +[https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md\#deprecations](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#deprecations) ### Fluent Bit to Elasticsearch diff --git a/installation/linux/debian.md b/installation/linux/debian.md index cc537634d..39cad4833 100644 --- a/installation/linux/debian.md +++ b/installation/linux/debian.md @@ -1,6 +1,6 @@ -# Debian Packages +# Debian -Fluent Bit is distributed as **td-agent-bit** package and is available for the latest (and old) stable Debian systems: Buster, Stretch and Jessie. This stable Fluent Bit distribution package is maintained by [Treasure Data, Inc](https://www.treasuredata.com). 
+Fluent Bit is distributed as **td-agent-bit** package and is available for the latest \(and old\) stable Debian systems: Buster, Stretch and Jessie. This stable Fluent Bit distribution package is maintained by [Treasure Data, Inc](https://www.treasuredata.com). ## Server GPG key @@ -14,9 +14,9 @@ $ wget -qO - https://packages.fluentbit.io/fluentbit.key | sudo apt-key add - On Debian, you need to add our APT server entry to your sources lists, please add the following content at bottom of your **/etc/apt/sources.list** file: -#### Debian 10 (Buster) +#### Debian 10 \(Buster\) -``` +```text deb https://packages.fluentbit.io/debian/buster buster main ``` diff --git a/installation/linux/raspbian-raspberry-pi.md b/installation/linux/raspbian-raspberry-pi.md index 6714d0ee8..ed56b2dd9 100644 --- a/installation/linux/raspbian-raspberry-pi.md +++ b/installation/linux/raspbian-raspberry-pi.md @@ -1,4 +1,4 @@ -# Raspberry Pi +# Raspbian / Raspberry Pi Fluent Bit is distributed as **td-agent-bit** package and is available for the Raspberry, specifically for [Raspbian 8](http://raspbian.org). This stable Fluent Bit distribution package is maintained by [Treasure Data, Inc](https://www.treasuredata.com). @@ -65,3 +65,4 @@ sudo service td-agent-bit status ``` The default configuration of **td-agent-bit** is collecting metrics of CPU usage and sending the records to the standard output, you can see the outgoing data in your _/var/log/syslog_ file. + diff --git a/installation/linux/redhat-centos.md b/installation/linux/redhat-centos.md index 9af516a87..62436fe1a 100644 --- a/installation/linux/redhat-centos.md +++ b/installation/linux/redhat-centos.md @@ -1,4 +1,4 @@ -# CentOS Packages +# Redhat / CentOS ## Install on Redhat / CentOS @@ -17,7 +17,7 @@ gpgkey=https://packages.fluentbit.io/fluentbit.key enabled=1 ``` -note: we encourage you always enable the _gpgcheck_ for security reasons. All our packages are signed. +note: we encourage you always enable the _gpgcheck_ for security reasons. All our packages are signed. The GPG Key fingerprint is `F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A` diff --git a/installation/linux/ubuntu.md b/installation/linux/ubuntu.md index 14b5a226d..70884f321 100644 --- a/installation/linux/ubuntu.md +++ b/installation/linux/ubuntu.md @@ -1,4 +1,4 @@ -# Ubuntu Packages +# Ubuntu Fluent Bit is distributed as **td-agent-bit** package and is available for the latest stable Ubuntu system: Xenial Xerus. This stable Fluent Bit distribution package is maintained by [Treasure Data, Inc](https://www.treasuredata.com). diff --git a/installation/requirements.md b/installation/requirements.md index 62b5618fd..3fb0fbaed 100644 --- a/installation/requirements.md +++ b/installation/requirements.md @@ -4,7 +4,8 @@ * Compiler: GCC or clang * CMake -* Flex (only if Stream Processor is enabled) -* Bison (only if Stream Processor is enabled) +* Flex \(only if Stream Processor is enabled\) +* Bison \(only if Stream Processor is enabled\) There are not other dependencies besides _libc_ and _pthreads_ in the most basic mode. For certain features that depends on third party components, those are included in the main source code repository. 
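With those requirements in place, a minimal out-of-tree build is sketched below for orientation only; the exact steps, options and install paths are the ones described in the Build and Install section whose options tables follow. This is a hedged sketch, not the authoritative procedure:

```bash
# Sketch: configure and compile from the source tree's build/ directory,
# assuming GCC or clang plus CMake are installed (Flex/Bison only if the
# Stream Processor is enabled). Root privileges may be needed for install.
$ cd fluent-bit/build/
$ cmake ../
$ make
$ sudo make install
```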
+ diff --git a/installation/sources/build-and-install.md b/installation/sources/build-and-install.md index 36e443bc0..da7a162be 100644 --- a/installation/sources/build-and-install.md +++ b/installation/sources/build-and-install.md @@ -65,7 +65,7 @@ it's likely you may need root privileges so you can try to prefixing the command ## Build Options -Fluent Bit provides certain options to CMake that can be enabled or disabled when configuring, please refer to the following tables under the _General Options_, _Development Options_, Input Plugins_ and _Output Plugins_ sections. +Fluent Bit provides certain options to CMake that can be enabled or disabled when configuring, please refer to the following tables under the _General Options_, _Development Options_, Input Plugins _and \_Output Plugins_ sections. ### General Options @@ -78,12 +78,12 @@ Fluent Bit provides certain options to CMake that can be enabled or disabled whe | FLB\_EXAMPLES | Build examples | Yes | | FLB\_SHARED\_LIB | Build shared library | Yes | | FLB\_MTRACE | Enable mtrace support | No | -| FLB_INOTIFY | Enable Inotify support | Yes | +| FLB\_INOTIFY | Enable Inotify support | Yes | | FLB\_POSIX\_TLS | Force POSIX thread storage | No | -| FLB_SQLDB | Enable SQL embedded database support | No | -| FLB_HTTP_SERVER | Enable HTTP Server | No | -| FLB_LUAJIT | Enable Lua scripting support | Yes | -| FLB_STATIC_CONF | Build binary using static configuration files. The value of this option must be a directory containing configuration files. | | +| FLB\_SQLDB | Enable SQL embedded database support | No | +| FLB\_HTTP\_SERVER | Enable HTTP Server | No | +| FLB\_LUAJIT | Enable Lua scripting support | Yes | +| FLB\_STATIC\_CONF | Build binary using static configuration files. The value of this option must be a directory containing configuration files. 
| | ### Development Options @@ -93,10 +93,10 @@ Fluent Bit provides certain options to CMake that can be enabled or disabled whe | FLB\_VALGRIND | Enable Valgrind support | No | | FLB\_TRACE | Enable trace mode | No | | FLB\_SMALL | Minimise binary size | No | -| FLB_TESTS_RUNTIME | Enable runtime tests | No | -| FLB_TESTS_INTERNAL | Enable internal tests | No | +| FLB\_TESTS\_RUNTIME | Enable runtime tests | No | +| FLB\_TESTS\_INTERNAL | Enable internal tests | No | | FLB\_TESTS | Enable tests | No | -| FLB_BACKTRACE | Enable backtrace/stacktrace support | Yes | +| FLB\_BACKTRACE | Enable backtrace/stacktrace support | Yes | ### Input Plugins @@ -104,18 +104,18 @@ The _input plugins_ provides certain features to gather information from a speci | option | description | default | | :--- | :--- | :--- | -| [FLB\_IN\_CPU](../input/cpu.md) | Enable CPU input plugin | On | -| [FLB\_IN\_FORWARD](../input/forward.md) | Enable Forward input plugin | On | -| [FLB\_IN\_HEAD](../input/head.md) | Enable Head input plugin | On | -| [FLB\_IN\_HEALTH](../input/health.md) | Enable Health input plugin | On | -| [FLB\_IN\_KMSG](../input/kmsg.md) | Enable Kernel log input plugin | On | -| [FLB\_IN\_MEM](../input/mem.md) | Enable Memory input plugin | On | +| [FLB\_IN\_CPU](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/input/cpu.md) | Enable CPU input plugin | On | +| [FLB\_IN\_FORWARD](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/input/forward.md) | Enable Forward input plugin | On | +| [FLB\_IN\_HEAD](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/input/head.md) | Enable Head input plugin | On | +| [FLB\_IN\_HEALTH](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/input/health.md) | Enable Health input plugin | On | +| [FLB\_IN\_KMSG](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/input/kmsg.md) | Enable Kernel log input plugin | On | +| [FLB\_IN\_MEM](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/input/mem.md) | Enable Memory input plugin | On | | FLB\_IN\_RANDOM | Enable Random input plugin | On | -| [FLB\_IN\_SERIAL](../input/serial.md) | Enable Serial input plugin | On | -| [FLB\_IN\_STDIN](../input/stdin.md) | Enable Standard input plugin | On | +| [FLB\_IN\_SERIAL](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/input/serial.md) | Enable Serial input plugin | On | +| [FLB\_IN\_STDIN](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/input/stdin.md) | Enable Standard input plugin | On | | FLB\_IN\_TCP | Enable TCP input plugin | On | -| [FLB\_IN\_THERMAL](../input/thermal.md) | Enable system temperature(s) input plugin | On | -| [FLB\_IN\_MQTT](../input/mqtt.md) | Enable MQTT input plugin | On | +| [FLB\_IN\_THERMAL](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/input/thermal.md) | Enable system temperature\(s\) input plugin | On | +| [FLB\_IN\_MQTT](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/input/mqtt.md) | Enable MQTT input plugin | On | | [FLB\_IN\_XBEE](https://github.com/fluent/fluent-bit-docs/tree/ad9d80e5490bd5d79c86955c5689db1cb4cf89db/input/xbee.md) | Enable 
Xbee input plugin | Off | ### Output Plugins @@ -124,11 +124,12 @@ The _output plugins_ gives the capacity to flush the information to some externa | option | description | default | | :--- | :--- | :--- | -| [FLB\_OUT\_ES](../output/elasticsearch.md) | Enable [Elastic Search](http://www.elastic.co) output plugin | On | -| [FLB\_OUT\_FORWARD](../output/forward.md) | Enable [Fluentd](http://www.fluentd.org) output plugin | On | -| [FLB\_OUT\_HTTP](../output/http.md) | Enable HTTP output plugin | On | -| [FLB\_OUT\_NATS](../output/nats.md) | Enable [NATS](http://www.nats.io) output plugin | Off | +| [FLB\_OUT\_ES](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/output/elasticsearch.md) | Enable [Elastic Search](http://www.elastic.co) output plugin | On | +| [FLB\_OUT\_FORWARD](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/output/forward.md) | Enable [Fluentd](http://www.fluentd.org) output plugin | On | +| [FLB\_OUT\_HTTP](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/output/http.md) | Enable HTTP output plugin | On | +| [FLB\_OUT\_NATS](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/output/nats.md) | Enable [NATS](http://www.nats.io) output plugin | Off | | FLB\_OUT\_PLOT | Enable Plot output plugin | On | -| [FLB\_OUT\_STDOUT](../output/stdout.md) | Enable STDOUT output plugin | On | -| [FLB\_OUT\_TD](../output/td.md) | Enable [Treasure Data](http://www.treasuredata.com) output plugin | On | +| [FLB\_OUT\_STDOUT](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/output/stdout.md) | Enable STDOUT output plugin | On | +| [FLB\_OUT\_TD](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/output/td.md) | Enable [Treasure Data](http://www.treasuredata.com) output plugin | On | | FLB\_OUT\_NULL | Enable /dev/null output plugin | On | + diff --git a/installation/sources/build-with-static-configuration.md b/installation/sources/build-with-static-configuration.md index f542e4fd2..9e1b9dd7f 100644 --- a/installation/sources/build-with-static-configuration.md +++ b/installation/sources/build-with-static-configuration.md @@ -1,6 +1,6 @@ # Build with Static Configuration -[Fluent Bit](https://fluentbit.io) in normal operation mode allows to be configurable through [text files](../configuration/file.md) or using specific arguments in the command line, while this is the ideal deployment case, there are scenarios where a more restricted configuration is required: static configuration mode. +[Fluent Bit](https://fluentbit.io) in normal operation mode allows to be configurable through [text files](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/configuration/file.md) or using specific arguments in the command line, while this is the ideal deployment case, there are scenarios where a more restricted configuration is required: static configuration mode. Static configuration mode aims to include a built-in configuration in the final binary of Fluent Bit, disabling the usage of external files or flags at runtime. 
@@ -8,21 +8,21 @@ Static configuration mode aims to include a built-in configuration in the final ### Requirements -The following steps assumes you are familiar with configuring Fluent Bit using text files and you have experience building it from scratch as described in the [Build and Install](build_install.md) section. +The following steps assumes you are familiar with configuring Fluent Bit using text files and you have experience building it from scratch as described in the [Build and Install](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/sources/build_install.md) section. #### Configuration Directory -In your file system prepare a specific directory that will be used as an entry point for the build system to lookup and parse the configuration files. It is mandatory that this directory contain as a minimum one configuration file called _fluent-bit.conf_ containing the required [SERVICE](../configuration/file.md#config_section), [INPUT](configuration/file.md#config_input) and [OUTPUT](../configuration/file.md#config_output) sections. As an example create a new _fluent-bit.conf_ file with the following content: +In your file system prepare a specific directory that will be used as an entry point for the build system to lookup and parse the configuration files. It is mandatory that this directory contain as a minimum one configuration file called _fluent-bit.conf_ containing the required [SERVICE](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/configuration/file.md#config_section), [INPUT](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/sources/configuration/file.md#config_input) and [OUTPUT](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/installation/configuration/file.md#config_output) sections. As an example create a new _fluent-bit.conf_ file with the following content: ```python [SERVICE] Flush 1 Daemon off Log_Level info - + [INPUT] Name cpu - + [OUTPUT] Name stdout Match * @@ -32,7 +32,7 @@ the configuration provided above will calculate CPU metrics from the running sys #### Build with Custom Configuration - Inside Fluent Bit source code, get into the build/ directory and run CMake appending the FLB_STATIC_CONF option pointing the configuration directory recently created, e.g: +Inside Fluent Bit source code, get into the build/ directory and run CMake appending the FLB\_STATIC\_CONF option pointing the configuration directory recently created, e.g: ```bash $ cd fluent-bit/build/ @@ -54,6 +54,5 @@ Copyright (C) Treasure Data [2018/10/19 15:32:31] [ info] [engine] started (pid=15186) [0] cpu.local: [1539984752.000347547, {"cpu_p"=>0.750000, "user_p"=>0.500000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}] - ``` diff --git a/installation/supported-platforms.md b/installation/supported-platforms.md index fc75ef790..49e390fc2 100644 --- a/installation/supported-platforms.md +++ b/installation/supported-platforms.md @@ -2,16 +2,17 @@ The following operating systems and architectures are supported in Fluent Bit. 
-| Operating System | Distribution | Architecture | -| ---------------- | ---------------------------- | ------------ | -| Linux | Centos 7 | x86_64 | -| | Debian 8 (Jessie) | x86_64 | -| | Debian 9 (Stretch) | x86_64 | -| | Raspbian 8 (Debian Jessie) | AArch32 | -| | Raspbian 9 (Debian Stretch) | AArch32 | -| | Ubuntu 16.04 (Xenial Xerus) | x86_64 | -| | Ubuntu 18.04 (Bionic Beaver) | x86_64 | +| Operating System | Distribution | Architecture | +| :--- | :--- | :--- | +| Linux | Centos 7 | x86\_64 | +| | Debian 8 \(Jessie\) | x86\_64 | +| | Debian 9 \(Stretch\) | x86\_64 | +| | Raspbian 8 \(Debian Jessie\) | AArch32 | +| | Raspbian 9 \(Debian Stretch\) | AArch32 | +| | Ubuntu 16.04 \(Xenial Xerus\) | x86\_64 | +| | Ubuntu 18.04 \(Bionic Beaver\) | x86\_64 | -From an architecture support perspective, Fluent Bit is fully functional on x86, x86_64, AArch32 and AArch64 based processors. +From an architecture support perspective, Fluent Bit is fully functional on x86, x86\_64, AArch32 and AArch64 based processors. + +Fluent Bit can work also on OSX and \*BSD systems, but not all plugins will be available on all platforms. Official support will be expanding based on community demand. -Fluent Bit can work also on OSX and *BSD systems, but not all plugins will be available on all platforms. Official support will be expanding based on community demand. \ No newline at end of file diff --git a/installation/upgrade-notes.md b/installation/upgrade-notes.md index 9fc23f8bd..58f14dbcd 100644 --- a/installation/upgrade-notes.md +++ b/installation/upgrade-notes.md @@ -1,6 +1,6 @@ -# Upgrading Notes +# Upgrade Notes -The following article cover the relevant notes for users upgrading from previous Fluent Bit versions. We aim to cover compatibility changes that you must be aware of. +The following article cover the relevant notes for users upgrading from previous Fluent Bit versions. We aim to cover compatibility changes that you must be aware of. For more details about changes on each release please refer to the [Official Release Notes](https://fluentbit.io/announcements/). @@ -12,9 +12,9 @@ If you are migrating from Fluent Bit v1.2 to v1.3, there are not breaking change ### Docker, JSON, Parsers and Decoders -On Fluent Bit v1.2 we have fixed many issues associated with JSON encoding and decoding, for hence when parsing Docker logs __is no longer necessary__ to use decoders. The new Docker parser looks like this: +On Fluent Bit v1.2 we have fixed many issues associated with JSON encoding and decoding, for hence when parsing Docker logs **is no longer necessary** to use decoders. The new Docker parser looks like this: -``` +```text [PARSER] Name docker Format json @@ -31,7 +31,7 @@ We have done improvements also on how Kubernetes Filter handle the stringified _ In addition, we have fixed and improved the option called _Merge\_Log\_Key_. If a merge log succeed, all new keys will be packaged under the key specified by this option, a suggested configuration is as follows: -``` +```text [FILTER] Name Kubernetes Match kube.* @@ -42,27 +42,25 @@ In addition, we have fixed and improved the option called _Merge\_Log\_Key_. 
If As an example, if the original log content is the following map: -```json +```javascript {"key1": "val1", "key2": "val2"} ``` the final record will be composed as follows: -```json +```javascript { - "log": "{\"key1\": \"val1\", \"key2\": \"val2\"}", - "log_processed": { - "key1": "val1", - "key2": "val2" - } + "log": "{\"key1\": \"val1\", \"key2\": \"val2\"}", + "log_processed": { + "key1": "val1", + "key2": "val2" + } } ``` - - ## Fluent Bit v1.1 -If you are upgrading from **Fluent Bit <= 1.0.x** you should take in consideration the following relevant changes when switching to **Fluent Bit v1.1** series: +If you are upgrading from **Fluent Bit <= 1.0.x** you should take in consideration the following relevant changes when switching to **Fluent Bit v1.1** series: ### Kubernetes Filter @@ -70,7 +68,7 @@ We introduced a new configuration property called _Kube\_Tag\_Prefix_ to help Ta During 1.0.x release cycle, a commit in Tail input plugin changed the default behavior on how the Tag was composed when using the wildcard for expansion generating breaking compatibility with other services. Consider the following configuration example: -``` +```text [INPUT] Name tail Path /var/log/containers/*.log @@ -79,13 +77,13 @@ During 1.0.x release cycle, a commit in Tail input plugin changed the default be The expected behavior is that Tag will be expanded to: -``` +```text kube.var.log.containers.apache.log ``` but the change introduced in 1.0 series switched from absolute path to the base file name only: -``` +```text kube.apache.log ``` @@ -95,7 +93,7 @@ On Fluent Bit v1.1 release we restored to our default behavior and now the Tag i This behavior switch in Tail input plugin affects how Filter Kubernetes operates. As you know when the filter is used it needs to perform local metadata lookup that comes from the file names when using Tail as a source. Now with the new _Kube\_Tag\_Prefix_ option you can specify what's the prefix used in Tail input plugin, for the configuration example above the new configuration will look as follows: -``` +```text [INPUT] Name tail Path /var/log/containers/*.log diff --git a/installation/windows.md b/installation/windows.md index eeedb7c09..c6f0c130e 100644 --- a/installation/windows.md +++ b/installation/windows.md @@ -1,13 +1,13 @@ -# Windows Packages +# Windows -Fluent Bit is distributed as **td-agent-bit** package for Windows. Fluent Bit has two flavours of Windows installers: a ZIP archive (for quick testing) and an EXE installer (for system installation). +Fluent Bit is distributed as **td-agent-bit** package for Windows. Fluent Bit has two flavours of Windows installers: a ZIP archive \(for quick testing\) and an EXE installer \(for system installation\). ## Installation Packages The latest stable version is 1.2.2. -| INSTALLERS | SHA256 CHECKSUMS | -| ---------------------------- | ---------------------------------------------------------------- | +| INSTALLERS | SHA256 CHECKSUMS | +| :--- | :--- | | td-agent-bit-1.2.2-win32.exe | b9e2695e6cc1b15e0e47d20624d6509cecbdd1767b6681751190f54e52832b6a | | td-agent-bit-1.2.2-win32.zip | 4212e28fb6cb970ce9f27439f8dce3281ab544a4f7c9ae71991480a7e7a64afd | | td-agent-bit-1.2.2-win64.exe | 6059e2f4892031125aac30325be4c167daed89742705fa883d34d91dc306645e | @@ -34,16 +34,16 @@ The ZIP package contains the following set of files. 
```text td-agent-bit ├── bin -│   ├── fluent-bit.dll -│   └── fluent-bit.exe +│ ├── fluent-bit.dll +│ └── fluent-bit.exe ├── conf -│   ├── fluent-bit.conf -│   ├── parsers.conf -│   └── plugins.conf +│ ├── fluent-bit.conf +│ ├── parsers.conf +│ └── plugins.conf └── include - │   ├── flb_api.h - │   ├── ... - │   └── flb_worker.h + │ ├── flb_api.h + │ ├── ... + │ └── flb_worker.h └── fluent-bit.h ``` @@ -79,10 +79,11 @@ Download an EXE installer from the [download page](https://fluentbit.io/download Then, double-click the EXE installer you've downloaded. Installation wizard will automatically start. -![](../imgs/windows_installer.png) +![](https://github.com/fluent/fluent-bit-docs/tree/8ab2f4cda8dfdd8def7fa0cf5c7ffc23069e5a70/imgs/windows_installer.png) Click Next and proceed. By default, Fluent Bit is installed into `C:\Program Files\td-agent-bit\`, so you should be able to launch fluent-bit as follow after installation. ```text PS> C:\Program Files\td-agent-bit\bin\fluent-bit.exe -i dummy -o stdout ``` + diff --git a/installation/yocto-embedded-linux.md b/installation/yocto-embedded-linux.md index 56aceb409..5180de708 100644 --- a/installation/yocto-embedded-linux.md +++ b/installation/yocto-embedded-linux.md @@ -1,16 +1,17 @@ -# Yocto Project +# Yocto / Embedded Linux -[Fluent Bit](https://fluentbit.io) source code provides Bitbake recipes to configure, build and package the software for a Yocto based image. Note that specific steps of usage of these recipes in your Yocto environment (Poky) is out of the scope of this documentation. +[Fluent Bit](https://fluentbit.io) source code provides Bitbake recipes to configure, build and package the software for a Yocto based image. Note that specific steps of usage of these recipes in your Yocto environment \(Poky\) is out of the scope of this documentation. We distribute two main recipes, one for testing/dev purposes and other with the latest stable release. -| Version | Recipe | Description | -| ------- | ------------------------------------------------------------ | ------------------------------------------------------------ | -| devel | [fluent-bit_git.bb](https://github.com/fluent/fluent-bit/blob/master/fluent-bit_git.bb) | Build Fluent Bit from GIT master. This recipe aims to be used for development and testing purposes only. | -| v1.2.0 | [fluent-bit_1.2.0.bb](https://github.com/fluent/fluent-bit/blob/1.2/fluent-bit_1.2.0.bb) | Build latest stable version of Fluent Bit. | +| Version | Recipe | Description | +| :--- | :--- | :--- | +| devel | [fluent-bit\_git.bb](https://github.com/fluent/fluent-bit/blob/master/fluent-bit_git.bb) | Build Fluent Bit from GIT master. This recipe aims to be used for development and testing purposes only. | +| v1.2.0 | [fluent-bit\_1.2.0.bb](https://github.com/fluent/fluent-bit/blob/1.2/fluent-bit_1.2.0.bb) | Build latest stable version of Fluent Bit. | It's strongly recommended to always use the stable release of Fluent Bit recipe and not the one from GIT master for production deployments. -### Fluent Bit v1.1 and native ARMv8 (aarch64) support +## Fluent Bit v1.1 and native ARMv8 \(aarch64\) support + +Fluent Bit >= v1.1.x already integrates native AArch64 support where stack switches for co-routines are done through native ASM calls, on this scenario there is no issues as the one faced in previous series. 
-Fluent Bit >= v1.1.x already integrates native AArch64 support where stack switches for co-routines are done through native ASM calls, on this scenario there is no issues as the one faced in previous series. diff --git a/pipeline/filters/aws-metadata.md b/pipeline/filters/aws-metadata.md index fec395b2e..d235959b5 100644 --- a/pipeline/filters/aws-metadata.md +++ b/pipeline/filters/aws-metadata.md @@ -1,4 +1,4 @@ -# AWS +# AWS Metadata The _AWS Filter_ Enriches logs with AWS Metadata. Currently the plugin adds the EC2 instance ID and availability zone to log records. To use this plugin, you must be running in EC2 and have the [instance metadata service enabled](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html). @@ -8,9 +8,9 @@ The plugin supports the following configuration parameters: | Key | Description | Default | | :--- | :--- | :--- | -| imds_version | Specify which version of the instance metadata service to use. Valid values are 'v1' or 'v2'. | v2 | +| imds\_version | Specify which version of the instance metadata service to use. Valid values are 'v1' or 'v2'. | v2 | -Note: *If you run Fluent Bit in a container, you may have to use instance metadata v1.* The plugin behaves the same regardless of which version is used. +Note: _If you run Fluent Bit in a container, you may have to use instance metadata v1._ The plugin behaves the same regardless of which version is used. ## Usage @@ -21,11 +21,11 @@ Currently, the plugin only adds the instance ID and availability zone. AWS plans | Key | Value | | :--- | :--- | | az | The [availability zone](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html); for example, "us-east-1a". | -| ec2_instance_id | The EC2 instance ID. | +| ec2\_instance\_id | The EC2 instance ID. | ### Command Line -``` +```text $ bin/fluent-bit -i dummy -F aws -m '*' -o stdout [2020/01/17 07:57:17] [ info] [engine] started (pid=32744) @@ -35,7 +35,7 @@ $ bin/fluent-bit -i dummy -F aws -m '*' -o stdout ### Configuration File -``` +```text [INPUT] Name dummy Tag dummy @@ -49,3 +49,4 @@ $ bin/fluent-bit -i dummy -F aws -m '*' -o stdout Name stdout Match * ``` + diff --git a/pipeline/filters/grep.md b/pipeline/filters/grep.md index 71dc24193..fca790063 100644 --- a/pipeline/filters/grep.md +++ b/pipeline/filters/grep.md @@ -59,7 +59,7 @@ The filter allows to use multiple rules which are applied in order, you can have Currently nested fields are not supported. If you have records in the following format -```json +```javascript { "kubernetes": { "pod_name": "myapp-0", @@ -75,7 +75,7 @@ Currently nested fields are not supported. If you have records in the following } ``` -and if you want to exclude records that match given nested field (for example `kubernetes.labels.app`), you could use combination of [nest](https://docs.fluentbit.io/manual/v/1.0/filter/nest) and grep filters. Here is an example that will exclude records that match `kubernetes.labels.app: myapp`: +and if you want to exclude records that match given nested field \(for example `kubernetes.labels.app`\), you could use combination of [nest](https://docs.fluentbit.io/manual/v/1.0/filter/nest) and grep filters. 
Here is an example that will exclude records that match `kubernetes.labels.app: myapp`: ```python [FILTER] @@ -95,3 +95,4 @@ and if you want to exclude records that match given nested field (for example `k Match * Exclude app myapp ``` + diff --git a/pipeline/filters/kubernetes.md b/pipeline/filters/kubernetes.md index 16aadfa6c..602c77250 100644 --- a/pipeline/filters/kubernetes.md +++ b/pipeline/filters/kubernetes.md @@ -2,7 +2,7 @@ Fluent Bit _Kubernetes Filter_ allows to enrich your log files with Kubernetes metadata. -When Fluent Bit is deployed in Kubernetes as a DaemonSet and configured to read the log files from the containers (using tail or systemd input plugins), this filter aims to perform the following operations: +When Fluent Bit is deployed in Kubernetes as a DaemonSet and configured to read the log files from the containers \(using tail or systemd input plugins\), this filter aims to perform the following operations: * Analyze the Tag and extract the following metadata: * Pod Name @@ -22,33 +22,33 @@ The plugin supports the following configuration parameters: | Key | Description | Default | | :--- | :--- | :--- | -| Buffer\_Size | Set the buffer size for HTTP client when reading responses from Kubernetes API server. The value must be according to the [Unit Size](../configuration/unit_sizes.md) specification. | 32k | -| Kube\_URL | API Server end-point | https://kubernetes.default.svc:443 | -| Kube\_CA\_File | CA certificate file | /var/run/secrets/kubernetes.io/serviceaccount/ca.crt| +| Buffer\_Size | Set the buffer size for HTTP client when reading responses from Kubernetes API server. The value must be according to the [Unit Size](https://github.com/fluent/fluent-bit-docs/tree/b6af8ea32759e6e8250d853b3d21e44b4a5427f8/pipeline/configuration/unit_sizes.md) specification. | 32k | +| Kube\_URL | API Server end-point | [https://kubernetes.default.svc:443](https://kubernetes.default.svc:443) | +| Kube\_CA\_File | CA certificate file | /var/run/secrets/kubernetes.io/serviceaccount/ca.crt | | Kube\_CA\_Path | Absolute path to scan for certificate files | | | Kube\_Token\_File | Token file | /var/run/secrets/kubernetes.io/serviceaccount/token | -| Kube_Tag_Prefix | When the source records comes from Tail input plugin, this option allows to specify what's the prefix used in Tail configuration. | kube.var.log.containers. | +| Kube\_Tag\_Prefix | When the source records comes from Tail input plugin, this option allows to specify what's the prefix used in Tail configuration. | kube.var.log.containers. | | Merge\_Log | When enabled, it checks if the `log` field content is a JSON string map, if so, it append the map fields as part of the log structure. | Off | | Merge\_Log\_Key | When `Merge_Log` is enabled, the filter tries to assume the `log` field from the incoming message is a JSON string message and make a structured representation of it at the same level of the `log` field in the map. Now if `Merge_Log_Key` is set \(a string name\), all the new structured fields taken from the original `log` content are inserted under the new key. | | -| Merge\_Log\_Trim | When `Merge_Log` is enabled, trim (remove possible \n or \r) field values. | On | -| Merge\_Parser | Optional parser name to specify how to parse the data contained in the _log_ key. Recommended use is for developers or testing only. | | -| Keep\_Log | When `Keep_Log` is disabled, the `log` field is removed from the incoming message once it has been successfully merged (`Merge_Log` must be enabled as well). 
| On | +| Merge\_Log\_Trim | When `Merge_Log` is enabled, trim \(remove possible \n or \r\) field values. | On | +| Merge\_Parser | Optional parser name to specify how to parse the data contained in the _log_ key. Recommended use is for developers or testing only. | | +| Keep\_Log | When `Keep_Log` is disabled, the `log` field is removed from the incoming message once it has been successfully merged \(`Merge_Log` must be enabled as well\). | On | | tls.debug | Debug level between 0 \(nothing\) and 4 \(every detail\). | -1 | | tls.verify | When enabled, turns on certificate validation when connecting to the Kubernetes API server. | On | | Use\_Journal | When enabled, the filter reads logs coming in Journald format. | Off | | Regex\_Parser | Set an alternative Parser to process record Tag and extract pod\_name, namespace\_name, container\_name and docker\_id. The parser must be registered in a [parsers file](https://github.com/fluent/fluent-bit/blob/master/conf/parsers.conf) \(refer to parser _filter-kube-test_ as an example\). | | -| K8S-Logging.Parser | Allow Kubernetes Pods to suggest a pre-defined Parser (read more about it in Kubernetes Annotations section) | Off | -| K8S-Logging.Exclude | Allow Kubernetes Pods to exclude their logs from the log processor (read more about it in Kubernetes Annotations section). | Off | +| K8S-Logging.Parser | Allow Kubernetes Pods to suggest a pre-defined Parser \(read more about it in Kubernetes Annotations section\) | Off | +| K8S-Logging.Exclude | Allow Kubernetes Pods to exclude their logs from the log processor \(read more about it in Kubernetes Annotations section\). | Off | | Labels | Include Kubernetes resource labels in the extra metadata. | On | | Annotations | Include Kubernetes resource annotations in the extra metadata. | On | -| Kube\_meta_preload_cache_dir | If set, Kubernetes meta-data can be cached/pre-loaded from files in JSON format in this directory, named as namespace-pod.meta | | -| Dummy\_Meta | If set, use dummy-meta data (for test/dev purposes) | Off | +| Kube\_meta\_preload\_cache\_dir | If set, Kubernetes meta-data can be cached/pre-loaded from files in JSON format in this directory, named as namespace-pod.meta | | +| Dummy\_Meta | If set, use dummy-meta data \(for test/dev purposes\) | Off | ## Processing the 'log' value Kubernetes Filter aims to provide several ways to process the data contained in the _log_ key. The following explanation of the workflow assumes that your original Docker parser defined in _parsers.conf_ is as follows: -``` +```text [PARSER] Name docker Format json @@ -57,29 +57,29 @@ Kubernetes Filter aims to provide several ways to process the data contained in Time_Keep On ``` -> Since Fluent Bit v1.2 we are not suggesting the use of decoders (Decode\_Field\_As) if you are using Elasticsearch database in the output to avoid data type conflicts. +> Since Fluent Bit v1.2 we are not suggesting the use of decoders \(Decode\_Field\_As\) if you are using Elasticsearch database in the output to avoid data type conflicts. -To perform processing of the _log_ key, it's __mandatory to enable__ the _Merge\_Log_ configuration property in this filter, then the following processing order will be done: +To perform processing of the _log_ key, it's **mandatory to enable** the _Merge\_Log_ configuration property in this filter, then the following processing order will be done: -- If a Pod suggest a parser, the filter will use that parser to process the content of _log_. 
-- If the option _Merge\_Parser_ was set and the Pod did not suggest a parser, process the _log_ content using the suggested parser in the configuration. -- If no Pod was suggested and no _Merge\_Parser_ is set, try to handle the content as JSON. +* If a Pod suggest a parser, the filter will use that parser to process the content of _log_. +* If the option _Merge\_Parser_ was set and the Pod did not suggest a parser, process the _log_ content using the suggested parser in the configuration. +* If no Pod was suggested and no _Merge\_Parser_ is set, try to handle the content as JSON. -If _log_ value processing fails, the value is untouched. The order above is not chained, meaning it's exclusive and the filter will try only one of the options above, __not__ all of them. +If _log_ value processing fails, the value is untouched. The order above is not chained, meaning it's exclusive and the filter will try only one of the options above, **not** all of them. ## Kubernetes Annotations A flexible feature of Fluent Bit Kubernetes filter is that allow Kubernetes Pods to suggest certain behaviors for the log processor pipeline when processing the records. At the moment it support: -- Suggest a pre-defined parser -- Request to exclude logs +* Suggest a pre-defined parser +* Request to exclude logs The following annotations are available: -| Annotation | Description | Default | -| ----------------------------------------- | ------------------------------------------------------------ | ------- | -| fluentbit.io/parser[_stream][-container] | Suggest a pre-defined parser. The parser must be registered already by Fluent Bit. This option will only be processed if Fluent Bit configuration (Kubernetes Filter) have enabled the option _K8S-Logging.Parser_. If present, the stream (stdout or stderr) will restrict that specific stream. If present, the container can override a specific container in a Pod. | | -| fluentbit.io/exclude[_stream][-container] | Request to Fluent Bit to exclude or not the logs generated by the Pod. This option will only be processed if Fluent Bit configuration (Kubernetes Filter) have enabled the option _K8S-Logging.Exclude_. | False | +| Annotation | Description | Default | +| :--- | :--- | :--- | +| fluentbit.io/parser\[\_stream\]\[-container\] | Suggest a pre-defined parser. The parser must be registered already by Fluent Bit. This option will only be processed if Fluent Bit configuration \(Kubernetes Filter\) have enabled the option _K8S-Logging.Parser_. If present, the stream \(stdout or stderr\) will restrict that specific stream. If present, the container can override a specific container in a Pod. | | +| fluentbit.io/exclude\[\_stream\]\[-container\] | Request to Fluent Bit to exclude or not the logs generated by the Pod. This option will only be processed if Fluent Bit configuration \(Kubernetes Filter\) have enabled the option _K8S-Logging.Exclude_. | False | ### Annotation Examples in Pod definition @@ -121,13 +121,13 @@ spec: image: edsiper/apache_logs ``` -Note that the annotation value is boolean which can take a _true_ or _false_ and __must__ be quoted. +Note that the annotation value is boolean which can take a _true_ or _false_ and **must** be quoted. ## Workflow of Tail + Kubernetes Filter -Kubernetes Filter depends on either [Tail](../input/tail.md) or [Systemd](../input/systemd.md) input plugins to process and enrich records with Kubernetes metadata. Here we will explain the workflow of Tail and how it configuration is correlated with Kubernetes filter. 
Consider the following configuration example (just for demo purposes, not production): +Kubernetes Filter depends on either [Tail](https://github.com/fluent/fluent-bit-docs/tree/b6af8ea32759e6e8250d853b3d21e44b4a5427f8/pipeline/input/tail.md) or [Systemd](https://github.com/fluent/fluent-bit-docs/tree/b6af8ea32759e6e8250d853b3d21e44b4a5427f8/pipeline/input/systemd.md) input plugins to process and enrich records with Kubernetes metadata. Here we will explain the workflow of Tail and how it configuration is correlated with Kubernetes filter. Consider the following configuration example \(just for demo purposes, not production\): -``` +```text [INPUT] Name tail Tag kube.* @@ -145,35 +145,35 @@ Kubernetes Filter depends on either [Tail](../input/tail.md) or [Systemd](../inp Merge_Log_Key log_processed ``` -In the input section, the [Tail](../input/tail.md) plugin will monitor all files ending in _.log_ in path _/var/log/containers/_. For every file it will read every line and apply the docker parser. Then the records are emitted to the next step with an expanded tag. +In the input section, the [Tail](https://github.com/fluent/fluent-bit-docs/tree/b6af8ea32759e6e8250d853b3d21e44b4a5427f8/pipeline/input/tail.md) plugin will monitor all files ending in _.log_ in path _/var/log/containers/_. For every file it will read every line and apply the docker parser. Then the records are emitted to the next step with an expanded tag. -Tail support Tags expansion, which means that if a tag have a star character (*), it will replace the value with the absolute path of the monitored file, so if you file name and path is: +Tail support Tags expansion, which means that if a tag have a star character \(\*\), it will replace the value with the absolute path of the monitored file, so if you file name and path is: -``` +```text /var/log/container/apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log ``` then the Tag for every record of that file becomes: -``` +```text kube.var.log.containers.apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log ``` > note that slashes are replaced with dots. -When [Kubernetes Filter](kubernetes.md) runs, it will try to match all records that starts with _kube._ (note the ending dot), so records from the file mentioned above will hit the matching rule and the filter will try to enrich the records +When [Kubernetes Filter](kubernetes.md) runs, it will try to match all records that starts with _kube._ \(note the ending dot\), so records from the file mentioned above will hit the matching rule and the filter will try to enrich the records Kubernetes Filter do not care from where the logs comes from, but it cares about the absolute name of the monitored file, because that information contains the pod name and namespace name that are used to retrieve associated metadata to the running Pod from the Kubernetes Master/API Server. -If the configuration property __Kube_Tag_Prefix__ was configured (available on Fluent Bit >= 1.1.x), it will use that value to remove the prefix that was appended to the Tag in the previous Input section. Note that the configuration property defaults to _kube._var.logs.containers. , so the previous Tag content will be transformed from: +If the configuration property **Kube\_Tag\_Prefix** was configured \(available on Fluent Bit >= 1.1.x\), it will use that value to remove the prefix that was appended to the Tag in the previous Input section. 
Note that the configuration property defaults to \_kube.\_var.logs.containers. , so the previous Tag content will be transformed from: -``` +```text kube.var.log.containers.apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log ``` to: -``` +```text apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log ``` @@ -181,20 +181,21 @@ apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1 that new value is used by the filter to lookup the pod name and namespace, for that purpose it uses an internal Regular expression: -``` +```text (?[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?[^_]+)_(?.+)-(?[a-z0-9]{64})\.log$ ``` -> If you want to know more details, check the source code of that definition [here](). +> If you want to know more details, check the source code of that definition [here](https://github.com/fluent/fluent-bit/blob/master/plugins/filter_kubernetes/kube_regex.h#L26>). You can see on [Rublar.com](https://rubular.com/r/HZz3tYAahj6JCd) web site how this operation is performed, check the following demo link: -- [https://rubular.com/r/HZz3tYAahj6JCd](https://rubular.com/r/HZz3tYAahj6JCd) +* [https://rubular.com/r/HZz3tYAahj6JCd](https://rubular.com/r/HZz3tYAahj6JCd) #### Custom Regex -Under certain and not common conditions, a user would want to alter that hard-coded regular expression, for that purpose the option __Regex_Parser__ can be used (documented on top). +Under certain and not common conditions, a user would want to alter that hard-coded regular expression, for that purpose the option **Regex\_Parser** can be used \(documented on top\). #### Final Comments -So at this point the filter is able to gather the values of _pod_name_ and _namespace_, with that information it will check in the local cache (internal hash table) if some metadata for that key pair exists, if so, it will enrich the record with the metadata value, otherwise it will connect to the Kubernetes Master/API Server and retrieve that information. +So at this point the filter is able to gather the values of _pod\_name_ and _namespace_, with that information it will check in the local cache \(internal hash table\) if some metadata for that key pair exists, if so, it will enrich the record with the metadata value, otherwise it will connect to the Kubernetes Master/API Server and retrieve that information. + diff --git a/pipeline/filters/lua.md b/pipeline/filters/lua.md index d76512389..60f657f68 100644 --- a/pipeline/filters/lua.md +++ b/pipeline/filters/lua.md @@ -13,7 +13,7 @@ Content: * [Getting Started](lua.md#getting_started) * [Lua Script Filter API](lua.md#lua_api) -## Configuration Parameters {#config} +## Configuration Parameters The plugin supports the following configuration parameters: @@ -21,11 +21,11 @@ The plugin supports the following configuration parameters: | :--- | :--- | | Script | Path to the Lua script that will be used. | | Call | Lua function name that will be triggered to do filtering. It's assumed that the function is declared inside the Script defined above. | -| Type_int_key | If these keys are matched, the fields are converted to integer. If more than one key, delimit by space | +| Type\_int\_key | If these keys are matched, the fields are converted to integer. 
If more than one key, delimit by space | -## Getting Started {#getting_started} +## Getting Started -In order to test the filter, you can run the plugin from the command line or through the configuration file. The following examples uses the [dummy](../input/dummy.md) input plugin for data ingestion, invoke Lua filter using the [test.lua](https://github.com/fluent/fluent-bit/blob/master/scripts/test.lua) script and calls the [cb\_print\(\)](https://github.com/fluent/fluent-bit/blob/master/scripts/test.lua#L29) function which only print the same information to the standard output: +In order to test the filter, you can run the plugin from the command line or through the configuration file. The following examples uses the [dummy](https://github.com/fluent/fluent-bit-docs/tree/2a0c790c69100939636c7ed50bebe2fa06a3a57f/pipeline/input/dummy.md) input plugin for data ingestion, invoke Lua filter using the [test.lua](https://github.com/fluent/fluent-bit/blob/master/scripts/test.lua) script and calls the [cb\_print\(\)](https://github.com/fluent/fluent-bit/blob/master/scripts/test.lua#L29) function which only print the same information to the standard output: ### Command Line @@ -54,7 +54,7 @@ In your main configuration file append the following _Input_, _Filter_ & _Output Match * ``` -## Lua Script Filter API {#lua_script} +## Lua Script Filter API The life cycle of a filter have the following steps: @@ -99,4 +99,5 @@ For functional examples of this interface, please refer to the code samples prov ### Number Type -In Lua, Fluent Bit treats number as double. It means an integer field (e.g. IDs, log levels) will be converted double. To avoid type conversion, **Type_int_key** property is available. \ No newline at end of file +In Lua, Fluent Bit treats number as double. It means an integer field \(e.g. IDs, log levels\) will be converted double. To avoid type conversion, **Type\_int\_key** property is available. + diff --git a/pipeline/filters/modify.md b/pipeline/filters/modify.md index f1f295988..65c44da65 100644 --- a/pipeline/filters/modify.md +++ b/pipeline/filters/modify.md @@ -6,8 +6,8 @@ The _Modify Filter_ plugin allows you to change records using rules and conditio As an example using JSON notation to, - - Rename `Key2` to `RenamedKey` - - Add a key `OtherKey` with value `Value3` if `OtherKey` does not yet exist +* Rename `Key2` to `RenamedKey` +* Add a key `OtherKey` with value `Value3` if `OtherKey` does not yet exist _Example \(input\)_ @@ -34,45 +34,43 @@ _Example \(output\)_ The plugin supports the following rules: -| Operation | Parameter 1 | Parameter 2 | Description | -|------------------|----------------|-------------------|---------------| -| Set | STRING:KEY | STRING:VALUE | Add a key/value pair with key `KEY` and value `VALUE`. If `KEY` already exists, *this field is overwritten* | -| Add | STRING:KEY | STRING:VALUE | Add a key/value pair with key `KEY` and value `VALUE` if `KEY` does not exist | -| Remove | STRING:KEY | NONE | Remove a key/value pair with key `KEY` if it exists | -| Remove\_wildcard | WILDCARD:KEY | NONE | Remove all key/value pairs with key matching wildcard `KEY` | -| Remove\_regex | REGEXP:KEY | NONE | Remove all key/value pairs with key matching regexp `KEY` | -| Rename | STRING:KEY | STRING:RENAMED\_KEY | Rename a key/value pair with key `KEY` to `RENAMED_KEY` if `KEY` exists AND `RENAMED_KEY` *does not exist* | -| Hard\_rename | STRING:KEY | STRING:RENAMED\_KEY | Rename a key/value pair with key `KEY` to `RENAMED_KEY` if `KEY` exists. 
If `RENAMED_KEY` already exists, *this field is overwritten* | -| Copy | STRING:KEY | STRING:COPIED\_KEY | Copy a key/value pair with key `KEY` to `COPIED_KEY` if `KEY` exists AND `COPIED_KEY` *does not exist* | -| Hard\_copy | STRING:KEY | STRING:COPIED\_KEY | Copy a key/value pair with key `KEY` to `COPIED_KEY` if `KEY` exists. If `COPIED_KEY` already exists, *this field is overwritten* | - - - - Rules are case insensitive, parameters are not - - Any number of rules can be set in a filter instance. - - Rules are applied in the order they appear, with each rule operating on the result of the previous rule. - +| Operation | Parameter 1 | Parameter 2 | Description | +| :--- | :--- | :--- | :--- | +| Set | STRING:KEY | STRING:VALUE | Add a key/value pair with key `KEY` and value `VALUE`. If `KEY` already exists, _this field is overwritten_ | +| Add | STRING:KEY | STRING:VALUE | Add a key/value pair with key `KEY` and value `VALUE` if `KEY` does not exist | +| Remove | STRING:KEY | NONE | Remove a key/value pair with key `KEY` if it exists | +| Remove\_wildcard | WILDCARD:KEY | NONE | Remove all key/value pairs with key matching wildcard `KEY` | +| Remove\_regex | REGEXP:KEY | NONE | Remove all key/value pairs with key matching regexp `KEY` | +| Rename | STRING:KEY | STRING:RENAMED\_KEY | Rename a key/value pair with key `KEY` to `RENAMED_KEY` if `KEY` exists AND `RENAMED_KEY` _does not exist_ | +| Hard\_rename | STRING:KEY | STRING:RENAMED\_KEY | Rename a key/value pair with key `KEY` to `RENAMED_KEY` if `KEY` exists. If `RENAMED_KEY` already exists, _this field is overwritten_ | +| Copy | STRING:KEY | STRING:COPIED\_KEY | Copy a key/value pair with key `KEY` to `COPIED_KEY` if `KEY` exists AND `COPIED_KEY` _does not exist_ | +| Hard\_copy | STRING:KEY | STRING:COPIED\_KEY | Copy a key/value pair with key `KEY` to `COPIED_KEY` if `KEY` exists. If `COPIED_KEY` already exists, _this field is overwritten_ | + +* Rules are case insensitive, parameters are not +* Any number of rules can be set in a filter instance. +* Rules are applied in the order they appear, with each rule operating on the result of the previous rule. 
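Because rules are applied in order, a minimal configuration sketch can make that ordering concrete. The key names below (`Key1`, `NewKey`, `FinalKey`) are illustrative only and not taken from the examples in this document:

```python
[FILTER]
    Name         modify
    Match        *
    # Applied first: Key1 is renamed only if NewKey does not exist yet
    Rename       Key1 NewKey
    # Applied second, on the result of the rule above, so it already sees NewKey
    Hard_rename  NewKey FinalKey
```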
### Conditions The plugin supports the following conditions: -| Condition | Parameter | Parameter 2 | Description | -|-------------------------------------------------|------------|--------------|---------------| -| Key\_exists | STRING:KEY | NONE | Is `true` if `KEY` exists | -| Key\_does\_not\_exist | STRING:KEY | STRING:VALUE | Is `true` if `KEY` does not exist | -| A\_key\_matches | REGEXP:KEY | NONE | Is `true` if a key matches regex `KEY` | -| No\_key\_matches | REGEXP:KEY | NONE | Is `true` if no key matches regex `KEY` | -| Key\_value\_equals | STRING:KEY | STRING:VALUE | Is `true` if `KEY` exists and its value is `VALUE` | -| Key\_value\_does\_not\_equal | STRING:KEY | STRING:VALUE | Is `true` if `KEY` exists and its value is not `VALUE` | -| Key\_value\_matches | STRING:KEY | REGEXP:VALUE | Is `true` if key `KEY` exists and its value matches `VALUE` | -| Key\_value\_does\_not\_match | STRING:KEY | REGEXP:VALUE | Is `true` if key `KEY` exists and its value does not match `VALUE` | -| Matching\_keys\_have\_matching\_values | REGEXP:KEY | REGEXP:VALUE | Is `true` if all keys matching `KEY` have values that match `VALUE` | +| Condition | Parameter | Parameter 2 | Description | +| :--- | :--- | :--- | :--- | +| Key\_exists | STRING:KEY | NONE | Is `true` if `KEY` exists | +| Key\_does\_not\_exist | STRING:KEY | STRING:VALUE | Is `true` if `KEY` does not exist | +| A\_key\_matches | REGEXP:KEY | NONE | Is `true` if a key matches regex `KEY` | +| No\_key\_matches | REGEXP:KEY | NONE | Is `true` if no key matches regex `KEY` | +| Key\_value\_equals | STRING:KEY | STRING:VALUE | Is `true` if `KEY` exists and its value is `VALUE` | +| Key\_value\_does\_not\_equal | STRING:KEY | STRING:VALUE | Is `true` if `KEY` exists and its value is not `VALUE` | +| Key\_value\_matches | STRING:KEY | REGEXP:VALUE | Is `true` if key `KEY` exists and its value matches `VALUE` | +| Key\_value\_does\_not\_match | STRING:KEY | REGEXP:VALUE | Is `true` if key `KEY` exists and its value does not match `VALUE` | +| Matching\_keys\_have\_matching\_values | REGEXP:KEY | REGEXP:VALUE | Is `true` if all keys matching `KEY` have values that match `VALUE` | | Matching\_keys\_do\_not\_have\_matching\_values | REGEXP:KEY | REGEXP:VALUE | Is `true` if all keys matching `KEY` have values that do not match `VALUE` | - - Conditions are case insensitive, parameters are not - - Any number of conditions can be set. - - Conditions apply to the whole filter instance and all its rules. *Not* to individual rules. - - All conditions have to be `true` for the rules to be applied. +* Conditions are case insensitive, parameters are not +* Any number of conditions can be set. +* Conditions apply to the whole filter instance and all its rules. _Not_ to individual rules. +* All conditions have to be `true` for the rules to be applied. 
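Since conditions gate the whole filter instance rather than individual rules, a small sketch may help; the key and value names used here (`hostname`, `level`, `error`, `alert`) are assumptions for illustration:

```python
[FILTER]
    Name       modify
    Match      *
    # Both conditions must be true before any rule below is applied
    Condition  Key_exists hostname
    Condition  Key_value_equals level error
    # Only applied when every condition above holds
    Add        alert true
```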
## Example \#1 - Add and Rename diff --git a/pipeline/filters/nest.md b/pipeline/filters/nest.md index b20006042..5df41ee3d 100644 --- a/pipeline/filters/nest.md +++ b/pipeline/filters/nest.md @@ -61,14 +61,14 @@ _Example \(output\)_ The plugin supports the following configuration parameters: -| Key | Value Format | Operation | Description | -|----------------|-----------------------|-------------|-------------------| -| Operation | ENUM [`nest` or `lift`] |   | Select the operation `nest` or `lift` | -| Wildcard | FIELD WILDCARD | `nest` | Nest records which field matches the wildcard | -| Nest\_under | FIELD STRING | `nest` | Nest records matching the `Wildcard` under this key | -| Nested\_under | FIELD STRING | `lift` | Lift records nested under the `Nested_under` key | -| Add\_prefix | FIELD STRING | ANY | Prefix affected keys with this string | -| Remove\_prefix | FIELD STRING | ANY | Remove prefix from affected keys if it matches this string | +| Key | Value Format | Operation | Description | +| :--- | :--- | :--- | :--- | +| Operation | ENUM \[`nest` or `lift`\] | | Select the operation `nest` or `lift` | +| Wildcard | FIELD WILDCARD | `nest` | Nest records which field matches the wildcard | +| Nest\_under | FIELD STRING | `nest` | Nest records matching the `Wildcard` under this key | +| Nested\_under | FIELD STRING | `lift` | Lift records nested under the `Nested_under` key | +| Add\_prefix | FIELD STRING | ANY | Prefix affected keys with this string | +| Remove\_prefix | FIELD STRING | ANY | Remove prefix from affected keys if it matches this string | ## Getting Started @@ -86,7 +86,7 @@ In order to start filtering records, you can run the filter from the command lin The following command will load the _mem_ plugin. Then the _nest_ filter will match the wildcard rule to the keys and nest the keys matching `Mem.*` under the new key `NEST`. -``` +```text $ bin/fluent-bit -i mem -p 'tag=mem.local' -F nest -p 'Operation=nest' -p 'Wildcard=Mem.*' -p 'Nest_under=Memstats' -p 'Remove_prefix=Mem.' -m '*' -o stdout ``` @@ -119,11 +119,12 @@ The output of both the command line and configuration invocations should be iden [0] mem.local: [1522978514.007359767, {"Swap.total"=>1046524, "Swap.used"=>0, "Swap.free"=>1046524, "Memstats"=>{"total"=>4050908, "used"=>714984, "free"=>3335924}}] ``` -## Example #1 - nest and lift undo +## Example \#1 - nest and lift undo This example nests all `Mem.*` and `Swap,*` items under the `Stats` key and then reverses these actions with a `lift` operation. The output appears unchanged. ### Configuration File + ```python [INPUT] Name mem @@ -152,7 +153,7 @@ This example nests all `Mem.*` and `Swap,*` items under the `Stats` key and then ### Result -``` +```text [2018/06/21 17:42:37] [ info] [engine] started (pid=17285) [0] mem.local: [1529566958.000940636, {"Mem.total"=>8053656, "Mem.used"=>6940380, "Mem.free"=>1113276, "Swap.total"=>16532988, "Swap.used"=>1286772, "Swap.free"=>15246216}] ``` diff --git a/pipeline/filters/parser.md b/pipeline/filters/parser.md index 131318f58..9a2e6d822 100644 --- a/pipeline/filters/parser.md +++ b/pipeline/filters/parser.md @@ -9,7 +9,7 @@ The plugin supports the following configuration parameters: | Key | Description | Default | | :--- | :--- | :--- | | Key\_Name | Specify field name in record to parse. | | -| Parser | Specify the parser name to interpret the field. Multiple _Parser_ entries are allowed (one per line). | | +| Parser | Specify the parser name to interpret the field. 
Multiple _Parser_ entries are allowed \(one per line\). | | | Preserve\_Key | Keep original `Key_Name` field in the parsed result. If false, the field will be removed. | False | | Reserve\_Data | Keep all other original fields in the parsed result. If false, all other original fields will be removed. | False | | Unescape\_Key | If the key is a escaped string \(e.g: stringify JSON\), unescape the string before to apply the parser. | False | @@ -111,8 +111,7 @@ Copyright (C) Treasure Data [3] dummy.data: [1499347996.001320284, {"INT"=>"100", "FLOAT"=>"0.5", "BOOL"=>"true", "STRING"=>"This is example"}, "key1":"value1", "key2":"value2"] ``` -If you enable `Reserved_Data` and `Preserve_Key`, the original key field will be -preserved as well: +If you enable `Reserved_Data` and `Preserve_Key`, the original key field will be preserved as well: ```python [PARSER] diff --git a/pipeline/filters/rewrite-tag.md b/pipeline/filters/rewrite-tag.md index 3cdfb8c25..e32954ec5 100644 --- a/pipeline/filters/rewrite-tag.md +++ b/pipeline/filters/rewrite-tag.md @@ -4,9 +4,9 @@ description: Powerful and flexible routing # Rewrite Tag -Tags are what makes [routing]( /concepts/data-pipeline/router.md) possible. Tags are set in the configuration of the Input definitions where the records are generated, but there are certain scenarios where might be useful to modify the Tag in the pipeline so we can perform more advanced and flexible routing. +Tags are what makes [routing](../../concepts/data-pipeline/router.md) possible. Tags are set in the configuration of the Input definitions where the records are generated, but there are certain scenarios where might be useful to modify the Tag in the pipeline so we can perform more advanced and flexible routing. -The ```rewrite_tag``` filter, allows to re-emit a record under a new Tag. Once a record has been re-emitted, the original record can be preserved or discarded. +The `rewrite_tag` filter, allows to re-emit a record under a new Tag. Once a record has been re-emitted, the original record can be preserved or discarded. ## How it Works @@ -14,34 +14,34 @@ The way it works is defining rules that matches specific record key content agai The new Tag to define can be composed by: -- Alphabet characters & Numbers -- Original Tag string or part of it -- Regular Expressions groups capture -- Any key or sub-key of the processed record -- Environment variables +* Alphabet characters & Numbers +* Original Tag string or part of it +* Regular Expressions groups capture +* Any key or sub-key of the processed record +* Environment variables ## Configuration Parameters -The ```rewrite_tag``` filter supports the following configuration parameters: +The `rewrite_tag` filter supports the following configuration parameters: | Key | Description | | :--- | :--- | -| Rule | Defines the matching criteria and the format of the Tag for the matching record. The Rule format have four components: ```KEY REGEX NEW_TAG KEEP```. For more specific details of the Rule format and it composition read the next section. | -| Emitter_Name | When the filter emits a record under the new Tag, there is an internal emitter plugin that takes care of the job. Since this emitter expose metrics as any other component of the pipeline, you can use this property to configure an optional name for it. | +| Rule | Defines the matching criteria and the format of the Tag for the matching record. The Rule format have four components: `KEY REGEX NEW_TAG KEEP`. 
For more specific details of the Rule format and its composition, read the next section. |
| Emitter\_Name | When the filter emits a record under the new Tag, there is an internal emitter plugin that takes care of the job. Since this emitter exposes metrics as any other component of the pipeline, you can use this property to configure an optional name for it. |

## Rules

A rule aims to define matching criteria and specify how to create a new Tag for a record. You can define one or multiple rules in the same configuration section. The rules have the following format:

-```
+```text
$KEY REGEX NEW_TAG KEEP
```

### Key

-The key represents the name of the *record key* that holds the *value* that we want to use to match our regular expression. A key name is specified and prefixed with a ```$```. Consider the following structured record (formatted for readability):
+The key represents the name of the _record key_ that holds the _value_ that we want to use to match our regular expression. A key name is specified with a `$` prefix. Consider the following structured record \(formatted for readability\):

-```json
+```javascript
 {
   "name": "abc-123",
   "ss": {
@@ -52,10 +52,10 @@ The key represents the name of the *record key* that holds the *value* that we w
 }
 ```

-If we wanted to match against the value of the key ```name``` we must use ```$name```. The key selector is flexible enough to allow to match nested levels of sub-maps from the structure. If we wanted to check the value of the nested key ```s2``` we can do it specifying ```$ss['s1']['s2']```, for short:
+If we wanted to match against the value of the key `name` we must use `$name`. The key selector is flexible enough to allow matching nested levels of sub-maps from the structure. If we wanted to check the value of the nested key `s2` we can do it by specifying `$ss['s1']['s2']`, in short:

-- ```$name``` = "abc-123"
-- ```$ss['s1']['s2']``` = "flb"
+* `$name` = "abc-123"
+* `$ss['s1']['s2']` = "flb"

Note that a key must point to a value that contains a string; it's **not valid** for numbers, booleans, maps or arrays.

@@ -63,43 +63,43 @@ Note that a key must point a value that contains a string, it's **not valid** fo

Using a simple regular expression we can specify a matching pattern to use against the value of the key specified above; we can also take advantage of group capturing to create custom placeholder values.

-If we wanted to match any record that it ```$name``` contains a value of the format ```string-number``` like the example provided above, we might use:
+If we wanted to match any record whose `$name` contains a value of the format `string-number`, like the example provided above, we might use:

-```regex
+```text
^([a-z]+)-([0-9]+)$
```

-Note that in our example we are using parentheses, this teams that we are specifying groups of data. If the pattern matches the value a placeholder will be created that can be consumed by the NEW_TAG section.
+Note that in our example we are using parentheses; this means that we are specifying groups of data. If the pattern matches the value, a placeholder will be created that can be consumed by the NEW\_TAG section.

-If ```$name``` equals ```abc-123``` , then the following **placeholders** will be created:
+If `$name` equals `abc-123`, then the following **placeholders** will be created:

-- ```$0``` = "abc-123"
-- ```$1``` = "abc"
-- ```$2``` = "123"
+* `$0` = "abc-123"
+* `$1` = "abc"
+* `$2` = "123"

-If the Regular expression do not matches an incoming record, the rule will be skipped and the next rule (if any) will be processed.
+If the regular expression does not match an incoming record, the rule will be skipped and the next rule \(if any\) will be processed.

### New Tag

-If a regular expression has matched the value of the defined key in the rule, we are ready to compose a new Tag for that specific record. The tag is a concatenated string that can contain any of the following characters: ```a-z```,```A-Z```, ```0-9``` and ```.-, ```.
+If a regular expression has matched the value of the defined key in the rule, we are ready to compose a new Tag for that specific record. The tag is a concatenated string that can contain any of the following characters: `a-z`, `A-Z`, `0-9` and `.-,`.

-A Tag can take any string value from the matching record, the original tag it self, environment variable or general placeholder.
+A Tag can take any string value from the matching record, the original tag itself, an environment variable or a general placeholder.

Consider the following incoming data for the rule:

-- Tag = aa.bb.cc
-- Record = ```{"name": "abc-123", "ss": {"s1": {"s2": "flb"}}}```
-- Environment variable $HOSTNAME = fluent
+* Tag = aa.bb.cc
+* Record = `{"name": "abc-123", "ss": {"s1": {"s2": "flb"}}}`
+* Environment variable $HOSTNAME = fluent

With that information we could create a very customized Tag for our record, like the following:

-```
+```text
newtag.$TAG.$TAG[1].$1.$ss['s1']['s2'].out.${HOSTNAME}
```

the expected Tag to be generated will be:

-```
+```text
newtag.aa.bb.cc.bb.abc.flb.out.fluent
```

@@ -107,13 +107,13 @@ We make use of placeholders, record content and environment variables.

### Keep

-If a rule matches the criteria the filter will emit a copy of the record with the new defined Tag. The property keep takes a boolean value to define if the original record with the old Tag must be preserved and continue in the pipeline or just be discarded.
+If a rule matches the criteria, the filter will emit a copy of the record with the newly defined Tag. The property keep takes a boolean value that defines whether the original record with the old Tag must be preserved and continue in the pipeline, or just be discarded.

-You can use ```true``` or ```false``` to decide the expected behavior. There is no default value and this is a mandatory field in the rule.
+You can use `true` or `false` to decide the expected behavior. There is no default value and this is a mandatory field in the rule.
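Putting the four components together before the full configuration example below, a single rule using the `$name` record from the sections above might look like the following sketch (the `Match` value is arbitrary here):

```python
[FILTER]
    Name   rewrite_tag
    Match  my_logs
    # KEY    REGEX                 NEW_TAG             KEEP
    Rule   $name ^([a-z]+)-([0-9]+)$ newtag.$TAG.$1.$2 false
```

With the record shown earlier and an original tag of `my_logs`, this would re-emit the record under `newtag.my_logs.abc.123` and discard the original.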
## Configuration Example -The following configuration example will emit a dummy (hand-crafted) record, the filter will rewrite the tag, discard the old record and print the new record to the standard output interface: +The following configuration example will emit a dummy \(hand-crafted\) record, the filter will rewrite the tag, discard the old record and print the new record to the standard output interface: ```python [SERVICE] @@ -136,7 +136,7 @@ The following configuration example will emit a dummy (hand-crafted) record, the Match from.* ``` -The original tag ```test_tag``` will be rewritten as ```from.test_tag.new.fluent.bit.out```: +The original tag `test_tag` will be rewritten as `from.test_tag.new.fluent.bit.out`: ```bash $ bin/fluent-bit -c example.conf @@ -145,14 +145,13 @@ Copyright (C) Treasure Data ... [0] from.test_tag.new.fluent.bit.out: [1580436933.000050569, {"tool"=>"fluent", "sub"=>{"s1"=>{"s2"=>"bit"}}}] - ``` ## Monitoring -As described in the [Monitoring](/administration/monitoring) section, every component of the pipeline of Fluent Bit exposes metrics. The basic metrics exposed by this filter are ```drop_records``` and ```add_records```, they summarize the total of dropped records from the incoming data chunk or the new records added. +As described in the [Monitoring](https://github.com/fluent/fluent-bit-docs/tree/d0cf0e2ce3020b2d5d7fcd2781b2e2510b8e4e9e/administration/monitoring/README.md) section, every component of the pipeline of Fluent Bit exposes metrics. The basic metrics exposed by this filter are `drop_records` and `add_records`, they summarize the total of dropped records from the incoming data chunk or the new records added. -Since ```rewrite_tag``` emit new records that goes through the beginning of the pipeline, it exposes an additional metric called ```emit_records``` that summarize the total number of emitted records. +Since `rewrite_tag` emit new records that goes through the beginning of the pipeline, it exposes an additional metric called `emit_records` that summarize the total number of emitted records. ### Understanding the Metrics @@ -166,7 +165,7 @@ $ curl http://127.0.0.1:2020/api/v1/metrics/ | jq Metrics output: -```json +```javascript { "input": { "dummy.0": { @@ -197,13 +196,13 @@ Metrics output: } ``` -The *dummy* input generated two records, the filter dropped two from the chunks and emitted two new ones under a different Tag. +The _dummy_ input generated two records, the filter dropped two from the chunks and emitted two new ones under a different Tag. -The records generated are handled by the internal Emitter, so the new records are summarized in the Emitter metrics, take a look at the entry called ```emitter_for_rewrite_tag.0```. +The records generated are handled by the internal Emitter, so the new records are summarized in the Emitter metrics, take a look at the entry called `emitter_for_rewrite_tag.0`. ### What is the Emitter ? -The Emitter is an internal Fluent Bit plugin that allows other components of the pipeline to emit custom records. On this case ```rewrite_tag``` creates an Emitter instance to use it exclusively to emit records, on that way we can have a granular control of *who* is emitting what. +The Emitter is an internal Fluent Bit plugin that allows other components of the pipeline to emit custom records. On this case `rewrite_tag` creates an Emitter instance to use it exclusively to emit records, on that way we can have a granular control of _who_ is emitting what. 
-The Emitter name in the metrics can be changed setting up the ```Emitter_Name``` configuration property described above. +The Emitter name in the metrics can be changed setting up the `Emitter_Name` configuration property described above. diff --git a/pipeline/inputs/collectd.md b/pipeline/inputs/collectd.md index 125d3b11c..4b3c2ac3a 100644 --- a/pipeline/inputs/collectd.md +++ b/pipeline/inputs/collectd.md @@ -7,17 +7,17 @@ Content: * [Configuration Parameters](collectd.md#config) * [Configuration Examples](collectd.md#config_example) -## Configuration Parameters {#config} +## Configuration Parameters The plugin supports the following configuration parameters: -| Key | Description | Default | -| :------- | :------------------------------ | :--------------------------- | -| Listen | Set the address to listen to | 0.0.0.0 | -| Port | Set the port to listen to | 25826 | -| TypesDB | Set the data specification file | /usr/share/collectd/types.db | +| Key | Description | Default | +| :--- | :--- | :--- | +| Listen | Set the address to listen to | 0.0.0.0 | +| Port | Set the port to listen to | 25826 | +| TypesDB | Set the data specification file | /usr/share/collectd/types.db | -## Configuration Examples {#config_example} +## Configuration Examples Here is a basic configuration example. @@ -33,8 +33,7 @@ Here is a basic configuration example. Match * ``` -With this configuration, fluent-bit listens to `0.0.0.0:25826`, and -outputs incoming datagram packets to stdout. +With this configuration, fluent-bit listens to `0.0.0.0:25826`, and outputs incoming datagram packets to stdout. + +You must set the same types.db files that your collectd server uses. Otherwise, fluent-bit may not be able to interpret the payload properly. -You must set the same types.db files that your collectd server uses. -Otherwise, fluent-bit may not be able to interpret the payload properly. diff --git a/pipeline/inputs/cpu-metrics.md b/pipeline/inputs/cpu-metrics.md index bcba369a5..d574e2f9b 100644 --- a/pipeline/inputs/cpu-metrics.md +++ b/pipeline/inputs/cpu-metrics.md @@ -1,6 +1,6 @@ -# CPU Usage +# CPU Metrics -The **cpu** input plugin, measures the CPU usage of a process or the whole system by default (considering per CPU core). It reports values in percentage unit for every interval of time set. At the moment this plugin is only available for Linux. +The **cpu** input plugin, measures the CPU usage of a process or the whole system by default \(considering per CPU core\). It reports values in percentage unit for every interval of time set. At the moment this plugin is only available for Linux. The following tables describes the information generated by the plugin. The keys below represent the data used by the overall system, all values associated to the keys are in a percentage unit \(0 to 100%\): @@ -23,10 +23,10 @@ In addition to the keys reported in the above table, a similar content is create The plugin supports the following configuration parameters: | Key | Description | Default | -| :--- | :--- | ---- | +| :--- | :--- | :--- | | Interval\_Sec | Polling interval in seconds | 1 | | Interval\_NSec | Polling interval in nanoseconds | 0 | -| PID | Specify the ID (PID) of a running process in the system. By default the plugin monitors the whole system but if this option is set, it will only monitor the given process ID. | | +| PID | Specify the ID \(PID\) of a running process in the system. By default the plugin monitors the whole system but if this option is set, it will only monitor the given process ID. 
| | ## Getting Started diff --git a/pipeline/inputs/disk-io-metrics.md b/pipeline/inputs/disk-io-metrics.md index bb9077f7a..d0a1da3e5 100644 --- a/pipeline/inputs/disk-io-metrics.md +++ b/pipeline/inputs/disk-io-metrics.md @@ -1,4 +1,4 @@ -# Disk Throughput +# Disk I/O Metrics The **disk** input plugin, gathers the information about the disk throughput of the running system every certain interval of time and reports them. diff --git a/pipeline/inputs/exec.md b/pipeline/inputs/exec.md index 6de3c401b..a9260342c 100644 --- a/pipeline/inputs/exec.md +++ b/pipeline/inputs/exec.md @@ -12,7 +12,7 @@ The plugin supports the following configuration parameters: | Parser | Specify the name of a parser to interpret the entry as a structured message. | | Interval\_Sec | Polling interval \(seconds\). | | Interval\_NSec | Polling interval \(nanosecond\). | -| Buf\_Size | Size of the buffer (check [unit sizes](https://docs.fluentbit.io/manual/configuration/unit_sizes) for allowed values) | +| Buf\_Size | Size of the buffer \(check [unit sizes](https://docs.fluentbit.io/manual/configuration/unit_sizes) for allowed values\) | ## Getting Started diff --git a/pipeline/inputs/kernel-logs.md b/pipeline/inputs/kernel-logs.md index 66ae15762..581900a02 100644 --- a/pipeline/inputs/kernel-logs.md +++ b/pipeline/inputs/kernel-logs.md @@ -1,4 +1,4 @@ -# Kernel Log Buffer +# Kernel Logs The **kmsg** input plugin reads the Linux Kernel log buffer since the beginning, it gets every record and parse it field as priority, sequence, seconds, useconds, and message. diff --git a/pipeline/inputs/memory-metrics.md b/pipeline/inputs/memory-metrics.md index 3a9622b9c..c7a67f7dc 100644 --- a/pipeline/inputs/memory-metrics.md +++ b/pipeline/inputs/memory-metrics.md @@ -1,4 +1,4 @@ -# Memory Usage +# Memory Metrics The **mem** input plugin, gathers the information about the memory and swap usage of the running system every certain interval of time and reports the total amount of memory and the amount of free available. diff --git a/pipeline/inputs/network-io-metrics.md b/pipeline/inputs/network-io-metrics.md index bc87c2853..0a0fd094e 100644 --- a/pipeline/inputs/network-io-metrics.md +++ b/pipeline/inputs/network-io-metrics.md @@ -1,4 +1,4 @@ -# Network Throughput +# Network I/O Metrics The **netif** input plugin gathers network traffic information of the running system every certain interval of time, and reports them. diff --git a/pipeline/inputs/syslog.md b/pipeline/inputs/syslog.md index 7a9eb0f9d..a79cfce8d 100644 --- a/pipeline/inputs/syslog.md +++ b/pipeline/inputs/syslog.md @@ -8,20 +8,19 @@ The plugin supports the following configuration parameters: | Key | Description | Default | | :--- | :--- | :--- | -| Mode | Defines transport protocol mode: unix\_udp \(UDP over Unix socket\), unix\_tcp \(TCP over Unix socket\), tcp or udp| unix\_udp | +| Mode | Defines transport protocol mode: unix\_udp \(UDP over Unix socket\), unix\_tcp \(TCP over Unix socket\), tcp or udp | unix\_udp | | Listen | If _Mode_ is set to _tcp_, specify the network interface to bind. | 0.0.0.0 | | Port | If _Mode_ is set to _tcp_, specify the TCP port to listen for incoming connections. | 5140 | | Path | If _Mode_ is set to _unix\_tcp_ or _unix\_udp_, set the absolute path to the Unix socket file. | | -| Unix_Perm | If _Mode_ is set to _unix\_tcp_ or _unix\_udp_, set the permission of the Unix socket file. | 0644 | +| Unix\_Perm | If _Mode_ is set to _unix\_tcp_ or _unix\_udp_, set the permission of the Unix socket file. 
| 0644 | | Parser | Specify an alternative parser for the message. By default, the plugin uses the parser _syslog-rfc3164_. If your syslog messages have fractional seconds set this Parser value to _syslog-rfc5424_ instead. | | | Buffer\_Chunk\_Size | By default the buffer to store the incoming Syslog messages, do not allocate the maximum memory allowed, instead it allocate memory when is required. The rounds of allocations are set by _Buffer\_Chunk\_Size_. If not set, _Buffer\_Chunk\_Size_ is equal to 32000 bytes \(32KB\). Read considerations below when using _udp_ or _unix\_udp_ mode. | | -| Buffer\_Max_Size | Specify the maximum buffer size to receive a Syslog message. If not set, the default size will be the value of _Buffer\_Chunk\_Size_. | | +| Buffer\_Max\_Size | Specify the maximum buffer size to receive a Syslog message. If not set, the default size will be the value of _Buffer\_Chunk\_Size_. | | ### Considerations -- When using Syslog input plugin, Fluent Bit requires access to the _parsers.conf_ file, the path to this file can be specified with the option _-R_ or through the _Parsers\_File_ key on the \[SERVER\] section \(more details below\). - -- When _udp_ or _unix\_udp_ is used, the buffer size to receive messages is configurable __only__ through the _Buffer\_Chunk\_Size_ option which defaults to 32kb. +* When using Syslog input plugin, Fluent Bit requires access to the _parsers.conf_ file, the path to this file can be specified with the option _-R_ or through the _Parsers\_File_ key on the \[SERVER\] section \(more details below\). +* When _udp_ or _unix\_udp_ is used, the buffer size to receive messages is configurable **only** through the _Buffer\_Chunk\_Size_ option which defaults to 32kb. ## Getting Started @@ -81,7 +80,7 @@ Copyright (C) Treasure Data The following content aims to provide configuration examples for different use cases to integrate Fluent Bit and make it listen for Syslog messages from your systems. -### Rsyslog to Fluent Bit: Network mode over TCP {#rsyslog_to_fluentbit_network} +### Rsyslog to Fluent Bit: Network mode over TCP #### Fluent Bit Configuration @@ -155,4 +154,5 @@ $OMUxSockSocket /tmp/fluent-bit.sock *.* :omuxsock: ``` -Make sure that the socket file is readable by rsyslog (tweak the `Unix_Perm` option shown above). +Make sure that the socket file is readable by rsyslog \(tweak the `Unix_Perm` option shown above\). + diff --git a/pipeline/inputs/systemd.md b/pipeline/inputs/systemd.md index 5a382dde5..d5e7836f0 100644 --- a/pipeline/inputs/systemd.md +++ b/pipeline/inputs/systemd.md @@ -9,14 +9,14 @@ The plugin supports the following configuration parameters: | Key | Description | Default | | :--- | :--- | :--- | | Path | Optional path to the Systemd journal directory, if not set, the plugin will use default paths to read local-only logs. | | -| Max\_Fields | Set a maximum number of fields (keys) allowed per record. | 8000 | +| Max\_Fields | Set a maximum number of fields \(keys\) allowed per record. | 8000 | | Max\_Entries | When Fluent Bit starts, the Journal might have a high number of logs in the queue. In order to avoid delays and reduce memory usage, this option allows to specify the maximum number of log entries that can be processed per round. Once the limit is reached, Fluent Bit will continue processing the remaining log entries once Journald performs the notification. | 5000 | | Systemd\_Filter | Allows to perform a query over logs that contains a specific Journald key/value pairs, e.g: \_SYSTEMD\_UNIT=UNIT. 
The Systemd\_Filter option can be specified multiple times in the input section to apply multiple filters as required. | | -| Systemd\_Filter\_Type | Define the filter type when *Systemd_Filter* is specified multiple times. Allowed values are _And_ and _Or_. With _And_ a record is matched only when all of the *Systemd_Filter* have a match. With _Or_ a record is matched when any of the *Systemd_Filter* has a match. | Or | +| Systemd\_Filter\_Type | Define the filter type when _Systemd\_Filter_ is specified multiple times. Allowed values are _And_ and _Or_. With _And_ a record is matched only when all of the _Systemd\_Filter_ have a match. With _Or_ a record is matched when any of the _Systemd\_Filter_ has a match. | Or | | Tag | The tag is used to route messages but on Systemd plugin there is an extra functionality: if the tag includes a star/wildcard, it will be expanded with the Systemd Unit file \(e.g: host.\* => host.UNIT\_NAME\). | | | DB | Specify the absolute path of a database file to keep track of Journald cursor. | | | Read\_From\_Tail | Start reading new entries. Skip entries already stored in Journald. | Off | -| Strip\_Underscores | Remove the leading underscore of the Journald field (key). For example the Journald field *_PID* becomes the key *PID*. | Off | +| Strip\_Underscores | Remove the leading underscore of the Journald field \(key\). For example the Journald field _\_PID_ becomes the key _PID_. | Off | ## Getting Started @@ -53,3 +53,4 @@ In your main configuration file append the following _Input_ & _Output_ sections Name stdout Match * ``` + diff --git a/pipeline/inputs/tail.md b/pipeline/inputs/tail.md index f1f142903..b6c99d8aa 100644 --- a/pipeline/inputs/tail.md +++ b/pipeline/inputs/tail.md @@ -12,14 +12,14 @@ Content: * [Getting Started](tail.md#getting_started) * [Tailing Files Keeping State](tail.md#keep_state) -## Configuration Parameters {#config} +## Configuration Parameters The plugin supports the following configuration parameters: | Key | Description | Default | | :--- | :--- | :--- | -| Buffer\_Chunk\_Size | Set the initial buffer size to read files data. This value is used too to increase buffer size. The value must be according to the [Unit Size](../configuration/unit_sizes.md) specification. | 32k | -| Buffer\_Max\_Size | Set the limit of the buffer size per monitored file. When a buffer needs to be increased \(e.g: very long lines\), this value is used to restrict how much the memory buffer can grow. If reading a file exceed this limit, the file is removed from the monitored file list. The value must be according to the [Unit Size](../configuration/unit_sizes.md) specification. | Buffer\_Chunk\_Size | +| Buffer\_Chunk\_Size | Set the initial buffer size to read files data. This value is used too to increase buffer size. The value must be according to the [Unit Size](https://github.com/fluent/fluent-bit-docs/tree/00bb8cbd96cc06988ff3e51b4933e16e49206c70/pipeline/configuration/unit_sizes.md) specification. | 32k | +| Buffer\_Max\_Size | Set the limit of the buffer size per monitored file. When a buffer needs to be increased \(e.g: very long lines\), this value is used to restrict how much the memory buffer can grow. If reading a file exceed this limit, the file is removed from the monitored file list. The value must be according to the [Unit Size](https://github.com/fluent/fluent-bit-docs/tree/00bb8cbd96cc06988ff3e51b4933e16e49206c70/pipeline/configuration/unit_sizes.md) specification. 
| Buffer\_Chunk\_Size | | Path | Pattern specifying a specific log files or multiple ones through the use of common wildcards. | | | Path\_Key | If enabled, it appends the name of the monitored file as part of the record. The value assigned becomes the key in the map. | | | Exclude\_Path | Set one or multiple shell patterns separated by commas to exclude files matching a certain criteria, e.g: exclude\_path=\*.gz,\*.zip | | @@ -32,12 +32,12 @@ The plugin supports the following configuration parameters: | Mem\_Buf\_Limit | Set a limit of memory that Tail plugin can use when appending data to the Engine. If the limit is reach, it will be paused; when the data is flushed it resumes. | | | Parser | Specify the name of a parser to interpret the entry as a structured message. | | | Key | When a message is unstructured \(no parser applied\), it's appended as a string under the key name _log_. This option allows to define an alternative name for that key. | log | -| Tag | Set a tag (with regex-extract fields) that will be placed on lines read. E.g. `kube...`. Note that "tag expansion" is supported: if the tag includes an asterisk (*), that asterisk will be replaced with the absolute path of the monitored file (also see [Workflow of Tail + Kubernetes Filter](../filter/kubernetes.md)). | | -| Tag_Regex | Set a regex to exctract fields from the file. E.g. `(?[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?[^_]+)_(?.+)-` | | +| Tag | Set a tag \(with regex-extract fields\) that will be placed on lines read. E.g. `kube...`. Note that "tag expansion" is supported: if the tag includes an asterisk \(\*\), that asterisk will be replaced with the absolute path of the monitored file \(also see [Workflow of Tail + Kubernetes Filter](https://github.com/fluent/fluent-bit-docs/tree/00bb8cbd96cc06988ff3e51b4933e16e49206c70/pipeline/filter/kubernetes.md)\). | | +| Tag\_Regex | Set a regex to exctract fields from the file. E.g. `(?[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?[^_]+)_(?.+)-` | | Note that if the database parameter _db_ is **not** specified, by default the plugin will start reading each target file from the beginning. -### Multiline Configuration Parameters {#multiline} +### Multiline Configuration Parameters Additionally the following options exists to configure the handling of multi-lines files: @@ -48,7 +48,7 @@ Additionally the following options exists to configure the handling of multi-lin | Parser\_Firstline | Name of the parser that matchs the beginning of a multiline message. Note that the regular expression defined in the parser must include a group name \(named capture\) | | | Parser\_N | Optional-extra parser to interpret and structure multiline entries. This option can be used to define multiple parsers, e.g: Parser\_1 ab1, Parser\_2 ab2, Parser\_N abN. | | -### Docker Mode Configuration Parameters {#docker_mode} +### Docker Mode Configuration Parameters Docker mode exists to recombine JSON log lines split by the Docker daemon due to its line length limit. To use this feature, configure the tail plugin with the corresponding parser and then enable Docker mode: @@ -57,7 +57,7 @@ Docker mode exists to recombine JSON log lines split by the Docker daemon due to | Docker\_Mode | If enabled, the plugin will recombine split Docker log lines before passing them to any parser as configured above. This mode cannot be used at the same time as Multiline. | Off | | Docker\_Mode\_Flush | Wait period time in seconds to flush queued unfinished split lines. 
| 4 | -## Getting Started {#getting_started} +## Getting Started In order to tail text or log files, you can run the plugin from the command line or through the configuration file: @@ -83,7 +83,7 @@ In your main configuration file append the following _Input_ & _Output_ sections Match * ``` -## Tailing files keeping state {#keep_state} +## Tailing files keeping state The _tail_ input plugin a feature to save the state of the tracked files, is strongly suggested you enabled this. For this purpose the **db** property is available, e.g: @@ -121,3 +121,4 @@ By default SQLite client tool do not format the columns in a human read-way, so ## Files Rotation Files rotation are properly handled, including logrotate _copytruncate_ mode. + diff --git a/pipeline/inputs/tcp.md b/pipeline/inputs/tcp.md index ca985d1b0..c4b4bb177 100644 --- a/pipeline/inputs/tcp.md +++ b/pipeline/inputs/tcp.md @@ -9,11 +9,12 @@ The plugin supports the following configuration parameters: | Key | Description | Default | | :--- | :--- | :--- | | Listen | Listener network interface. | 0.0.0.0 | -| Port | TCP port where listening for connections |5170| -| Buffer\_Size | Specify the maximum buffer size in KB to receive a JSON message. If not set, the default size will be the value of _Chunk\_Size_. || -| Chunk\_Size | By default the buffer to store the incoming JSON messages, do not allocate the maximum memory allowed, instead it allocate memory when is required. The rounds of allocations are set by _Chunk\_Size_ in KB. If not set, _Chunk\_Size_ is equal to 32 \(32KB\). |32| -| Format | Specify the expected payload format. It support the options _json_ and _none_. When using _json_, it expects JSON maps, when is set to _none_, it will split every record using the defined _Separator_ (option below). |json| -| Separator | When the expected _Format_ is set to _none_, Fluent Bit needs a separator string to split the records. By default it uses the breakline character ```\n``` (LF or 0x10). |\n| +| Port | TCP port where listening for connections | 5170 | +| Buffer\_Size | Specify the maximum buffer size in KB to receive a JSON message. If not set, the default size will be the value of _Chunk\_Size_. | | +| Chunk\_Size | By default the buffer to store the incoming JSON messages, do not allocate the maximum memory allowed, instead it allocate memory when is required. The rounds of allocations are set by _Chunk\_Size_ in KB. If not set, _Chunk\_Size_ is equal to 32 \(32KB\). | 32 | +| Format | Specify the expected payload format. It support the options _json_ and _none_. When using _json_, it expects JSON maps, when is set to _none_, it will split every record using the defined _Separator_ \(option below\). | json | +| Separator | When the expected _Format_ is set to _none_, Fluent Bit needs a separator string to split the records. By default it uses the breakline character `\n` \(LF or 0x10\). | \n | + ## Getting Started In order to receive JSON messages over TCP, you can run the plugin from the command line or through the configuration file: @@ -79,4 +80,5 @@ Copyright (C) Treasure Data When receiving payloads in JSON format, there are high performance penalties. Parsing JSON is a very expensive task so you could expect your CPU usage increase under high load environments. -To get faster data ingestion, consider to use the option ```Format none``` to avoid JSON parsing if not needed. \ No newline at end of file +To get faster data ingestion, consider to use the option `Format none` to avoid JSON parsing if not needed. 
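As a rough sketch of that suggestion (the listener address, port and tag below are arbitrary values), a raw-payload listener could look like:

```python
[INPUT]
    Name    tcp
    Tag     raw.data
    Listen  0.0.0.0
    Port    5170
    # Skip JSON parsing; records are split on the default \n separator
    Format  none

[OUTPUT]
    Name    stdout
    Match   raw.data
```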
+ diff --git a/pipeline/inputs/thermal.md b/pipeline/inputs/thermal.md index ee1c7586e..db1463810 100644 --- a/pipeline/inputs/thermal.md +++ b/pipeline/inputs/thermal.md @@ -1,7 +1,6 @@ -# Thermal Information +# Thermal -The **thermal** input plugin reports system temperatures periodically -- each second by default. -Currently this plugin is only available for Linux. +The **thermal** input plugin reports system temperatures periodically -- each second by default. Currently this plugin is only available for Linux. The following tables describes the information generated by the plugin. @@ -17,14 +16,14 @@ The plugin supports the following configuration parameters: | Key | Description | | :--- | :--- | -| Interval\_Sec | Polling interval \(seconds\). default: 1 | +| Interval\_Sec | Polling interval \(seconds\). default: 1 | | Interval\_NSec | Polling interval \(nanoseconds\). default: 0 | -| name\_regex | Optional name filter regex. default: None | -| type\_regex | Optional type filter regex. default: None | +| name\_regex | Optional name filter regex. default: None | +| type\_regex | Optional type filter regex. default: None | ## Getting Started -In order to get temperature(s) of your system, you can run the plugin from the command line or through the configuration file: +In order to get temperature\(s\) of your system, you can run the plugin from the command line or through the configuration file: ### Command Line @@ -40,7 +39,7 @@ Copyright (C) Treasure Data [2] my_thermal: [1566099586.000083156, {"name"=>"thermal_zone0", "type"=>"x86_pkg_temp", "temp"=>59.000000}] ``` -Some systems provide multiple thermal zones. In this example monitor only _thermal\_zone0_ by name, once per minute. +Some systems provide multiple thermal zones. In this example monitor only _thermal\_zone0_ by name, once per minute. ```bash $ bin/fluent-bit -i thermal -t my_thermal -p "interval_sec=60" -p "name_regex=thermal_zone0" -o stdout -m '*' diff --git a/pipeline/inputs/windows-event-log.md b/pipeline/inputs/windows-event-log.md index 9bafea960..fcca07f40 100644 --- a/pipeline/inputs/windows-event-log.md +++ b/pipeline/inputs/windows-event-log.md @@ -1,25 +1,25 @@ -# Winlog +# Windows Event Log The **winlog** input plugin allows you to read Windows Event Log. Content: -* [Configuration Parameters](winlog.md#config) -* [Configuration Examples](winlog.md#config_example) +* [Configuration Parameters](https://github.com/fluent/fluent-bit-docs/tree/00bb8cbd96cc06988ff3e51b4933e16e49206c70/pipeline/inputs/winlog.md#config) +* [Configuration Examples](https://github.com/fluent/fluent-bit-docs/tree/00bb8cbd96cc06988ff3e51b4933e16e49206c70/pipeline/inputs/winlog.md#config_example) -## Configuration Parameters {#config} +## Configuration Parameters The plugin supports the following configuration parameters: -| Key | Description | Default | -| :------------------ | :------------------------------------------------------- | :------ | -| Channels | A comma-separated list of channels to read from. | | -| Interval\_Sec | Set the polling interval for each channel. (optional) | 1 | -| DB | Set the path to save the read offsets. (optional) | | +| Key | Description | Default | +| :--- | :--- | :--- | +| Channels | A comma-separated list of channels to read from. | | +| Interval\_Sec | Set the polling interval for each channel. \(optional\) | 1 | +| DB | Set the path to save the read offsets. \(optional\) | | Note that if you do not set _db_, the plugin will read channels from the beginning on each startup. 
-## Configuration Examples {#config_example} +## Configuration Examples ### Configuration File @@ -37,7 +37,7 @@ Here is a minimum configuration example. Match * ``` -Note that some Windows Event Log channels (like `Security`) requires an admin privilege for reading. In this case, you need to run fluent-bit as an administrator. +Note that some Windows Event Log channels \(like `Security`\) requires an admin privilege for reading. In this case, you need to run fluent-bit as an administrator. ### Command Line @@ -46,3 +46,4 @@ If you want to do a quick test, you can run this plugin from the command line. ```bash $ fluent-bit -i winlog -p 'channels=Setup' -o stdout ``` + diff --git a/pipeline/outputs/bigquery.md b/pipeline/outputs/bigquery.md index c466ae40f..899b59940 100644 --- a/pipeline/outputs/bigquery.md +++ b/pipeline/outputs/bigquery.md @@ -2,9 +2,9 @@ BigQuery output plugin is and _experimental_ plugin that allows you to stream records into [Google Cloud BigQuery](https://cloud.google.com/bigquery/) service. The implementation does not support the following, which would be expected in a full production version: - * [Application Default Credentials](https://cloud.google.com/docs/authentication/production). - * [Data deduplication](https://cloud.google.com/bigquery/streaming-data-into-bigquery) using `insertId`. - * [Template tables](https://cloud.google.com/bigquery/streaming-data-into-bigquery) using `templateSuffix`. +* [Application Default Credentials](https://cloud.google.com/docs/authentication/production). +* [Data deduplication](https://cloud.google.com/bigquery/streaming-data-into-bigquery) using `insertId`. +* [Template tables](https://cloud.google.com/bigquery/streaming-data-into-bigquery) using `templateSuffix`. ## Google Cloud Configuration @@ -22,7 +22,7 @@ Fluent Bit does not create datasets or tables for your data, so you must create * [Creating and using datasets](https://cloud.google.com/bigquery/docs/datasets) -Within the dataset you will need to create a table for the data to reside in. You can follow the following instructions for creating your table. Pay close attention to the schema. It must match the schema of your output JSON. Unfortunately, since BigQuery does not allow dots in field names, you will need to use a filter to change the fields for many of the standard inputs (e.g, mem or cpu). +Within the dataset you will need to create a table for the data to reside in. You can follow the following instructions for creating your table. Pay close attention to the schema. It must match the schema of your output JSON. Unfortunately, since BigQuery does not allow dots in field names, you will need to use a filter to change the fields for many of the standard inputs \(e.g, mem or cpu\). * [Creating and using tables](https://cloud.google.com/bigquery/docs/tables) @@ -37,9 +37,9 @@ Fluent Bit BigQuery output plugin uses a JSON credentials file for authenticatio | Key | Description | default | | :--- | :--- | :--- | | google\_service\_credentials | Absolute path to a Google Cloud credentials JSON file | Value of the environment variable _$GOOGLE\_SERVICE\_CREDENTIALS_ | -| project_id | The project id containing the BigQuery dataset to stream into. | The value of the `project_id` in the credentials file | -| dataset_id | The dataset id of the BigQuery dataset to write into. This dataset must exist in your project. | | -| table_id | The table id of the BigQuery table to write into. This table must exist in the specified dataset and the schema must match the output. 
| | +| project\_id | The project id containing the BigQuery dataset to stream into. | The value of the `project_id` in the credentials file | +| dataset\_id | The dataset id of the BigQuery dataset to write into. This dataset must exist in your project. | | +| table\_id | The table id of the BigQuery table to write into. This table must exist in the specified dataset and the schema must match the output. | | ## Configuration File @@ -55,4 +55,5 @@ If you are using a _Google Cloud Credentials File_, the following configuration Match * dataset_id my_dataset table_id dummy_table -``` \ No newline at end of file +``` + diff --git a/pipeline/outputs/datadog.md b/pipeline/outputs/datadog.md index cfc353a06..af0847d70 100644 --- a/pipeline/outputs/datadog.md +++ b/pipeline/outputs/datadog.md @@ -2,19 +2,19 @@ The Datadog output plugin allows to ingest your logs into [Datadog](https://app.datadoghq.com/signup). -Before you begin, you need a [Datadog account](https://app.datadoghq.com/signup), a [Datadog API key](https://docs.datadoghq.com/account_management/api-app-keys/), and you need to [activate Datadog Logs Management](https://app.datadoghq.com/logs/activation). +Before you begin, you need a [Datadog account](https://app.datadoghq.com/signup), a [Datadog API key](https://docs.datadoghq.com/account_management/api-app-keys/), and you need to [activate Datadog Logs Management](https://app.datadoghq.com/logs/activation). ## Configuration Parameters | Key | Description | Default | -|------------|---------------------------------------------------------------------------------------------|----------------------------------| -| Host | _Required_ - The Datadog server where you are sending your logs. | `http-intake.logs.datadoghq.com` | +| :--- | :--- | :--- | +| Host | _Required_ - The Datadog server where you are sending your logs. | `http-intake.logs.datadoghq.com` | | TLS | _Required_ - End-to-end security communications security protocol. Datadog recommends setting this to `on`. | `off` | | compress | _Recommended_ - compresses the payload in GZIP format, Datadog supports and recommends setting this to `gzip`. | | | apikey | _Required_ - Your [Datadog API key](https://app.datadoghq.com/account/settings#api). | | -| dd_service | _Recommended_ - The human readable name for your service generating the logs - the name of your application or database. | | -| dd_source | _Recommended_ - A human readable name for the underlying technology of your service. For example, `postgres` or `nginx`. | | -| dd_tags | _Optional_ - The [tags](https://docs.datadoghq.com/tagging/) you want to assign to your logs in Datadog. | | +| dd\_service | _Recommended_ - The human readable name for your service generating the logs - the name of your application or database. | | +| dd\_source | _Recommended_ - A human readable name for the underlying technology of your service. For example, `postgres` or `nginx`. | | +| dd\_tags | _Optional_ - The [tags](https://docs.datadoghq.com/tagging/) you want to assign to your logs in Datadog. | | ### Configuration File @@ -38,3 +38,4 @@ Get started quickly with this configuration file: ### 403 Forbidden If you get a `403 Forbidden` error response, double check that you have a valid [Datadog API key](https://docs.datadoghq.com/account_management/api-app-keys/) and that you have [activated Datadog Logs Management](https://app.datadoghq.com/logs/activation). 
+ diff --git a/pipeline/outputs/elasticsearch.md b/pipeline/outputs/elasticsearch.md index 87933a089..7d784ef2d 100644 --- a/pipeline/outputs/elasticsearch.md +++ b/pipeline/outputs/elasticsearch.md @@ -9,7 +9,7 @@ The **es** output plugin, allows to ingest your records into a [Elasticsearch](h | Host | IP address or hostname of the target Elasticsearch instance | 127.0.0.1 | | Port | TCP port of the target Elasticsearch instance | 9200 | | Path | Elasticsearch accepts new data on HTTP query path "/\_bulk". But it is also possible to serve Elasticsearch behind a reverse proxy on a subpath. This option defines such path on the fluent-bit side. It simply adds a path prefix in the indexing HTTP POST URI. | Empty string | -| Buffer\_Size | Specify the buffer size used to read the response from the Elasticsearch HTTP service. This option is useful for debugging purposes where is required to read full responses, note that response size grows depending of the number of records inserted. To set an _unlimited_ amount of memory set this value to **False**, otherwise the value must be according to the [Unit Size](../configuration/unit_sizes.md) specification. | 4KB | +| Buffer\_Size | Specify the buffer size used to read the response from the Elasticsearch HTTP service. This option is useful for debugging purposes where is required to read full responses, note that response size grows depending of the number of records inserted. To set an _unlimited_ amount of memory set this value to **False**, otherwise the value must be according to the [Unit Size](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/configuration/unit_sizes.md) specification. | 4KB | | Pipeline | Newer versions of Elasticsearch allows to setup filters called pipelines. This option allows to define which pipeline the database should use. For performance reasons is strongly suggested to do parsing and filtering on Fluent Bit side, avoid pipelines. | | | HTTP\_User | Optional username credential for Elastic X-Pack access | | | HTTP\_Passwd | Password for user defined in HTTP\_User | | @@ -24,15 +24,15 @@ The **es** output plugin, allows to ingest your records into a [Elasticsearch](h | Tag\_Key | When Include\_Tag\_Key is enabled, this property defines the key name for the tag. | \_flb-key | | Generate\_ID | When enabled, generate `_id` for outgoing records. This prevents duplicate records when retrying ES. | Off | | Replace\_Dots | When enabled, replace field name dots with underscore, required by Elasticsearch 2.0-2.3. | Off | -| Trace\_Output | When enabled print the elasticsearch API calls to stdout (for diag only) | Off | +| Trace\_Output | When enabled print the elasticsearch API calls to stdout \(for diag only\) | Off | | Current\_Time\_Index | Use current time for index generation instead of message record | Off | -| Logstash\_Prefix\_Key | When included: the value in the record that belongs to the key will be looked up and over-write the Logstash\_Prefix for index generation. If the key/value is not found in the record then the Logstash\_Prefix option will act as a fallback. Nested keys are not supported (if desired, you can use the nest filter plugin to remove nesting) | | +| Logstash\_Prefix\_Key | When included: the value in the record that belongs to the key will be looked up and over-write the Logstash\_Prefix for index generation. If the key/value is not found in the record then the Logstash\_Prefix option will act as a fallback. 
Nested keys are not supported \(if desired, you can use the nest filter plugin to remove nesting\) | |

> The parameters _index_ and _type_ can be confusing if you are new to Elastic; if you have used a common relational database before, they can be compared to the _database_ and _table_ concepts. Also see [the FAQ below](elasticsearch.md#faq-multiple-types)

### TLS / SSL

-Elasticsearch output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](../configuration/tls_ssl.md) section.
+The Elasticsearch output plugin supports TLS/SSL; for more details about the properties available and general configuration, please refer to the [TLS/SSL](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/configuration/tls_ssl.md) section.

## Getting Started

@@ -94,11 +94,11 @@ becomes

## FAQ

-### Elasticsearch rejects requests saying "the final mapping would have more than 1 type" {#faq-multiple-types}
+### Elasticsearch rejects requests saying "the final mapping would have more than 1 type"

Since Elasticsearch 6.0, you cannot create multiple types in a single index. This means that you cannot set up your configuration as below anymore.

-```
+```text
 [OUTPUT]
     Name es
     Match foo.*
@@ -114,18 +114,19 @@ Since Elasticsearch 6.0, you cannot create multiple types in a single index. Thi

If you see an error message like below, you'll need to fix your configuration to use a single type on each index.

-> Rejecting mapping update to [search] as the final mapping would have more than 1 type
+> Rejecting mapping update to \[search\] as the final mapping would have more than 1 type

For details, please read [the official blog post on that issue](https://www.elastic.co/guide/en/elasticsearch/reference/6.7/removal-of-types.html).

### Fluent Bit + AWS Elasticsearch

-AWS Elasticsearch adds an extra security layer where the HTTP requests we must be signed with AWS Signv4, as of Fluent Bit v1.3 this is not yet supported. At the end of January 2020 with the release of Fluent Bit v1.4 we are adding such feature (among integration with other AWS Services ;) )
+AWS Elasticsearch adds an extra security layer where HTTP requests must be signed with AWS SigV4; as of Fluent Bit v1.3 this is not yet supported. At the end of January 2020, with the release of Fluent Bit v1.4, we are adding such a feature \(along with integrations with other AWS services\).

As a workaround, you can use the following tool as a proxy:

-- https://github.com/abutaha/aws-es-proxy
+* [https://github.com/abutaha/aws-es-proxy](https://github.com/abutaha/aws-es-proxy)

More details about this AWS requirement can be found here:

-- https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-request-signing.html
+* [https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-request-signing.html](https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-request-signing.html)
+
diff --git a/pipeline/outputs/file.md b/pipeline/outputs/file.md
index cfb044f79..a3aa3f266 100644
--- a/pipeline/outputs/file.md
+++ b/pipeline/outputs/file.md
@@ -23,7 +23,7 @@ tag: [time, {"key1":"value1", "key2":"value2", "key3":"value3"}]

### plain format

-Output the records as JSON (without additional `tag` and `timestamp` attributes). There is no configuration parameters for plain format.
+Output the records as JSON \(without additional `tag` and `timestamp` attributes\). There are no configuration parameters for the plain format.
```javascript {"key1":"value1", "key2":"value2", "key3":"value3"} @@ -58,16 +58,15 @@ field1[label_delimiter]value1[delimiter]field2[label_delimiter]value2\n Output the records using a custom format template. -| Key | Description | -| :------- | :--------------------------------------------- | +| Key | Description | +| :--- | :--- | | Template | The format string. Default: '{time} {message}' | -This accepts a formatting template and fills placeholders using corresponding -values in a record. +This accepts a formatting template and fills placeholders using corresponding values in a record. For example, if you set up the configuration as below: -``` +```text [INPUT] Name mem @@ -79,7 +78,7 @@ For example, if you set up the configuration as below: You will get the following output: -``` +```text 1564462620.000254 used=1045448 free=31760160 total=32805608 ``` diff --git a/pipeline/outputs/forward.md b/pipeline/outputs/forward.md index 64bd47bc0..f51e6d6af 100644 --- a/pipeline/outputs/forward.md +++ b/pipeline/outputs/forward.md @@ -16,14 +16,13 @@ The following parameters are mandatory for either Forward for Secure Forward mod | Host | Target host where Fluent-Bit or Fluentd are listening for Forward messages. | 127.0.0.1 | | Port | TCP Port of the target service. | 24224 | | Time\_as\_Integer | Set timestamps in integer format, it enable compatibility mode for Fluentd v0.12 series. | False | -| Upstream | If Forward will connect to an _Upstream_ instead of a simple host, this property defines the absolute path for the Upstream configuration file, for more details about this refer to the [Upstream Servers](../configuration/upstream_servers.md) documentation section. | | -| Send_options | Always send options (with "size"=count of messages) | False | -| Require_ack_response | Send "chunk"-option and wait for "ack" response from server. Enables at-least-once and receiving server can control rate of traffic. (Requires Fluentd v0.14.0+ server) | False | +| Upstream | If Forward will connect to an _Upstream_ instead of a simple host, this property defines the absolute path for the Upstream configuration file, for more details about this refer to the [Upstream Servers](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/configuration/upstream_servers.md) documentation section. | | +| Send\_options | Always send options \(with "size"=count of messages\) | False | +| Require\_ack\_response | Send "chunk"-option and wait for "ack" response from server. Enables at-least-once and receiving server can control rate of traffic. \(Requires Fluentd v0.14.0+ server\) | False | +## Secure Forward Mode Configuration Parameters -## Secure Forward Mode Configuration Parameters - -When using Secure Forward mode, the [TLS](../configuration/tls_ssl.md) mode requires to be enabled. The following additional configuration parameters are available: +When using Secure Forward mode, the [TLS](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/configuration/tls_ssl.md) mode requires to be enabled. 
The following additional configuration parameters are available: | Key | Description | Default | | :--- | :--- | :--- | @@ -95,7 +94,7 @@ $ fluentd -c test.conf 2017-03-23 11:50:43 -0600 [info]: listening fluent socket on 0.0.0.0:24224 ``` -## Fluent Bit + Forward Setup {#forward_setup} +## Fluent Bit + Forward Setup Now that [Fluentd](http://fluentd.org) is ready to receive messages, we need to specify where the **forward** output plugin will flush the information using the following format: @@ -105,7 +104,7 @@ bin/fluent-bit -i INPUT -o forward://HOST:PORT If the **TAG** parameter is not set, the plugin will set the tag as _fluent\_bit_. Keep in mind that **TAG** is important for routing rules inside [Fluentd](http://fluentd.org). -Using the [CPU](../input/cpu.md) input plugin as an example we will flush CPU metrics to [Fluentd](http://fluentd.org): +Using the [CPU](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/input/cpu.md) input plugin as an example we will flush CPU metrics to [Fluentd](http://fluentd.org): ```bash $ bin/fluent-bit -i cpu -t fluent_bit -o forward://127.0.0.1:24224 @@ -120,13 +119,13 @@ Now on the [Fluentd](http://fluentd.org) side, you will see the CPU metrics gath 2017-03-23 11:53:09 -0600 fluent_bit: {"cpu_p":4.75,"user_p":3.5,"system_p":1.25,"cpu0.p_cpu":4.0,"cpu0.p_user":3.0,"cpu0.p_system":1.0,"cpu1.p_cpu":5.0,"cpu1.p_user":4.0,"cpu1.p_system":1.0,"cpu2.p_cpu":3.0,"cpu2.p_user":2.0,"cpu2.p_system":1.0,"cpu3.p_cpu":5.0,"cpu3.p_user":4.0,"cpu3.p_system":1.0} ``` -So we gathered [CPU](../input/cpu.md) metrics and flushed them out to [Fluentd](http://fluentd.org) properly. +So we gathered [CPU](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/input/cpu.md) metrics and flushed them out to [Fluentd](http://fluentd.org) properly. -## Fluent Bit + Secure Forward Setup {#secure_forward_setup} +## Fluent Bit + Secure Forward Setup > DISCLAIMER: the following example do not consider the generation of certificates for a proper usage of production environments. -Secure Forward aims to provide a secure channel of communication with the remote Fluentd service using [TLS](../configuration/tls_ssl.md). Above there is a minimalist configuration for testing purposes. +Secure Forward aims to provide a secure channel of communication with the remote Fluentd service using [TLS](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/configuration/tls_ssl.md). Above there is a minimalist configuration for testing purposes. ### Fluent Bit diff --git a/pipeline/outputs/gelf.md b/pipeline/outputs/gelf.md index d2f30dd34..a97608828 100644 --- a/pipeline/outputs/gelf.md +++ b/pipeline/outputs/gelf.md @@ -6,43 +6,37 @@ The following instructions assumes that you have a fully operational Graylog ser ## Configuration Parameters -According to [GELF Payload Specification](https://docs.graylog.org/en/latest/pages/gelf.html#gelf-payload-specification), there are some mandatory and optional fields which are used by Graylog in GELF format. These fields are determined with _Gelf\_*\_Key_ key in this plugin. 
- -| Key | Description | default | -|-------------|----------------------|-------------------| -| Match | Pattern to match which tags of logs to be outputted by this plugin | | -| Host | IP address or hostname of the target Graylog server | 127.0.0.1 | -| Port | The port that your Graylog GELF input is listening on | 12201 | -| Mode | The protocol to use (`tls`, `tcp` or `udp`) | udp | -| Gelf_Short_Message_Key | A short descriptive message (**MUST be set in GELF**) | short_message | -| Gelf_Timestamp_Key | Your log timestamp (_SHOULD be set in GELF_) |timestamp | -| Gelf_Host_Key | Key which its value is used as the name of the host, source or application that sent this message. (**MUST be set in GELF**) | host | -| Gelf_Full_Message_Key | Key to use as the long message that can i.e. contain a backtrace. (_Optional in GELF_) | full_message | -| Gelf_Level_Key | Key to be used as the log level. Its value must be in [standard syslog levels](https://en.wikipedia.org/wiki/Syslog#Severity_level) (between 0 and 7). (_Optional in GELF_) | level | -| Packet_Size | If transport protocol is `udp`, you can set the size of packets to be sent. | 1420 | +According to [GELF Payload Specification](https://docs.graylog.org/en/latest/pages/gelf.html#gelf-payload-specification), there are some mandatory and optional fields which are used by Graylog in GELF format. These fields are determined with _Gelf\_\*\_Key\_ key in this plugin. + +| Key | Description | default | +| :--- | :--- | :--- | +| Match | Pattern to match which tags of logs to be outputted by this plugin | | +| Host | IP address or hostname of the target Graylog server | 127.0.0.1 | +| Port | The port that your Graylog GELF input is listening on | 12201 | +| Mode | The protocol to use \(`tls`, `tcp` or `udp`\) | udp | +| Gelf\_Short\_Message\_Key | A short descriptive message \(**MUST be set in GELF**\) | short\_message | +| Gelf\_Timestamp\_Key | Your log timestamp \(_SHOULD be set in GELF_\) | timestamp | +| Gelf\_Host\_Key | Key which its value is used as the name of the host, source or application that sent this message. \(**MUST be set in GELF**\) | host | +| Gelf\_Full\_Message\_Key | Key to use as the long message that can i.e. contain a backtrace. \(_Optional in GELF_\) | full\_message | +| Gelf\_Level\_Key | Key to be used as the log level. Its value must be in [standard syslog levels](https://en.wikipedia.org/wiki/Syslog#Severity_level) \(between 0 and 7\). \(_Optional in GELF_\) | level | +| Packet\_Size | If transport protocol is `udp`, you can set the size of packets to be sent. | 1420 | | Compress | If transport protocol is `udp`, you can set this if you want your UDP packets to be compressed. | true | ### TLS / SSL -GELF output plugin supports TLS/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](../configuration/tls_ssl.md) section. +GELF output plugin supports TLS/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/configuration/tls_ssl.md) section. ## Notes * If you're using Fluent Bit to collect Docker logs, note that Docker places your log in JSON under key `log`. So you can set `log` as your `Gelf_Short_Message_Key` to send everything in Docker logs to Graylog. In this case, you need your `log` value to be a string; so don't parse it using JSON parser. 
- * The order of looking up the timestamp in this plugin is as follows: - 1. Value of `Gelf_Timestamp_Key` provided in configuration 2. Value of `timestamp` key - 3. If you're using [Docker JSON parser](../parser/json.md), this parser can parse time and use it as timestamp of message. If all above fail, Fluent Bit tries to get timestamp extracted by your parser. - 4. Timestamp does not set by Fluent Bit. In this case, your Graylog server will set it to the current timestamp (now). - + 3. If you're using [Docker JSON parser](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/parser/json.md), this parser can parse time and use it as timestamp of message. If all above fail, Fluent Bit tries to get timestamp extracted by your parser. + 4. Timestamp does not set by Fluent Bit. In this case, your Graylog server will set it to the current timestamp \(now\). * Your log timestamp has to be in [UNIX Epoch Timestamp](https://en.wikipedia.org/wiki/Unix_time) format. If the `Gelf_Timestamp_Key` value of your log is not in this format, your Graylog server will ignore it. - -* If you're using Fluent Bit in Kubernetes and you're using [Kubernetes Filter Plugin](../filter/kubernetes.md), this plugin adds `host` value to your log by default, and you don't need to add it by your own. - +* If you're using Fluent Bit in Kubernetes and you're using [Kubernetes Filter Plugin](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/filter/kubernetes.md), this plugin adds `host` value to your log by default, and you don't need to add it by your own. * The `version` of GELF message is also mandatory and Fluent Bit sets it to 1.1 which is the current latest version of GELF. - * If you use `udp` as transport protocol and set `Compress` to `true`, Fluent Bit compresses your packets in GZIP format, which is the default compression that Graylog offers. This can be used to trade more CPU load for saving network bandwidth. ## Configuration File Example @@ -92,11 +86,11 @@ If you're using Fluent Bit for shipping Kubernetes logs, you can use something l By default, GELF tcp uses port 12201 and Docker places your logs in `/var/log/containers` directory. The logs are placed in value of the `log` key. For example, this is a log saved by Docker: -```JSON +```javascript {"log":"{\"data\": \"This is an example.\"}","stream":"stderr","time":"2019-07-21T12:45:11.273315023Z"} ``` -If you use [Tail Input](input/tail.md) and use a Parser like the `docker` parser shown above, it decodes your message and extracts `data` (and any other present) field. This is how this log in [stdout](stdout.md) looks like after decoding: +If you use [Tail Input](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/outputs/input/tail.md) and use a Parser like the `docker` parser shown above, it decodes your message and extracts `data` \(and any other present\) field. This is how this log in [stdout](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/outputs/stdout.md) looks like after decoding: ```text [0] kube.log: [1565770310.000198491, {"log"=>{"data"=>"This is an example."}, "stream"=>"stderr", "time"=>"2019-07-21T12:45:11.273315023Z"}] @@ -105,19 +99,15 @@ If you use [Tail Input](input/tail.md) and use a Parser like the `docker` parser Now, this is what happens to this log: 1. Fluent Bit GELF plugin adds `"version": "1.1"` to it. - -2. 
The [Nest Filter](filter/nest.md), unnests fields inside `log` key. In our example, it puts `data` alongside `stream` and `time`. - +2. The [Nest Filter](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/outputs/filter/nest.md), unnests fields inside `log` key. In our example, it puts `data` alongside `stream` and `time`. 3. We used this `data` key as `Gelf_Short_Message_Key`; so GELF plugin changes it to `short_message`. - -4. [Kubernetes Filter](filter/kubernetes.md) adds `host` name. - +4. [Kubernetes Filter](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/outputs/filter/kubernetes.md) adds `host` name. 5. Timestamp is generated. - -6. Any custom field (not present in [GELF Payload Specification](https://docs.graylog.org/en/latest/pages/gelf.html#gelf-payload-specification)) is prefixed by an underline. +6. Any custom field \(not present in [GELF Payload Specification](https://docs.graylog.org/en/latest/pages/gelf.html#gelf-payload-specification)\) is prefixed by an underline. Finally, this is what our Graylog server input sees: -```JSON +```javascript {"version":"1.1", "short_message":"This is an example.", "host": "", "_stream":"stderr", "timestamp":1565770310.000199} ``` + diff --git a/pipeline/outputs/http.md b/pipeline/outputs/http.md index 193ce0fc4..03ff1ab31 100644 --- a/pipeline/outputs/http.md +++ b/pipeline/outputs/http.md @@ -1,31 +1,31 @@ # HTTP -The **http** output plugin allows to flush your records into a HTTP endpoint. For now the functionality is pretty basic and it issues a POST request with the data records in [MessagePack](http://msgpack.org) (or JSON) format. +The **http** output plugin allows to flush your records into a HTTP endpoint. For now the functionality is pretty basic and it issues a POST request with the data records in [MessagePack](http://msgpack.org) \(or JSON\) format. ## Configuration Parameters -| Key | Description | default | -|-------------|----------------------|-------------------| -| Host | IP address or hostname of the target HTTP Server | 127.0.0.1 | -| HTTP_User | Basic Auth Username | | -| HTTP_Passwd | Basic Auth Password. Requires HTTP_User to be set | | -| Port | TCP port of the target HTTP Server | 80 | -| Proxy | Specify an HTTP Proxy. The expected format of this value is _http://host:port_. Note that _https_ is __not__ supported yet. || -| URI | Specify an optional HTTP URI for the target web server, e.g: /something | / | -| Format | Specify the data format to be used in the HTTP request body, by default it uses _msgpack_. Other supported formats are _json_, _json_stream_ and _json_lines_ and _gelf_. | msgpack | -| header_tag | Specify an optional HTTP header field for the original message tag. | | -| Header | Add a HTTP header key/value pair. Multiple headers can be set. | | -| json_date_key | Specify the name of the date field in output | date | -| json_date_format | Specify the format of the date. 
Supported formats are _double_ and _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_)| double | -| gelf_timestamp_key | Specify the key to use for `timestamp` in _gelf_ format | | -| gelf_host_key | Specify the key to use for the `host` in _gelf_ format | | -| gelf_short_messge_key | Specify the key to use as the `short` message in _gelf_ format | | -| gelf_full_message_key | Specify the key to use for the `full` message in _gelf_ format | | -| gelf_level_key | Specify the key to use for the `level` in _gelf_ format | | +| Key | Description | default | +| :--- | :--- | :--- | +| Host | IP address or hostname of the target HTTP Server | 127.0.0.1 | +| HTTP\_User | Basic Auth Username | | +| HTTP\_Passwd | Basic Auth Password. Requires HTTP\_User to be set | | +| Port | TCP port of the target HTTP Server | 80 | +| Proxy | Specify an HTTP Proxy. The expected format of this value is [http://host:port](http://host:port). Note that _https_ is **not** supported yet. | | +| URI | Specify an optional HTTP URI for the target web server, e.g: /something | / | +| Format | Specify the data format to be used in the HTTP request body, by default it uses _msgpack_. Other supported formats are _json_, _json\_stream_ and _json\_lines_ and _gelf_. | msgpack | +| header\_tag | Specify an optional HTTP header field for the original message tag. | | +| Header | Add a HTTP header key/value pair. Multiple headers can be set. | | +| json\_date\_key | Specify the name of the date field in output | date | +| json\_date\_format | Specify the format of the date. Supported formats are _double_ and _iso8601_ \(eg: _2018-05-30T09:39:52.000681Z_\) | double | +| gelf\_timestamp\_key | Specify the key to use for `timestamp` in _gelf_ format | | +| gelf\_host\_key | Specify the key to use for the `host` in _gelf_ format | | +| gelf\_short\_messge\_key | Specify the key to use as the `short` message in _gelf_ format | | +| gelf\_full\_message\_key | Specify the key to use for the `full` message in _gelf_ format | | +| gelf\_level\_key | Specify the key to use for the `level` in _gelf_ format | | ### TLS / SSL -HTTP output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](../configuration/tls_ssl.md) section. +HTTP output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/configuration/tls_ssl.md) section. ## Getting Started @@ -81,7 +81,7 @@ To configure this behaviour, add this config: Provided you are using Fluentd as data receiver, you can combine `in_http` and `out_rewrite_tag_filter` to make use of this HTTP header. -``` +```text @type http add_http_headers true @@ -101,7 +101,7 @@ Notice how we override the tag, which is from URI path, with our custom header #### Example : Add a header -``` +```text [OUTPUT] Name http Match * @@ -114,10 +114,9 @@ Notice how we override the tag, which is from URI path, with our custom header #### Example : Sumo Logic HTTP Collector -Suggested configuration for Sumo Logic using `json_lines` with `iso8601` timestamps. -The `PrivateKey` is specific to a configured HTTP collector. +Suggested configuration for Sumo Logic using `json_lines` with `iso8601` timestamps. The `PrivateKey` is specific to a configured HTTP collector. 
-``` +```text [OUTPUT] Name http Match * @@ -129,12 +128,12 @@ The `PrivateKey` is specific to a configured HTTP collector. Json_date_format iso8601 ``` -A sample Sumo Logic query for the [CPU](../input/cpu.md) input. -(Requires `json_lines` format with `iso8601` date format for the `timestamp` field). +A sample Sumo Logic query for the [CPU](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/input/cpu.md) input. \(Requires `json_lines` format with `iso8601` date format for the `timestamp` field\). -``` +```text _sourcecategory="my_fluent_bit" | json "cpu_p" as cpu | timeslice 1m | max(cpu) as cpu group by _timeslice ``` + diff --git a/pipeline/outputs/influxdb.md b/pipeline/outputs/influxdb.md index afb57bbef..c4d542e30 100644 --- a/pipeline/outputs/influxdb.md +++ b/pipeline/outputs/influxdb.md @@ -17,7 +17,7 @@ The **influxdb** output plugin, allows to flush your records into a [InfluxDB](h ### TLS / SSL -InfluxDB output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](../configuration/tls_ssl.md) section. +InfluxDB output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/configuration/tls_ssl.md) section. ## Getting Started diff --git a/pipeline/outputs/kafka-rest-proxy.md b/pipeline/outputs/kafka-rest-proxy.md index 578a45a67..d96a8ebd7 100644 --- a/pipeline/outputs/kafka-rest-proxy.md +++ b/pipeline/outputs/kafka-rest-proxy.md @@ -18,7 +18,7 @@ The **kafka-rest** output plugin, allows to flush your records into a [Kafka RES ### TLS / SSL -Kafka REST Proxy output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](../configuration/tls_ssl.md) section. +Kafka REST Proxy output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/configuration/tls_ssl.md) section. ## Getting Started diff --git a/pipeline/outputs/kafka.md b/pipeline/outputs/kafka.md index b952121b2..7d4de4c9c 100644 --- a/pipeline/outputs/kafka.md +++ b/pipeline/outputs/kafka.md @@ -8,12 +8,12 @@ Kafka output plugin allows to ingest your records into an [Apache Kafka](https:/ | :--- | :--- | :--- | | Format | Specify data format, options available: json, msgpack. | json | | Message\_Key | Optional key to store the message | | -| Message\_Key\_Field | If set, the value of Message\_Key\_Field in the record will indicate the message key. If not set nor found in the record, Message\_Key will be used (if set). | | +| Message\_Key\_Field | If set, the value of Message\_Key\_Field in the record will indicate the message key. If not set nor found in the record, Message\_Key will be used \(if set\). | | | Timestamp\_Key | Set the key to store the record timestamp | @timestamp | | Timestamp\_Format | 'iso8601' or 'double' | double | | Brokers | Single of multiple list of Kafka Brokers, e.g: 192.168.1.3:9092, 192.168.1.4:9092. | | | Topics | Single entry or list of topics separated by comma \(,\) that Fluent Bit will use to send messages to Kafka. If only one topic is set, that one will be used for all records. 
Instead if multiple topics exists, the one set in the record by Topic\_Key will be used. | fluent-bit | -| Topic\_Key | If multiple Topics exists, the value of Topic\_Key in the record will indicate the topic to use. E.g: if Topic\_Key is _router_ and the record is {"key1": 123, "router": "route\_2"}, Fluent Bit will use topic _route\_2_. Note that if the value of Topic_Key is not present in Topics, then by default the first topic in the Topics list will indicate the topic to be used. | | +| Topic\_Key | If multiple Topics exists, the value of Topic\_Key in the record will indicate the topic to use. E.g: if Topic\_Key is _router_ and the record is {"key1": 123, "router": "route\_2"}, Fluent Bit will use topic _route\_2_. Note that if the value of Topic\_Key is not present in Topics, then by default the first topic in the Topics list will indicate the topic to be used. | | | rdkafka.{property} | `{property}` can be any [librdkafka properties](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md) | | > Setting `rdkafka.log.connection.close` to `false` and `rdkafka.request.required.acks` to 1 are examples of recommended settings of librdfkafka properties. diff --git a/pipeline/outputs/null.md b/pipeline/outputs/null.md index c0e4556ce..b6f4f26cf 100644 --- a/pipeline/outputs/null.md +++ b/pipeline/outputs/null.md @@ -1,4 +1,4 @@ -# Null +# NULL The **null** output plugin just throws away events. diff --git a/pipeline/outputs/splunk.md b/pipeline/outputs/splunk.md index 102a83174..34ff5261e 100644 --- a/pipeline/outputs/splunk.md +++ b/pipeline/outputs/splunk.md @@ -17,7 +17,7 @@ To get more details about how to setup the HEC in Splunk please refer to the fol ### TLS / SSL -Splunk output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](../configuration/tls_ssl.md) section. +Splunk output plugin supports TTL/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/configuration/tls_ssl.md) section. ## Getting Started @@ -55,7 +55,7 @@ In your main configuration file append the following _Input_ & _Output_ sections By default, the Splunk output plugin nests the record under the `event` key in the payload sent to the HEC. It will also append the time of the record to a top level `time` key. -If you would like to customize any of the Splunk event metadata, such as the host or target index, you can set `Splunk_Send_Raw On` in the plugin configuration, and add the metadata as keys/values in the record. *Note*: with `Splunk_Send_Raw` enabled, you are responsible for creating and populating the `event` section of the payload. +If you would like to customize any of the Splunk event metadata, such as the host or target index, you can set `Splunk_Send_Raw On` in the plugin configuration, and add the metadata as keys/values in the record. _Note_: with `Splunk_Send_Raw` enabled, you are responsible for creating and populating the `event` section of the payload. 
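As a reference, a minimal sketch of an output section with raw mode enabled might look like the following; the host, port and token are placeholder values, and the property names should be checked against the configuration parameters table at the top of this page (not shown in this excerpt).

```text
[OUTPUT]
    Name            splunk
    Match           *
    Host            127.0.0.1
    Port            8088
    # Placeholder HEC token
    Splunk_Token    xxxx-xxxx-xxxx-xxxx
    TLS             On
    TLS.Verify      Off
    # With raw mode on, the record itself must provide the HEC payload,
    # including the "event" section and metadata such as "index" or "host"
    Splunk_Send_Raw On
```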
For example, to add a custom index and hostname: @@ -89,7 +89,7 @@ For example, to add a custom index and hostname: This will create a payload that looks like: -```json +```javascript { "time": "1535995058.003385189", "index": "my-splunk-index", @@ -102,4 +102,5 @@ This will create a payload that looks like: } ``` -For more information on the Splunk HEC payload format and all event meatadata Splunk accepts, see here: http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHEC +For more information on the Splunk HEC payload format and all event meatadata Splunk accepts, see here: [http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHEC](http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHEC) + diff --git a/pipeline/outputs/stackdriver.md b/pipeline/outputs/stackdriver.md index 871afdfef..9f4a3cb06 100644 --- a/pipeline/outputs/stackdriver.md +++ b/pipeline/outputs/stackdriver.md @@ -15,7 +15,7 @@ Before to get started with the plugin configuration, make sure to obtain the pro | google\_service\_credentials | Absolute path to a Google Cloud credentials JSON file | Value of environment variable _$GOOGLE\_SERVICE\_CREDENTIALS_ | | service\_account\_email | Account email associated to the service. Only available if **no credentials file** has been provided. | Value of environment variable _$SERVICE\_ACCOUNT\_EMAIL_ | | service\_account\_secret | Private key content associated with the service account. Only available if **no credentials file** has been provided. | Value of environment variable _$SERVICE\_ACCOUNT\_SECRET_ | -| resource | Set resource type of data. Only _global_ and _gce_instance_ are supported. | global, gce_instance | +| resource | Set resource type of data. Only _global_ and _gce\_instance_ are supported. | global, gce\_instance | ### Configuration File @@ -35,18 +35,18 @@ If you are using a _Google Cloud Credentials File_, the following configuration ### Upstream connection error -> Github reference: [#761](https://github.com/fluent/fluent-bit/issues/761) +> Github reference: [\#761](https://github.com/fluent/fluent-bit/issues/761) An upstream connection error means Fluent Bit was not able to reach Google services, the error looks like this: -``` +```text [2019/01/07 23:24:09] [error] [oauth2] could not get an upstream connection ``` This belongs to a network issue by the environment where Fluent Bit is running, make sure that from the Host, Container or Pod you can reach the following Google end-points: -- [https://www.googleapis.com](https://www.googleapis.com/) -- [https://logging.googleapis.com](https://logging.googleapis.com/) +* [https://www.googleapis.com](https://www.googleapis.com/) +* [https://logging.googleapis.com](https://logging.googleapis.com/) ## Other implementations diff --git a/pipeline/outputs/standard-output.md b/pipeline/outputs/standard-output.md index f37fef5ea..f5ed237f4 100644 --- a/pipeline/outputs/standard-output.md +++ b/pipeline/outputs/standard-output.md @@ -4,11 +4,11 @@ The **stdout** output plugin allows to print to the standard output the data rec ## Configuration Parameters -| Key | Description | default | -|-------------|----------------------|-------------------| -| Format | Specify the data format to be printed. Supported formats are _msgpack_ _json_, _json_lines_ and _json\_stream_. | msgpack | -| json_date_key | Specify the name of the date field in output | date | -| json_date_format | Specify the format of the date. Supported formats are _double_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _epoch_. 
| double | +| Key | Description | default | +| :--- | :--- | :--- | +| Format | Specify the data format to be printed. Supported formats are _msgpack_ _json_, _json\_lines_ and _json\_stream_. | msgpack | +| json\_date\_key | Specify the name of the date field in output | date | +| json\_date\_format | Specify the format of the date. Supported formats are _double_, _iso8601_ \(eg: _2018-05-30T09:39:52.000681Z_\) and _epoch_. | double | ### Command Line @@ -16,7 +16,7 @@ The **stdout** output plugin allows to print to the standard output the data rec $ bin/fluent-bit -i cpu -o stdout -v ``` -We have specified to gather [CPU](../input/cpu.md) usage metrics and print them out to the standard output in a human readable way: +We have specified to gather [CPU](https://github.com/fluent/fluent-bit-docs/tree/ddc1cf3d996966b9db39f8784596c8b7132b4d5b/pipeline/input/cpu.md) usage metrics and print them out to the standard output in a human readable way: ```bash $ bin/fluent-bit -i cpu -o stdout -p format=msgpack -v diff --git a/pipeline/outputs/tcp-and-tls.md b/pipeline/outputs/tcp-and-tls.md index 387c6a772..f56800010 100644 --- a/pipeline/outputs/tcp-and-tls.md +++ b/pipeline/outputs/tcp-and-tls.md @@ -1,32 +1,30 @@ -# TCP and TLS Output +# TCP & TLS The **tcp** output plugin allows to send records to a remote TCP server. The payload can be formatted in different ways as required. ## Configuration Parameters -| Key | Description | default | -|-------------|----------------------|-------------------| +| Key | Description | default | +| :--- | :--- | :--- | | Host | Target host where Fluent-Bit or Fluentd are listening for Forward messages. | 127.0.0.1 | | Port | TCP Port of the target service. | 5170 | -| Format | Specify the data format to be printed. Supported formats are _msgpack_ _json_, _json_lines_ and _json\_stream_. | msgpack | -| json_date_key | Specify the name of the date field in output | date | -| json_date_format | Specify the format of the date. Supported formats are _double_ , _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _epoch_. | double | +| Format | Specify the data format to be printed. Supported formats are _msgpack_ _json_, _json\_lines_ and _json\_stream_. | msgpack | +| json\_date\_key | Specify the name of the date field in output | date | +| json\_date\_format | Specify the format of the date. Supported formats are _double_ , _iso8601_ \(eg: _2018-05-30T09:39:52.000681Z_\) and _epoch_. | double | -## TLS Configuration Parameters +## TLS Configuration Parameters The following parameters are available to configure a secure channel connection through TLS: -| Key | Description | Default | -| :-------------- | :----------------------------------------------------------- | :------ | -| tls | Enable or disable TLS support | Off | -| tls.verify | Force certificate validation | On | -| tls.debug | Set TLS debug verbosity level. It accept the following values: 0 \(No debug\), 1 \(Error\), 2 \(State change\), 3 \(Informational\) and 4 Verbose | 1 | -| tls.ca\_file | Absolute path to CA certificate file | | -| tls.crt\_file | Absolute path to Certificate file. | | -| tls.key\_file | Absolute path to private Key file. | | -| tls.key\_passwd | Optional password for tls.key\_file file. | | - -## +| Key | Description | Default | +| :--- | :--- | :--- | +| tls | Enable or disable TLS support | Off | +| tls.verify | Force certificate validation | On | +| tls.debug | Set TLS debug verbosity level. 
It accept the following values: 0 \(No debug\), 1 \(Error\), 2 \(State change\), 3 \(Informational\) and 4 Verbose | 1 | +| tls.ca\_file | Absolute path to CA certificate file | | +| tls.crt\_file | Absolute path to Certificate file. | | +| tls.key\_file | Absolute path to private Key file. | | +| tls.key\_passwd | Optional password for tls.key\_file file. | | ### Command Line @@ -34,20 +32,18 @@ The following parameters are available to configure a secure channel connection $ bin/fluent-bit -i cpu -o tcp://127.0.0.1:5170 -p format=json_lines -v ``` -We have specified to gather [CPU](../input/cpu.md) usage metrics and send them in JSON lines mode to a remote end-point using netcat service, e.g: +We have specified to gather [CPU](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/input/cpu.md) usage metrics and send them in JSON lines mode to a remote end-point using netcat service, e.g: #### Start the TCP listener Run the following in a separate terminal, netcat will start listening for messages on TCP port 5170 -``` +```text $ nc -l 5170 ``` Start Fluent Bit - - ```bash $ bin/fluent-bit -i cpu -o stdout -p format=msgpack -v Fluent-Bit v1.2.x diff --git a/pipeline/parsers/json.md b/pipeline/parsers/json.md index 4ea3a7f01..686fe2898 100644 --- a/pipeline/parsers/json.md +++ b/pipeline/parsers/json.md @@ -1,4 +1,4 @@ -# JSON Parser +# JSON The JSON parser is the simplest option: if the original log source is a JSON map string, it will take it structure and convert it directly to the internal binary representation. diff --git a/pipeline/parsers/logfmt.md b/pipeline/parsers/logfmt.md index 13b23524b..e5cafaddf 100644 --- a/pipeline/parsers/logfmt.md +++ b/pipeline/parsers/logfmt.md @@ -1,7 +1,6 @@ -# Logfmt Parser +# Logfmt -The **logfmt** parser allows to parse the logfmt format described in https://brandur.org/logfmt . -A more formal description is in https://godoc.org/github.com/kr/logfmt . +The **logfmt** parser allows to parse the logfmt format described in [https://brandur.org/logfmt](https://brandur.org/logfmt) . A more formal description is in [https://godoc.org/github.com/kr/logfmt](https://godoc.org/github.com/kr/logfmt) . Here is an example configuration: @@ -23,3 +22,4 @@ After processing, it internal representation will be: [1540936693, {"key1"=>"val1", "key2"=>"val2"}] ``` + diff --git a/pipeline/parsers/ltsv.md b/pipeline/parsers/ltsv.md index 9af29341f..aa2590f89 100644 --- a/pipeline/parsers/ltsv.md +++ b/pipeline/parsers/ltsv.md @@ -1,8 +1,8 @@ -# LTSV Parser +# LTSV -The **ltsv** parser allows to parse [LTSV](http://ltsv.org/) formatted texts. +The **ltsv** parser allows to parse [LTSV](http://ltsv.org/) formatted texts. -Labeled Tab-separated Values (LTSV format is a variant of Tab-separated Values (TSV). Each record in a LTSV file is represented as a single line. Each field is separated by TAB and has a label and a value. The label and the value have been separated by ':'. +Labeled Tab-separated Values \(LTSV format is a variant of Tab-separated Values \(TSV\). Each record in a LTSV file is represented as a single line. Each field is separated by TAB and has a label and a value. The label and the value have been separated by ':'. Here is an example how to use this format in the apache access log. @@ -43,3 +43,4 @@ After processing, it internal representation will be: ``` The time has been converted to Unix timestamp \(UTC\). 
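The parser definition referenced by the example above is not visible in this diff; as a sketch, an LTSV parser entry in `parsers.conf` might look like the following (the parser name, time format and `Types` casts are illustrative and assume an Apache-style access log).

```text
[PARSER]
    # Illustrative name; reference it from an input with the Parser property
    Name        access_log_ltsv
    Format      ltsv
    Time_Key    time
    Time_Format [%d/%b/%Y:%H:%M:%S %z]
    # Optional type casting of selected fields
    Types       status:integer size:integer
```

The file holding this entry is typically loaded through the `Parsers_File` property in the `[SERVICE]` section.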
+ diff --git a/pipeline/parsers/regular-expression.md b/pipeline/parsers/regular-expression.md index d98365148..b53475717 100644 --- a/pipeline/parsers/regular-expression.md +++ b/pipeline/parsers/regular-expression.md @@ -1,4 +1,4 @@ -# Regular Expression Parser +# Regular Expression The **regex** parser allows to define a custom Ruby Regular Expression that will use a named capture feature to define which content belongs to which key name. @@ -6,7 +6,7 @@ Fluent Bit uses [Onigmo](https://github.com/k-takata/Onigmo) regular expression [http://rubular.com/](http://rubular.com/) -Important: do not attempt to add multiline support in your regular expressions if you are using [Tail](../input/tail.md) input plugin since each line is handled as a separated entity. Instead use Tail [Multiline](../input/tail.md#multiline) support configuration feature. +Important: do not attempt to add multiline support in your regular expressions if you are using [Tail](https://github.com/fluent/fluent-bit-docs/tree/1787fd8bfb2035bf10faf8cb7b14c4521e1265b3/pipeline/input/tail.md) input plugin since each line is handled as a separated entity. Instead use Tail [Multiline](https://github.com/fluent/fluent-bit-docs/tree/1787fd8bfb2035bf10faf8cb7b14c4521e1265b3/pipeline/input/tail.md#multiline) support configuration feature. > Note: understanding how regular expressions works is out of the scope of this content. @@ -44,9 +44,7 @@ The above content do not provide a defined structure for Fluent Bit, but enablin ] ``` -A common pitfall is that you cannot use characters other than alphabets, numbers -and underscore in group names. For example, a group name like `(?.*)` -will cause an error due to containing an invalid character (`-`). +A common pitfall is that you cannot use characters other than alphabets, numbers and underscore in group names. For example, a group name like `(?.*)` will cause an error due to containing an invalid character \(`-`\). In order to understand, learn and test regular expressions like the example above, we suggest you try the following Ruby Regular Expression Editor: [http://rubular.com/r/X7BH0M4Ivm](http://rubular.com/r/X7BH0M4Ivm) diff --git a/stream-processing/changelog.md b/stream-processing/changelog.md index a71657bb9..128e08765 100644 --- a/stream-processing/changelog.md +++ b/stream-processing/changelog.md @@ -1,4 +1,4 @@ -# Stream Processor Changelog +# Changelog Upon new versions of [Fluent Bit](https://fluentbit.io), the Stream Processor engine gets new improvements. In the following section you will find the details of the new additions in the major release versions. @@ -10,7 +10,7 @@ Upon new versions of [Fluent Bit](https://fluentbit.io), the Stream Processor en It's pretty common that records contains nested maps or sub-keys. Now we provide the ability to use sub-keys to perform conditionals and keys selection. 
Consider the following record: -```json +```javascript { "key1": 123, "key2": 456, @@ -30,25 +30,26 @@ SELECT key3['sub1']['sub2'] FROM STREAM:test WHERE key3['sub1']['sub2'] = 789; ### New @record functions -On conditionals we have introduced the new *@record* functions: +On conditionals we have introduced the new _@record_ functions: -| Function | Description | -| :-------------------- | :-------------------------------------------------- | -| @record.time() | returns the record timestamp | -| @record.contains(key) | returns true or false if *key* exists in the record | +| Function | Description | +| :--- | :--- | +| @record.time\(\) | returns the record timestamp | +| @record.contains\(key\) | returns true or false if _key_ exists in the record | ### IS NULL, IS NOT NULL -We currently support different data types such as *strings*, *integers*, *floats*, *maps* and *null*. In Fluent Bit, a *null* value is totally valid and is not related to the absence of a value as in normal databases. To compare if an existing key in the record have a *null* value or not, we have introduced *IS NULL* and *IS NOT NULL* statements, e.g: +We currently support different data types such as _strings_, _integers_, _floats_, _maps_ and _null_. In Fluent Bit, a _null_ value is totally valid and is not related to the absence of a value as in normal databases. To compare if an existing key in the record have a _null_ value or not, we have introduced _IS NULL_ and _IS NOT NULL_ statements, e.g: ```sql SELECT * FROM STREAM:test WHERE key3['sub1'] IS NOT NULL; ``` -For more details please review the section [Check Keys and NULL values](../getting_started/is_null_is_not_null_record_contains) +For more details please review the section [Check Keys and NULL values](https://github.com/fluent/fluent-bit-docs/tree/6bc4af039821d9e8bc1636797a25ad23b52a511f/getting_started/is_null_is_not_null_record_contains/README.md) ## Fluent Bit v1.1 > Release date: May 09, 2019 This is the initial version of the Stream Processor into Fluent Bit. + diff --git a/stream-processing/getting-started/README.md b/stream-processing/getting-started/README.md index 75e8bd584..0ae5515ac 100644 --- a/stream-processing/getting-started/README.md +++ b/stream-processing/getting-started/README.md @@ -2,22 +2,20 @@ The following guide assumes that you are familiar with [Fluent Bit](https://fluentbit.io), if that is not the case we suggest you review the official manual first: -- [Fluent Bit Manual](https://docs.fluentbit.io/manual/) +* [Fluent Bit Manual](https://docs.fluentbit.io/manual/) ## Requirements -- [Fluent Bit](https://fluentbit.io) >= v1.1.0 or Fluent Bit from [GIT Master](https://github.com/fluent/fluent-bit) -- Basic understanding of Structured Query Language (SQL) +* [Fluent Bit](https://fluentbit.io) >= v1.1.0 or Fluent Bit from [GIT Master](https://github.com/fluent/fluent-bit) +* Basic understanding of Structured Query Language \(SQL\) ## Technical Concepts -| Concept | Description | -| ------- | ------------------------------------------------------------ | -| Stream | A Stream represents an unique flow of data being ingested by an Input plugin. By default Streams get a name using the plugin name plus an internal numerical identification, e.g: tail.0 . Stream name can be changed setting the _alias_ property. | -| Task | Stream Processor configuration have the notion of Tasks that represents an execution unit, for short: SQL queries are configured in a Task. 
| +| Concept | Description | +| :--- | :--- | +| Stream | A Stream represents an unique flow of data being ingested by an Input plugin. By default Streams get a name using the plugin name plus an internal numerical identification, e.g: tail.0 . Stream name can be changed setting the _alias_ property. | +| Task | Stream Processor configuration have the notion of Tasks that represents an execution unit, for short: SQL queries are configured in a Task. | | Results | When Stream Processor runs a SQL query, results are generated. These results can be re-ingested back into the main Fluent Bit pipeline or simply redirected to the standard output interfaces for debugging purposes. | -| Tag | Fluent Bit group records and associate a Tag to them. Tags are used to define routing rules or in the case of the stream processor to attach to specific Tag that matches a pattern. | -| Match | Matching rule that can use a wildcard to match specific records associated to a Tag. | - - +| Tag | Fluent Bit group records and associate a Tag to them. Tags are used to define routing rules or in the case of the stream processor to attach to specific Tag that matches a pattern. | +| Match | Matching rule that can use a wildcard to match specific records associated to a Tag. | diff --git a/stream-processing/getting-started/check-keys-null-values.md b/stream-processing/getting-started/check-keys-null-values.md index 882e3be73..0ec4f2cb5 100644 --- a/stream-processing/getting-started/check-keys-null-values.md +++ b/stream-processing/getting-started/check-keys-null-values.md @@ -1,8 +1,8 @@ -# IS NULL, IS NOT NULL, KEY EXISTS ? +# Check Keys and NULL values -> Feature available on Fluent Bit >= 1.2 +> Feature available on Fluent Bit >= 1.2 -When working with structured messages (records), there are certain cases where we want to know if a key exists, if it value is _null_ or have a value different than _null_. +When working with structured messages \(records\), there are certain cases where we want to know if a key exists, if it value is _null_ or have a value different than _null_. [Fluent Bit](https://fluentbit.io) internal records are a binary serialization of maps with keys and values. A value can be _null_ which is a valid data type. In our SQL language we provide the following statements that can be applied to the conditionals statements: @@ -26,7 +26,7 @@ SELECT * FROM STREAM:test WHERE phone IS NOT NULL; Another common use-case is to check if certain key exists in the record. We provide specific record functions that can be used in the conditional part of the SQL statement. The prototype of the function to check if a key exists in the record is the following: -``` +```text @record.contains(key) ``` diff --git a/stream-processing/getting-started/fluent-bit-sql.md b/stream-processing/getting-started/fluent-bit-sql.md index 0565f3e0d..688a75894 100644 --- a/stream-processing/getting-started/fluent-bit-sql.md +++ b/stream-processing/getting-started/fluent-bit-sql.md @@ -1,4 +1,4 @@ -# Fluent Bit Query Language: SQL +# Fluent Bit + SQL Fluent Bit stream processor uses common SQL to perform record queries. The following section describe the features available and examples of it. @@ -20,7 +20,7 @@ SELECT results_statement #### Description -Select keys from records coming from a stream or records matching a specific Tag pattern. Note that a simple `SELECT` statement __not__ associated from a stream creation will send the results to the standard output interface (stdout), useful for debugging purposes. 
+Select keys from records coming from a stream or records matching a specific Tag pattern. Note that a simple `SELECT` statement **not** associated from a stream creation will send the results to the standard output interface \(stdout\), useful for debugging purposes. The query allows filtering the results by applying a condition using `WHERE` statement. We will explain `WINDOW` and `GROUP BY` statements later in aggregation functions section. @@ -70,8 +70,7 @@ CREATE STREAM hello AS SELECT * FROM TAG:'apache.*'; ## Aggregation Functions -Aggregation functions are used in `results_statement` on the keys, allowing to perform data calculation on groups of records. -Group of records that aggregation functions apply on are determined by `WINDOW` keyword. When `WINDOW` is not specified, aggregation functions apply on the current buffer of records received, which may have non-deterministic number of elements. Aggregation functions can be applied on records in a window of a specific time interval (see the syntax of `WINDOW` in select statement). +Aggregation functions are used in `results_statement` on the keys, allowing to perform data calculation on groups of records. Group of records that aggregation functions apply on are determined by `WINDOW` keyword. When `WINDOW` is not specified, aggregation functions apply on the current buffer of records received, which may have non-deterministic number of elements. Aggregation functions can be applied on records in a window of a specific time interval \(see the syntax of `WINDOW` in select statement\). Fluent Bit streaming currently supports tumbling window, which is non-overlapping window type. That means, a window of size 5 seconds performs aggregation computations on records over a 5-second interval, and then starts new calculations for the next interval. @@ -153,7 +152,7 @@ SELECT NOW() FROM STREAM:apache; Add system time using format: %Y-%m-%d %H:%M:%S. Output example: 2019-03-09 21:36:05. -### UNIX_TIMESTAMP +### UNIX\_TIMESTAMP #### Synopsis @@ -169,7 +168,7 @@ Add current Unix timestamp to the record. Output example: 1552196165 . Record functions append new keys to the record using values from the record context. -### RECORD_TAG +### RECORD\_TAG #### Synopsis @@ -181,27 +180,34 @@ SELECT RECORD_TAG() FROM STREAM:apache; Append Tag string associated to the record as a new key. -### RECORD_TIME +### RECORD\_TIME #### Synopsis ```sql SELECT RECORD_TIME() FROM STREAM:apache; ``` + ## WHERE Condition Similar to conventional SQL statements, `WHERE` condition is supported in Fluent Bit query language. The language supports conditions over keys and subkeys, for instance: + ```sql SELECT AVG(size) FROM STREAM:apache WHERE method = 'POST' AND status = 200; ``` + It is possible to check the existence of a key in the record using record-specific function `@record.contains`: + ```sql SELECT MAX(key) FROM STREAM:apache WHERE @record.contains(key); ``` + And to check if the value of a key is/is not `NULL`: + ```sql SELECT MAX(key) FROM STREAM:apache WHERE key IS NULL; ``` + ```sql SELECT * FROM STREAM:apache WHERE user IS NOT NULL; ``` @@ -209,3 +215,4 @@ SELECT * FROM STREAM:apache WHERE user IS NOT NULL; #### Description Append a new key with the record Timestamp in _double_ format: seconds.nanoseconds. Output example: 1552196165.705683 . 
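To connect the SQL statements above to a running pipeline, the queries are usually placed in a separate streams file referenced from the main configuration. The sketch below assumes the `Streams_File` service property and a `[STREAM_TASK]` section with `Name` and `Exec` keys as described in the Getting Started guide, plus an input aliased as `apache` whose records carry `method` and `size` keys; treat it as an illustration rather than a verified configuration.

```text
# fluent-bit.conf (sketch)
[SERVICE]
    Flush        1
    # Assumed property pointing to the stream processor tasks file
    Streams_File stream_processor.conf

[INPUT]
    Name   tail
    Path   /var/log/apache2/access.log
    # Alias makes the stream addressable as STREAM:apache
    Alias  apache

[OUTPUT]
    Name   stdout
    Match  *

# stream_processor.conf (sketch)
[STREAM_TASK]
    Name  avg_post_size
    Exec  CREATE STREAM post_sizes AS SELECT AVG(size) FROM STREAM:apache WHERE method = 'POST';
```

The results of the task become a new stream named `post_sizes`; as shown in the hands-on guide, a tag can also be attached to the created stream so the results are re-ingested into the main pipeline and matched by outputs.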
+ diff --git a/stream-processing/getting-started/hands-on.md b/stream-processing/getting-started/hands-on.md index 4726dfa06..6f6d42b7b 100644 --- a/stream-processing/getting-started/hands-on.md +++ b/stream-processing/getting-started/hands-on.md @@ -1,4 +1,4 @@ -# Hands on: Stream Processing 101 +# Hands On! 101 This article goes through very specific and simple steps to learn how Stream Processor works. For simplicity it uses a custom Docker image that contains the relevant components for testing. @@ -6,12 +6,12 @@ This article goes through very specific and simple steps to learn how Stream Pro The following tutorial requires the following software components: -- [Fluent Bit](https://fluentbit.io) >= v1.2.0 -- [Docker Engine](https://www.docker.com/products/docker-engine) (not mandatory if you already have Fluent Bit binary installed in your system) +* [Fluent Bit](https://fluentbit.io) >= v1.2.0 +* [Docker Engine](https://www.docker.com/products/docker-engine) \(not mandatory if you already have Fluent Bit binary installed in your system\) -In addition download the following data sample file (130KB): +In addition download the following data sample file \(130KB\): -- https://fluentbit.io/samples/sp-samples-1k.log +* [https://fluentbit.io/samples/sp-samples-1k.log](https://fluentbit.io/samples/sp-samples-1k.log) ## Stream Processing using the command line @@ -39,7 +39,7 @@ $ docker run -ti -v `pwd`/sp-samples-1k.log:/sp-samples-1k.log \ The command above will simply print the parsed content to the standard output interface. The content will print the _Tag_ associated to each record and an array with two fields: record timestamp and record map: -``` +```text Fluent Bit v1.2.0 Copyright (C) Treasure Data @@ -56,11 +56,11 @@ Copyright (C) Treasure Data [5] tail.0: [1557322456.315550927, {"date"=>"22/abr/2019:12:43:52 -0600", "ip"=>"132.113.203.169", "word"=>"fendered", "country"=>"United States", "flag"=>true, "num"=>53}] ``` -As of now there is no Stream Processing, on step #3 we will start doing some basic queries. +As of now there is no Stream Processing, on step \#3 we will start doing some basic queries. ### 3. Selecting specific record keys -This command introduces a Stream Processor (SP) query through the __-T__ option and changes the output plugin to _null_, this is done with the purpose of obtaining the SP results in the standard output interface and avoid confusions in the terminal. +This command introduces a Stream Processor \(SP\) query through the **-T** option and changes the output plugin to _null_, this is done with the purpose of obtaining the SP results in the standard output interface and avoid confusions in the terminal. ```bash $ docker run -ti -v `pwd`/sp-samples-1k.log:/sp-samples-1k.log \ @@ -76,7 +76,7 @@ $ docker run -ti -v `pwd`/sp-samples-1k.log:/sp-samples-1k.log \ The query above aims to retrieve all records that a key named _country_ value matches the value _Chile_, and for each match compose and output a record using only the key fields _word_ and _num_: -``` +```text [0] [1557322913.263534, {"word"=>"Candide", "num"=>94}] [0] [1557322913.263581, {"word"=>"delightfulness", "num"=>99}] [0] [1557322913.263607, {"word"=>"effulges", "num"=>63}] @@ -86,7 +86,7 @@ The query above aims to retrieve all records that a key named _country_ value ma ### 4. 
Calculate Average Value -The following query is similar to the one in the previous step, but this time we will use the aggregation function called AVG() to get the average value of the records ingested: +The following query is similar to the one in the previous step, but this time we will use the aggregation function called AVG\(\) to get the average value of the records ingested: ```bash $ docker run -ti -v `pwd`/sp-samples-1k.log:/sp-samples-1k.log \ @@ -102,7 +102,7 @@ $ docker run -ti -v `pwd`/sp-samples-1k.log:/sp-samples-1k.log \ output: -``` +```text [0] [1557323573.940149, {"AVG(num)"=>61.230770}] [0] [1557323573.941890, {"AVG(num)"=>47.842106}] [0] [1557323573.943544, {"AVG(num)"=>40.647060}] @@ -133,7 +133,7 @@ $ docker run -ti -v `pwd`/sp-samples-1k.log:/sp-samples-1k.log \ output: -``` +```text [0] [1557324239.003211, {"country"=>"Chile", "AVG(num)"=>53.164558}] ``` @@ -141,7 +141,7 @@ output: Now we see a more real-world use case. Sending data results to the standard output interface is good for learning purposes, but now we will instruct the Stream Processor to ingest results as part of Fluent Bit data pipeline and attach a Tag to them. -This can be done using the __CREATE STREAM__ statement that will also tag results with __sp-results__ value. Note that output plugin parameter is now _stdout_ matching all records tagged with _sp-results_: +This can be done using the **CREATE STREAM** statement that will also tag results with **sp-results** value. Note that output plugin parameter is now _stdout_ matching all records tagged with _sp-results_: ```bash $ docker run -ti -v `pwd`/sp-samples-1k.log:/sp-samples-1k.log \ @@ -162,7 +162,7 @@ $ docker run -ti -v `pwd`/sp-samples-1k.log:/sp-samples-1k.log \ output: -``` +```text [0] sp-results: [1557325032.000160100, {"country"=>"Chile", "AVG(num)"=>53.164558}] ``` @@ -170,7 +170,7 @@ output: ### Where STREAM name comes from? -Fluent Bit have the notion of streams, and every input plugin instance gets a default name. You can override that behavior by setting an alias. Check the __alias__ parameter and new __stream__ name in the following example: +Fluent Bit have the notion of streams, and every input plugin instance gets a default name. You can override that behavior by setting an alias. Check the **alias** parameter and new **stream** name in the following example: ```bash $ docker run -ti -v `pwd`/sp-samples-1k.log:/sp-samples-1k.log \ @@ -189,3 +189,4 @@ $ docker run -ti -v `pwd`/sp-samples-1k.log:/sp-samples-1k.log \ GROUP BY country;" \ -o stdout -m 'sp-results' -f 1 ``` + diff --git a/stream-processing/overview.md b/stream-processing/overview.md index 3975e4f45..07c73bf69 100644 --- a/stream-processing/overview.md +++ b/stream-processing/overview.md @@ -6,23 +6,23 @@ In order to understand how Stream Processing works in Fluent Bit, we will go thr ## Fluent Bit Data Pipeline -[Fluent Bit](https://fluentbit.io) collects and process logs (records) from different input sources and allows to parse and filter these records before they hit the Storage interface. One data is processed and it's in a safe state (either in memory or the file system), the records are routed through the proper output destinations. +[Fluent Bit](https://fluentbit.io) collects and process logs \(records\) from different input sources and allows to parse and filter these records before they hit the Storage interface. 
Once data is processed and it's in a safe state \(either in memory or the file system\), the records are routed through the proper output destinations. > Most of the phases in the pipeline are implemented through plugins: Input, Filter and Output. -![](../imgs/flb_pipeline.png) +![](https://github.com/fluent/fluent-bit-docs/tree/6bc4af039821d9e8bc1636797a25ad23b52a511f/imgs/flb_pipeline.png) -The Filtering interface is good to perform specific record modifications like append or remove a key, enrich with specific metadata (e.g: Kubernetes Filter) or discard records based on specific conditions. Just after the data will not have any further modification and hits the Storage, optionally, will be redirected to the Stream Processor. +The Filtering interface is good to perform specific record modifications like appending or removing a key, enriching with specific metadata \(e.g: Kubernetes Filter\) or discarding records based on specific conditions. After this point the data will not receive any further modification; once it hits the Storage it can, optionally, be redirected to the Stream Processor. -## Stream Processor +## Stream Processor -The Stream Processor is an independent subsystem that check for new records hitting the Storage interface. By configuration the Stream Processor will attach to records coming from a specific Input plugin (stream) or by applying Tag and Matching rules. +The Stream Processor is an independent subsystem that checks for new records hitting the Storage interface. Through its configuration, the Stream Processor will attach to records coming from a specific Input plugin \(stream\) or by applying Tag and Matching rules. -> Every _Input_ instance is considered a __Stream__, that stream collects data and ingest records into the pipeline. +> Every _Input_ instance is considered a **Stream**: that stream collects data and ingests records into the pipeline. -![](../imgs/flb_pipeline_sp.png) +![](https://github.com/fluent/fluent-bit-docs/tree/6bc4af039821d9e8bc1636797a25ad23b52a511f/imgs/flb_pipeline_sp.png) -By configuring specific SQL queries (Structured Query Language), the user can perform specific tasks like key selections, filtering and data aggregation within others. Note that there is __no__ database concept here, everything is **schema-less** and happens **in-memory**, for hence the concept of _Tables_ as in common relational databases don't exists. +By configuring specific SQL \(Structured Query Language\) queries, the user can perform specific tasks like key selection, filtering and data aggregation, among others. Note that there is **no** database concept here: everything is **schema-less** and happens **in-memory**, hence the concept of _Tables_ as in common relational databases does not exist. -One of the powerful features of Fluent Bit Stream Processor is that allows to create new streams of data using the results from a previous SQL query, these results are re-ingested back into the pipeline to be consumed again for the Stream Processor (if desired) or routed to output destinations such any common record by using Tag/Matching rules (tip: stream processor results can be Tagged!)
+One of the powerful features of the Fluent Bit Stream Processor is that it allows creating new streams of data using the results of a previous SQL query; these results are re-ingested back into the pipeline to be consumed again by the Stream Processor \(if desired\) or routed to output destinations just like any regular record by using Tag/Matching rules \(tip: stream processor results can be Tagged!\) diff --git a/stream-processing/README.md b/stream-processing/stream-processing.md similarity index 83% rename from stream-processing/README.md rename to stream-processing/stream-processing.md index 3eb003f9b..6717ab460 100644 --- a/stream-processing/README.md +++ b/stream-processing/stream-processing.md @@ -1,6 +1,6 @@ -![](imgs/stream_processor.png) - +# Introduction +![](https://github.com/fluent/fluent-bit-docs/tree/6bc4af039821d9e8bc1636797a25ad23b52a511f/stream-processing/imgs/stream_processor.png) [Fluent Bit](https://fluentbit.io) is a fast and flexible Log processor that aims to collect, parse, filter and deliver logs to remote databases, so Data Analysis can be performed.