Add reference documentation about indexers and matchers #17139

Merged
merged 3 commits into from
Mar 31, 2020
Changes from 2 commits
3 changes: 3 additions & 0 deletions filebeat/docs/index.asciidoc
@@ -20,6 +20,7 @@ include::{asciidoc-dir}/../../shared/attributes.asciidoc[]
:ignores_max_retries:
:has_docker_label_ex:
:has_modules_command:
:has_kubernetes_logs_path_matcher:
:has_registry:
:deb_os:
:rpm_os:
@@ -28,6 +29,8 @@ include::{asciidoc-dir}/../../shared/attributes.asciidoc[]
:docker_platform:
:win_os:

:kubernetes_default_indexers: {docdir}/kubernetes-default-indexers-matchers.asciidoc

include::{libbeat-dir}/shared-beats-attributes.asciidoc[]

include::./overview.asciidoc[]
14 changes: 14 additions & 0 deletions filebeat/docs/kubernetes-default-indexers-matchers.asciidoc
@@ -0,0 +1,14 @@
When `add_kubernetes_metadata` is used with {beatname_uc}, it uses the
`container` indexer and the `logs_path` matcher. Events whose `log.file.path`
contains a reference to a container ID are therefore enriched with the metadata
of the pod that runs this container.

This behaviour can be disabled disabling default indexers and matchers in the
Member

minor but maybe a by is needed here:

Suggested change
This behaviour can be disabled disabling default indexers and matchers in the
This behaviour can be disabled by disabling default indexers and matchers in the

Member Author

Better, yes, thanks!

configuration:
[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_kubernetes_metadata:
    default_indexers.enabled: false
    default_matchers.enabled: false
-------------------------------------------------------------------------------
1 change: 1 addition & 0 deletions libbeat/docs/processors-list.asciidoc
@@ -124,6 +124,7 @@ include::{libbeat-processors-dir}/add_id/docs/add_id.asciidoc[]
endif::[]
ifndef::no_add_kubernetes_metadata_processor[]
include::{libbeat-processors-dir}/add_kubernetes_metadata/docs/add_kubernetes_metadata.asciidoc[]
include::{libbeat-processors-dir}/add_kubernetes_metadata/docs/indexers_and_matchers.asciidoc[]
endif::[]
ifndef::no_add_labels_processor[]
include::{libbeat-processors-dir}/actions/docs/add_labels.asciidoc[]
@@ -24,19 +24,31 @@ The `add_kubernetes_metadata` processor has two basic building blocks which are:
* Indexers
* Matchers

Indexers take in a pod's metadata and builds indices based on the pod metadata.
For example, the `ip_port` indexer can take a Kubernetes pod and index the pod
metadata based on all `pod_ip:container_port` combinations.

Matchers are used to construct lookup keys for querying indices. For example,
when the `fields` matcher takes `["metricset.host"]` as a lookup field, it would
construct a lookup key with the value of the field `metricset.host`.

Indexers use pod metadata to create unique identifiers for each pod. These
identifiers help to correlate the metadata of the observed pods with actual
events. For example, the `ip_port` indexer can take a Kubernetes pod and create
identifiers for it based on all its `pod_ip:container_port` combinations.

Matchers use information in events to construct lookup keys that match the
identifiers created by the indexers. For example, when the `fields` matcher takes
`["metricset.host"]` as a lookup field, it constructs a lookup key with the
value of the field `metricset.host`. When one of these lookup keys matches one
of the identifiers, the event is enriched with the metadata of the identified
pod.
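
The interplay between indexers and matchers can be pictured as a dictionary lookup. The following Python sketch is only a conceptual model, not the processor's implementation: the function names, the pod dictionary layout, and the flat `metricset.host` event key are all assumptions made for illustration.

```python
# Conceptual model only: these names are hypothetical, not Beats APIs.

def ip_port_identifiers(pod):
    """Return one `ip:port` identifier per port exposed by the pod."""
    return [f"{pod['ip']}:{port}" for port in pod["ports"]]

def build_index(pods):
    """Map every identifier to the metadata of the pod it identifies."""
    index = {}
    for pod in pods:
        for identifier in ip_port_identifiers(pod):
            index[identifier] = pod["metadata"]
    return index

def enrich(event, index, lookup_field="metricset.host"):
    """Add pod metadata when the event's lookup key matches an identifier."""
    metadata = index.get(event.get(lookup_field))
    if metadata is None:
        return event  # no match: the event is left as it is
    return {**event, "kubernetes": metadata}

# Hypothetical pod; the layout is an assumption for this sketch.
pods = [{"ip": "10.0.0.5", "ports": [8080, 9090],
         "metadata": {"pod": {"name": "web-1"}}}]
index = build_index(pods)
enriched = enrich({"metricset.host": "10.0.0.5:9090"}, index)
untouched = enrich({"metricset.host": "10.0.0.9:80"}, index)
```

In this model the indexer runs once per observed pod, while the matcher runs once per event, so enrichment stays a constant-time lookup regardless of how many pods are being watched.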

ifdef::kubernetes_default_indexers[]
include::{kubernetes_default_indexers}[]
endif::kubernetes_default_indexers[]
ifndef::kubernetes_default_indexers[]
Each Beat can define its own default indexers and matchers which are enabled by
default. For example, FileBeat enables the `container` indexer, which indexes
default. For example, Filebeat enables the `container` indexer, which identifies
pod metadata based on all container IDs, and a `logs_path` matcher, which takes
the `log.file.path` field, extracts the container ID, and uses it to retrieve
metadata.
endif::kubernetes_default_indexers[]

You can find more information about the available indexers and matchers, and some
examples in <<kubernetes-indexers-and-matchers>>.

The configuration below enables the processor when {beatname_lc} is run as a pod in
Kubernetes.
@@ -93,4 +105,4 @@ client. It defaults to `KUBECONFIG` environment variable if present.
`default_indexers.enabled`:: (Optional) Enable/Disable default pod indexers, in
case you want to specify your own.
`default_matchers.enabled`:: (Optional) Enable/Disable default pod matchers, in
case you want to specify your own.
case you want to specify your own.
@@ -0,0 +1,112 @@
[float]
[[kubernetes-indexers-and-matchers]]
=== Indexers and matchers

==== Indexers

Indexers use pod metadata to create unique identifiers for each pod.

Available indexers are:

`container`:: Identifies the pod metadata using the IDs of its containers.
`ip_port`:: Identifies the pod metadata using combinations of its IP and its exposed ports.
When using this indexer, metadata is identified using the IP of the pod and the
`ip:port` combination for each of the ports exposed by its containers.
`pod_name`:: Identifies the pod metadata using its namespace and its name as
`namespace/pod_name`.
`pod_uid`:: Identifies the pod metadata using the UID of the pod.
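
To make the shape of these identifiers concrete, the following Python sketch computes them for a single made-up pod. It is a rough illustration under stated assumptions: the pod dictionary layout, the sample values, and the function names are hypothetical, and only the identifier formats follow the descriptions above.

```python
# Hypothetical pod representation; field names and values are assumptions.
POD = {
    "uid": "c0ffee-1234",
    "namespace": "default",
    "name": "web-1",
    "ip": "10.0.0.5",
    "containers": [
        {"id": "abc123", "ports": [8080]},
        {"id": "def456", "ports": [9090]},
    ],
}

def container(pod):
    # One identifier per container ID.
    return [c["id"] for c in pod["containers"]]

def ip_port(pod):
    # The pod IP alone, plus one ip:port entry per exposed container port.
    keys = [pod["ip"]]
    for c in pod["containers"]:
        keys.extend(f"{pod['ip']}:{port}" for port in c["ports"])
    return keys

def pod_name(pod):
    # namespace/pod_name
    return [f"{pod['namespace']}/{pod['name']}"]

def pod_uid(pod):
    return [pod["uid"]]

INDEXERS = {"container": container, "ip_port": ip_port,
            "pod_name": pod_name, "pod_uid": pod_uid}
identifiers = {name: indexer(POD) for name, indexer in INDEXERS.items()}
```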

==== Matchers

Matchers construct the lookup keys that are matched against the identifiers
created by indexers.

===== `field_format`

Looks up pod metadata using a key created with a string format that can include
event fields.

This matcher has an option `format` to define the string format. This string
format can contain placeholders for any field in the event.

For example, the following configuration uses the `ip_port` indexer to identify
the pod metadata by combinations of the pod IP and its exposed ports, and uses
the destination IP and port in events as match keys:

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_kubernetes_metadata:
    ...
    default_indexers.enabled: false
    default_matchers.enabled: false
    indexers:
      - ip_port:
    matchers:
      - field_format:
          format: '%{[destination.ip]}:%{[destination.port]}'
-------------------------------------------------------------------------------
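
As a rough illustration of how such a format string could be turned into a lookup key, the Python sketch below expands `%{[field]}` placeholders against a nested event. The helper names are hypothetical, and returning no key when a referenced field is missing is an assumption of this sketch, not documented behaviour of the matcher.

```python
import re

# Matches placeholders of the form %{[some.field]}.
PLACEHOLDER = re.compile(r"%\{\[([^\]]+)\]\}")

def get_field(event, dotted):
    """Walk a nested dict along a dotted path; None if any part is missing."""
    value = event
    for part in dotted.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value

def format_key(fmt, event):
    """Expand every placeholder; return None if a field is missing (assumed)."""
    missing = False

    def substitute(match):
        nonlocal missing
        value = get_field(event, match.group(1))
        if value is None:
            missing = True
            return ""
        return str(value)

    key = PLACEHOLDER.sub(substitute, fmt)
    return None if missing else key

FORMAT = "%{[destination.ip]}:%{[destination.port]}"
key = format_key(FORMAT, {"destination": {"ip": "10.0.0.5", "port": 8080}})
no_key = format_key(FORMAT, {"destination": {"ip": "10.0.0.5"}})
```

The resulting key has the same `ip:port` shape as the identifiers produced by the `ip_port` indexer, which is what allows the two to match.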

===== `fields`

Looks up pod metadata using the value of specific fields as the key. When
multiple fields are defined, the first one included in the event is used.

This matcher has an option `lookup_fields` to define the fields whose value will
be used for the lookup.

For example, the following configuration uses the `ip_port` indexer to identify
pods, and defines a matcher that uses the destination IP or the server IP for the
lookup, whichever it finds first in the event:

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_kubernetes_metadata:
    ...
    default_indexers.enabled: false
    default_matchers.enabled: false
    indexers:
      - ip_port:
    matchers:
      - fields:
          lookup_fields: ['destination.ip', 'server.ip']
-------------------------------------------------------------------------------
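
The "first field included in the event" rule can be sketched in a few lines of Python. This is a simplified model: the function name is hypothetical and events are represented as flat dictionaries keyed by dotted field names, which is an assumption made to keep the example short.

```python
def fields_lookup_key(event, lookup_fields):
    """Return the value of the first lookup field present in the event."""
    for field in lookup_fields:
        if field in event:
            return event[field]
    return None  # none of the lookup fields is present

LOOKUP_FIELDS = ["destination.ip", "server.ip"]

both = fields_lookup_key(
    {"destination.ip": "10.0.0.5", "server.ip": "10.0.0.7"}, LOOKUP_FIELDS)
server_only = fields_lookup_key({"server.ip": "10.0.0.7"}, LOOKUP_FIELDS)
neither = fields_lookup_key({"client.ip": "10.0.0.9"}, LOOKUP_FIELDS)
```

Because the `ip_port` indexer also identifies pods by their bare IP, a key such as `10.0.0.5` can match even though it carries no port.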

ifdef::has_kubernetes_logs_path_matcher[]
===== `logs_path`

Looks up pod metadata using identifiers extracted from the log path stored in
the `log.file.path` field.

This matcher has the following configuration settings:

`logs_path`:: (Optional) Base path of container logs. If not specified, it uses
the default logs path of the platform where {beatname_uc} is running.
`resource_type`:: (Optional) Type of the resource to obtain the ID of. It can be
`pod`, to make the lookup based on the pod UID, or `container`, to make the
lookup based on the container ID. It defaults to `container`.

The default configuration can look up the metadata using the container ID
when the logs are collected from the default Docker logs path
(`/var/lib/docker/containers/<container ID>/...` on Linux).

For example, the following configuration would use the pod UID when the logs are
collected from `/var/lib/kubelet/pods/<pod UID>/...`.

[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_kubernetes_metadata:
    ...
    default_indexers.enabled: false
    default_matchers.enabled: false
    indexers:
      - pod_uid:
    matchers:
      - logs_path:
          logs_path: '/var/lib/kubelet/pods'
          resource_type: 'pod'
-------------------------------------------------------------------------------
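
One way to picture the extraction is to take the path segment that follows the configured base path. The Python sketch below does exactly that; it is a simplification under stated assumptions, since the actual matcher may apply additional checks, and the function name and sample paths are hypothetical.

```python
def extract_id(log_path, logs_path="/var/lib/docker/containers/"):
    """Return the path segment right after the base path, or None."""
    base = logs_path.rstrip("/") + "/"
    if not log_path.startswith(base):
        return None  # the log file is not under the configured base path
    segment = log_path[len(base):].split("/", 1)[0]
    return segment or None

# Default-style lookup: the segment is treated as a container ID.
container_id = extract_id("/var/lib/docker/containers/abc123/abc123-json.log")

# Pod-based lookup: the segment is treated as a pod UID.
pod_uid = extract_id("/var/lib/kubelet/pods/1234-uid/volumes/app.log",
                     logs_path="/var/lib/kubelet/pods")

unrelated = extract_id("/tmp/other.log")
```

The extracted value only leads to a match if the enabled indexer produces the same kind of identifier, which is why the configuration above pairs `resource_type: 'pod'` with the `pod_uid` indexer.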
endif::has_kubernetes_logs_path_matcher[]
2 changes: 2 additions & 0 deletions metricbeat/docs/index.asciidoc
@@ -30,6 +30,8 @@ include::{asciidoc-dir}/../../shared/attributes.asciidoc[]
:no_decode_csv_fields_processor:
:no_timestamp_processor:

:kubernetes_default_indexers: {docdir}/kubernetes-default-indexers-matchers.asciidoc

include::{libbeat-dir}/shared-beats-attributes.asciidoc[]

include::./overview.asciidoc[]
14 changes: 14 additions & 0 deletions metricbeat/docs/kubernetes-default-indexers-matchers.asciidoc
@@ -0,0 +1,14 @@
When `add_kubernetes_metadata` is used with {beatname_uc}, it uses the `ip_port`
indexer and the `fields` matcher with the `metricset.host` field. Events that
contain a `metricset.host` field are therefore enriched with the metadata of the
pod that exposes the same port on the same IP.

This behaviour can be disabled disabling default indexers and matchers in the
Member

minor but maybe a by is needed here:

Suggested change
This behaviour can be disabled disabling default indexers and matchers in the
This behaviour can be disabled by disabling default indexers and matchers in the

configuration:
[source,yaml]
-------------------------------------------------------------------------------
processors:
- add_kubernetes_metadata:
    default_indexers.enabled: false
    default_matchers.enabled: false
-------------------------------------------------------------------------------