Improve Kubernetes Logs Collection Experience #25251
Comments
/cc @dmitryax @djaglowski
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Pinging code owners for receiver/filelog: @djaglowski. See Adding Labels via Comments if you do not have permissions to add labels yourself.
There is no way to apply label selectors with the filelog receiver because that information is not exposed in the file paths. The only way to do that is by fetching logs from the k8s API, which adds significant load on the API and will likely degrade performance significantly. This should not be part of the filelog receiver. We have a proposal for another component for this purpose: #24439. Feel free to take a look.
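For context, the kubelet writes pod logs under a path that only encodes the namespace, pod name, pod UID, container name, and restart count, so filelog include/exclude globs can select on those pieces but never on labels or annotations. A minimal sketch (namespace and pod names are illustrative):

filelog:
  include:
    # layout: /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container>/<restart>.log
    # selecting by namespace via the path is possible:
    - /var/log/pods/payments_*/*/*.log        # hypothetical namespace
  exclude:
    # but "pods with label team=payments" has no path equivalent,
    # so a label selector cannot be expressed here
    - /var/log/pods/kube-system_*/*/*.log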
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping the code owners.
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
We have also ended up in the same situation and would like to export logs at the namespace level.
@dmitryax @TylerHelmuth +1 for this! It would be nice if we could skip/exclude Pods' log files based on the Pod's labels/annotations. This is super useful when a user would like to use https://opentelemetry.io/docs/concepts/sdk-configuration/general-sdk-configuration/#otel_logs_exporter to collect the logs for specific instrumented apps, while the filelog receiver handles collection for the rest of the apps/Pods. Without the option to skip/exclude specific Pods' log files based on metadata, we end up having duplicate records.
@TylerHelmuth I wonder if for this we could instead use the receiver_creator with the k8s_observer. Something like:

receivers:
  receiver_creator:
    watch_observers: [ k8s_observer ]
    receivers:
      filelog:
        rule: type == "pod" && labels["otel.logs.exporter"] == "otlp"
        config:
          ...

Note: I'm trying to make it work but without success yet; I'm posting the question already just to verify whether that approach would be a no-go for any reason.
@ChrsMark I think that is an option for handling the multi-tenancy solution, but I don't think it addresses these issues:
Maybe I miss sth here, but won't a

labelSelectors:
  - component=kube-apiserver

setting achieve the same thing? I'm just trying to understand what extra value the dedicated receiver would add. Let me know what you think @TylerHelmuth. I'm also cc-ing @h0cheung, who works on #24439.

I'm also sharing my working example for reference:

daemonset.yaml

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
name: daemonset
spec:
mode: daemonset
serviceAccount:
hostNetwork: true
volumeMounts:
- name: varlogpods
mountPath: /var/log/pods
readOnly: true
volumes:
- name: varlogpods
hostPath:
path: /var/log/pods
config: |
exporters:
debug: {}
logging: {}
otlp/elastic:
compression: none
endpoint: http://fleet-server:8200
tls:
insecure: true
insecure_skip_verify: true
extensions:
k8s_observer:
auth_type: serviceAccount
node: ${env:K8S_NODE_NAME}
observe_pods: true
health_check: {}
memory_ballast:
size_in_percentage: 40
processors:
batch: {}
resource/k8s:
attributes:
- key: service.name
from_attribute: app.label.component
action: insert
k8sattributes:
extract:
metadata:
- k8s.namespace.name
- k8s.deployment.name
- k8s.statefulset.name
- k8s.daemonset.name
- k8s.cronjob.name
- k8s.job.name
- k8s.node.name
- k8s.pod.name
- k8s.pod.uid
- k8s.pod.start_time
- container.id
labels:
- tag_name: app.label.component
key: app.kubernetes.io/component
from: pod
- tag_name: logs.exporter
key: otel.logs.exporter
from: pod
filter:
node_from_env_var: K8S_NODE_NAME
passthrough: false
pod_association:
- sources:
- from: resource_attribute
name: k8s.pod.ip
- sources:
- from: resource_attribute
name: k8s.pod.uid
- sources:
- from: connection
memory_limiter:
check_interval: 5s
limit_percentage: 80
spike_limit_percentage: 25
receivers:
receiver_creator:
watch_observers: [ k8s_observer ]
receivers:
filelog:
rule: type == "pod" && labels["otel.logs.exporter"] == "otlp"
config:
exclude:
- /var/log/pods/default_daemonset-opentelemetry-collector*_*/opentelemetry-collector/*.log
include:
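# `namespace` and `name` in backticks below are expanded by receiver_creator using the discovered pod's metadata from the k8s_observer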
- /var/log/pods/`namespace`_`name`*/*/*.log
include_file_name: false
include_file_path: true
operators:
- id: get-format
routes:
- expr: body matches "^{"
output: parser-docker
- expr: body matches '^[^ Z]+ '
output: parser-crio
- expr: body matches '^[^ Z]+Z'
output: parser-containerd
type: router
- id: parser-crio
regex: ^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
timestamp:
layout: 2006-01-02T15:04:05.999999999Z07:00
layout_type: gotime
parse_from: attributes.time
type: regex_parser
- combine_field: attributes.log
combine_with: ""
id: crio-recombine
is_last_entry: attributes.logtag == 'F'
max_log_size: 102400
output: json_parser
source_identifier: attributes['log.file.path']
type: recombine
- id: parser-containerd
regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
timestamp:
layout: '%Y-%m-%dT%H:%M:%S.%LZ'
parse_from: attributes.time
type: regex_parser
- combine_field: attributes.log
combine_with: ""
id: containerd-recombine
is_last_entry: attributes.logtag == 'F'
max_log_size: 102400
output: json_parser
source_identifier: attributes["log.file.path"]
type: recombine
- id: parser-docker
output: json_parser
timestamp:
layout: '%Y-%m-%dT%H:%M:%S.%LZ'
parse_from: attributes.time
type: json_parser
- type: json_parser
if: 'body matches "^{.*}$"'
severity:
parse_from: attributes.level
start_at: end
service:
extensions:
- health_check
- k8s_observer
pipelines:
logs:
exporters:
- otlp/elastic
processors:
- k8sattributes
- batch
- resource/k8s
receivers:
- receiver_creator
I think it would need to be configured multiple times as well, but it would hide all the big, long, complex fileconsumer config behind the scenes. Even with the receivercreator I believe the config in the description would need to be duplicated.
Oh, I didn't realize it could do that, that's pretty cool. Yes, I think that's equivalent. I expect there'd end up being some other k8s-specific options in a k8slogreceiver, but I'm curious if there would be any more overlap. I think obfuscating the complexity of the fileconsumer configuration is the primary benefit of the k8slogreceiver. It lets users focus on k8s stuff and not worry about how the receiver gets the logs, trusting that it knows how to take advantage of standard k8s formatting and expectations.
Agree @TylerHelmuth, I think a "wrapper" over the current filelog receiver that is k8s-specific and hides the details would benefit users.
If in the end we're just looking for a way to gloss over the complexity of the filelog receiver, I think we should be looking to solve it with a "template". See open-telemetry/opentelemetry-collector#8372. That said, I believe #24439 was intended to add additional functionality which is not possible with the filelog receiver.
@djaglowski ya we're looking for both things:
It may be possible to use a combination of templates and the receiver creator to achieve this, but I'm not sure if that would come together in a solution that is simpler than a targeted receiver. |
That would be great. I guess we would need to "hide" (make it an implementation detail) the operator part that handles the per-runtime log formats. At the moment the routing, as well as the special handling of the logs per runtime, looks weird/scary to someone who is not fully familiar with the Collector's features. Also, I wonder if this functionality is somehow tested. Are there any tests that ensure the Collector can handle each of these runtime formats?
In addition, any specific configuration details should be well documented. We can wait and cover all of those as part of the new receiver.
#23339 (comment) is still the right approach in my opinion. Basically a dedicated receiver which shares much of the same code but can additionally add k8s-specific features as needed. If there's agreement there, we might consider consolidating this issue with #23339.
I'm removing the filelog receiver label since there does not appear to be anything actionable in relation to that receiver (though there may be changes to shared packages).
@djaglowski agreed
Closing in favor of #23339. Please continue the conversation over there.
Component(s)
receiver/filelog
Describe the issue you're reporting
Problem Statement
The Collector's solution for collecting logs from Kubernetes is the Filelog Receiver, and it can handle collection of Kubernetes logs in most scenarios. But the Filelog Receiver was created to be a generic solution, so it does not take advantage of useful Kubernetes assumptions out of the box.
At the moment, to collect logs with the Filelog receiver, the recommended configuration is:
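Roughly, it is the filelog portion of the daemonset example above; a heavily condensed sketch for orientation (the operator chain is trimmed to its outline):

filelog:
  include:
    - /var/log/pods/*/*/*.log
  exclude:
    # keep the collector from ingesting its own logs; the release name here is illustrative
    - /var/log/pods/*opentelemetry-collector*_*/*/*.log
  include_file_path: true
  start_at: end
  operators:
    # route on the container runtime's line format (docker JSON, cri-o, containerd),
    # regex/JSON-parse each format, recombine partial lines, then parse timestamp and severity
    - type: router
      ...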
To a new user that is a lot of scary configuration that takes time to comprehend, and which they probably don't want to comprehend, yet it has to live in their configuration. In the Collector Helm chart we hide this complexity behind a preset, but it can't handle all situations.
Here are a couple experiences I'd like to improve:
- Collecting only a subset of the logs on a node requires knowing how the log file paths are structured so you can modify the include section as needed.
- Sending different subsets of logs to different destinations requires configuring multiple receiver instances and duplicating most of this configuration.

Some of the solution might be in the Helm chart and some might be in the Filelog receiver itself. It is also possible this spawns a new k8s-specific receiver that uses stanza behind the scenes.
Ultimately, I am looking to improve the "easy path" for most users on Kubernetes. I want to make it easier for users to collect logs for a specific subset of all the logs in the cluster, and to make it easier to configure multiple instances of the receiver to support different destinations. Packaging up all the Kubernetes assumptions into something like:
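For illustration only, one hypothetical shape this could take inside the existing receiver (the kubernetes block and its selector options are made-up names, not existing configuration):

filelog:
  kubernetes:
    # hypothetical convenience mode: path globs, runtime-format parsing, and
    # timestamp/severity handling would be applied automatically
    namespaces: [payments]            # made-up selector
    label_selector: app=checkout      # made-up selector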
or
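equally hypothetically, a dedicated k8s-specific receiver (the name and options below are placeholders, not an existing component):

kubernetes_logs:
  node_from_env_var: K8S_NODE_NAME    # only tail pods scheduled on this node
  include:
    namespaces: [payments]            # made-up selector
  exclude:
    pod_labels:
      otel.logs.exporter: otlp        # skip pods that already export logs via the SDK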
would be great.