
k8sattributes it`s not working in EKS 1.26 #22036

Closed
AndriySidliarskiy opened this issue May 17, 2023 · 53 comments
Labels
bug (Something isn't working) · closed as inactive · processor/k8sattributes (k8s Attributes processor) · Stale

Comments

@AndriySidliarskiy

Component(s)

processor/k8sattributes

What happened?

Description

I have an OpenTelemetry Collector configuration with the k8sattributes processor, but I cannot see any of the metadata configured in k8sattributes in the log context.

Steps to Reproduce

Configure the k8sattributes processor in EKS 1.26 with the OpenTelemetry Collector Helm chart.

Expected Result

   - k8s.pod.name
   - k8s.pod.uid
   - k8s.deployment.name
   - k8s.namespace.name

Actual Result

nothing

Collector version

0.77.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

k8sattributes:
      auth_type: "serviceAccount"
      passthrough: false
      filter:
        node_from_env_var: KUBE_NODE_NAME     
      extract:
        metadata:
          - k8s.pod.name
          - k8s.pod.uid
          - k8s.deployment.name
          - k8s.namespace.name
        labels:
          - tag_name: c2i.pipeline.execution
            key: c2i.pipeline.execution
            from: pod
          - tag_name: c2i.pipeline.project
            key: c2i.pipeline.project
            from: pod
        annotations:
          - tag_name: monitoring # extracts value of annotation from pods with key `annotation-one` and inserts it as a tag with key `a1`
            key: monitoring
            from: pod
    pipelines:
      logs/eks:
        exporters:
          - coralogix
        processors:
          # - batch
          # - resourcedetection/env
          - k8sattributes
        receivers:
          - k8s_events
          - filelog
          - filelog/2

Log output

No response

Additional context

No response

AndriySidliarskiy added the bug and needs triage labels on May 17, 2023
github-actions bot added the processor/k8sattributes label on May 17, 2023
@github-actions
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@atoulme
Contributor

atoulme commented May 18, 2023

Anything in the collector logs?

@AndriySidliarskiy
Author

Nothing. k8sattributes starts up, but there is nothing about it in the logs.

@swiatekm
Contributor

Did this configuration work with a previous collector or EKS version?

@AndriySidliarskiy
Author

I don't know, I have only used this version, but in my opinion it may not work in EKS in general. I used your example and it's also not working, because you extract the pod name and namespace from the file path, so k8sattributes does nothing.

@AndriySidliarskiy
Author

    receivers:
      filelog:
        include:
          - /var/log/pods/*/*/*.log
        exclude:
          # Exclude logs from all containers named otel-collector
          - /var/log/pods/*/otel-collector/*.log
        start_at: beginning
        include_file_path: true
        include_file_name: false
        operators:
          # Find out which format is used by kubernetes
          - type: router
            id: get-format
            routes:
              - output: parser-docker
                expr: 'body matches "^\\{"'
              - output: parser-crio
                expr: 'body matches "^[^ Z]+ "'
              - output: parser-containerd
                expr: 'body matches "^[^ Z]+Z"'
          # Parse CRI-O format
          - type: regex_parser
            id: parser-crio
            regex: '^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$'
            output: extract_metadata_from_filepath
            timestamp:
              parse_from: attributes.time
              layout_type: gotime
              layout: '2006-01-02T15:04:05.999999999Z07:00'
          # Parse CRI-Containerd format
          - type: regex_parser
            id: parser-containerd
            regex: '^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$'
            output: extract_metadata_from_filepath
            timestamp:
              parse_from: attributes.time
              layout: '%Y-%m-%dT%H:%M:%S.%LZ'
          # Parse Docker format
          - type: json_parser
            id: parser-docker
            output: extract_metadata_from_filepath
            timestamp:
              parse_from: attributes.time
              layout: '%Y-%m-%dT%H:%M:%S.%LZ'
          - type: move
            from: attributes.log
            to: body
          # Extract metadata from file path
          - type: regex_parser
            id: extract_metadata_from_filepath
            regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]{36})\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
            parse_from: attributes["log.file.path"]
            cache:
              size: 128  # default maximum amount of Pods per Node is 110
          # Rename attributes
          - type: move
            from: attributes.stream
            to: attributes["log.iostream"]
          - type: move
            from: attributes.container_name
            to: resource["k8s.container.name"]
          - type: move
            from: attributes.namespace
            to: resource["k8s.namespace.name"]
          - type: move
            from: attributes.pod_name
            to: resource["k8s.pod.name"]
          - type: move
            from: attributes.restart_count
            to: resource["k8s.container.restart_count"]
          - type: move
            from: attributes.uid
            to: resource["k8s.pod.uid"]

    processors:
      # k8sattributes processor to get the metadata from K8s
      k8sattributes:
        auth_type: "serviceAccount"
        passthrough: false
        extract:
          metadata:
            - k8s.pod.name
            - k8s.pod.uid
            - k8s.deployment.name
            - k8s.cluster.name
            - k8s.namespace.name
            - k8s.node.name
            - k8s.pod.start_time
          # Pod labels which can be fetched via K8sattributeprocessor
          labels:
            - tag_name: key1
              key: label1
              from: pod
            - tag_name: key2
              key: label2
              from: pod
        # Pod association using resource attributes and connection
        pod_association:
          - from: resource_attribute
            name: k8s.pod.uid
          - from: resource_attribute
            name: k8s.pod.ip
          - from: connection

    exporters:
      logging:
        loglevel: debug
    service:
      pipelines:
        logs:
          receivers: [filelog]
          processors: [k8sattributes]
          exporters: [logging]

@AndriySidliarskiy
Author

You can remove the "move" operators from the filelog receiver, leave only k8sattributes, and you will see that k8s.pod.name (or any other attribute) is not present in the log context.

@swiatekm
Contributor

swiatekm commented May 18, 2023

By default, the k8sattributes processor identifies the Pod by looking at the IP address of the connection that sent the data. This works if the data is sent directly from instrumentation, but if you want to use it in a different context (for example, a DaemonSet collecting logs), you need to tell the processor how to identify the Pod for a given resource.

For the configuration you posted, you get k8s.pod.name from the file path. You also need to tell the processor to use that:

pod_association:
  - sources:
      - from: resource_attribute
        name: k8s.pod.name
      - from: resource_attribute
        name: k8s.namespace.name

I think your current config is missing the - sources part. If it wasn't, k8s.pod.uid should work as well.
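
For illustration, here is a minimal sketch of how the two sections fit together for this setup (assuming the filelog receiver has already placed k8s.pod.name and k8s.namespace.name on the resource, as in the example above):

k8sattributes:
  auth_type: "serviceAccount"
  filter:
    node_from_env_var: KUBE_NODE_NAME
  extract:
    metadata:                # attributes the processor adds once the Pod is found
      - k8s.deployment.name
      - k8s.pod.start_time
  pod_association:           # attributes used to look up the Pod; they must already exist on the resource
    - sources:
        - from: resource_attribute
          name: k8s.pod.name
        - from: resource_attribute
          name: k8s.namespace.name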

@AndriySidliarskiy
Author

Okay, but how should I configure this for pod labels? In the example it only adds the labels section and everything works, but in my case it's not working.

@swiatekm
Contributor

swiatekm commented May 18, 2023

I'm not sure I follow what exactly you're seeing at this point. Can you post:

  • your current configuration
  • the DaemonSet definition
  • what the exported data looks like

@AndriySidliarskiy
Author

global:
  domain: "coralogix.com"

mode: daemonset
hostNetwork: false
fullnameOverride: otel-coralogix
clusterRole:
  create: true
  name: "otel-coralogix"
  rules:
    - apiGroups:
      - "*"
      resources:
      - events
      - namespaces
      - namespaces/status
      - nodes
      - nodes/spec
      - pods
      - pods/status
      - replicationcontrollers
      - replicationcontrollers/status
      - resourcequotas
      - services
      - endpoints
      - nodes/proxy
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - apps
      resources:
      - daemonsets
      - deployments
      - replicasets
      - statefulsets
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - extensions
      resources:
      - daemonsets
      - deployments
      - replicasets
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - batch
      resources:
      - jobs
      - cronjobs
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - "*"
      resources:
      - horizontalpodautoscalers
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - "*"
      resources:
      - nodes/stats
      - configmaps
      - events
      - leases
      verbs:
      - get
      - list
      - watch
      - create
      - update
  clusterRoleBinding:
    name: "otel-coralogix"
presets:
  logsCollection:
    enabled: false
    storeCheckpoints: true
  # kubernetesAttributes:
  #   enabled: true
  # hostMetrics:
  #   enabled: true
  # kubeletMetrics:
  #   enabled: true

extraEnvs:
- name: CORALOGIX_PRIVATE_KEY
  value: 
- name: KUBE_NODE_NAME
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: spec.nodeName
- name: HOST_IP
  valueFrom:
    fieldRef:
      fieldPath: status.hostIP
- name: HOST_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
- name: K8S_NAMESPACE
  valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
- name: CLUSTER_NAME
  value: ""
config:
  extensions:
    zpages:
      endpoint: localhost:55679
    pprof:
      endpoint: localhost:1777
  exporters:
    coralogix:
      timeout: "1m"
      private_key: "${CORALOGIX_PRIVATE_KEY}"
      domain: "{{ .Values.global.domain }}"
      application_name_attributes:
      - "cloud.account.id"
      application_name: "{{.Values.global.defaultApplicationName }}"
      subsystem_name: "{{.Values.global.defaultSubsystemName }}"
  processors:
    # transform:
    #   error_mode: ignore
    #   log_statements:
    #     - context: resource
    #       statements:
    #         - keep_keys(attributes, ["cloud.region", "host.id", "host.type"])
    #     - context: log
    #       statements:
    #         - keep_keys(attributes, ["time", "log.file.name", "k8s.cluster.name", "k8s.namespace.name", "k8s.pod.name", "log.file.path", "k8s.ident", "message", "log", "k8s.pod.restart_count", "k8s.job.name"])
    transform/cw:
      error_mode: ignore
      log_statements:
        - context: resource
          statements:
            - keep_keys(attributes, ["cloud.region", "cloudwatch.log.group.name", "cloudwatch.log.stream"])
    k8sattributes:
      auth_type: "serviceAccount"
      passthrough: false
      filter:
        node_from_env_var: KUBE_NODE_NAME     
      extract:
        metadata:
          - k8s.pod.start_time
          - k8s.deployment.name
        labels:
          - tag_name: pipeline.execution
            key: pipeline.execution
            from: pod
          - tag_name: pipeline.project
            key: pipeline.project
            from: pod
        annotations:
          - tag_name: monitoring # extracts value of annotation from pods with key `annotation-one` and inserts it as a tag with key `a1`
            key: monitoring
            from: pod
      pod_association:
        - sources:
            - from: resource_attribute
              name: k8s.pod.start_time
            - from: resource_attribute
              name: k8s.deployment.name
    memory_limiter: null # Will get the k8s resource limits
    resourcedetection/env:
      detectors: ["env", "ec2"]
      timeout: 2s
      override: false
    spanmetrics:
      metrics_exporter: coralogix
      dimensions:
        - name: "k8s.pod.name"
        - name: "k8s.cronjob.name"
        - name: "k8s.job.name"
        - name: "k8s.node.name"
        - name: "k8s.namespace.name" 
  receivers:
    awscloudwatch:
      region: eu-west-1
      logs:
        poll_interval: "30s"
        groups:
          autodiscover:
            limit: 100
            prefix: /aws/vendedlogs/states/
            streams:
              prefixes: [states/]
    filelog:
      include: [/var/log/pods/*/*/*.log]
      include_file_name: false
      include_file_path: true
      operators:
        - type: router
          id: get-format
          routes:
            - output: parser-docker
              expr: 'body matches "^\\{"'
            - output: parser-containerd
              expr: 'body matches "^[^ Z]+Z"'
        - type: regex_parser
          id: parser-containerd
          regex: '^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<message>.*)$'
          output: extract_metadata_from_filepath
          timestamp:
            parse_from: attributes.time
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
        - type: json_parser
          id: parser-docker
          output: extract_metadata_from_filepath
          timestamp:
            parse_from: attributes.time
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
        - type: regex_parser
          id: extract_metadata_from_filepath
          regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]{36})\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
          parse_from: attributes["log.file.path"]
        - type: move
          from: attributes.namespace
          to: attributes["k8s.namespace.name"]
        - type: move
          from: attributes.restart_count
          to: attributes["k8s.pod.restart_count"]
        - type: move
          from: attributes.message
          to: body
        - type: move
          from: attributes.pod_name
          to: attributes["k8s.pod.name"]
        - type: move
          from: attributes.container_name
          to: attributes["k8s.container.name"]
        - type: add
          field: attributes["k8s.cluster.name"]
          value: '${CLUSTER_NAME}'
    filelog/2:
      include: [ /var/log/messages, /var/log/dmesg, /var/log/secure]
      include_file_name: false
      include_file_path: true
      operators:
        - type: router
          id: get-format
          routes:
            - output: parser-containerd
              expr: 'body matches " .* containerd: .*"'
            - output: parser-kubelet
              expr: 'body matches " .* kubelet: .*"'
            - output: parser-syslog
              expr: 'body matches " .* systemd: .*"'
            - output: parser-dhclient
              expr: 'body matches ".* dhclient[4608]: .*"'
        - type: regex_parser
          id: parser-dhclient
          regex: '^(?P<time>[^ ]* {1,2}[^ ]* [^ ]*) (?P<host>[^ ]*) (?P<ident>[^:]*): (?P<message>.*)$'
          output: move
        - type: syslog_parser
          id: parser-syslog
          protocol: rfc3164
        - type: regex_parser
          id: parser-containerd
          regex: '^(?P<time>^[^ ]* {1,2}[^ ]* [^ ]*) (?P<host>[^ ]*) (?P<indent>[a-zA-z0-9_\/\.\-]*): time=\".+\" (?P<level>level=[a-zA-Z]+) (?P<msg>msg=".*")'
          output: move
        - type: regex_parser
          id: parser-kubelet
          regex: '^(?P<time>[^ ]* {1,2}[^ ]* [^ ]*) (?P<host>[^ ]*) (?P<ident>[a-zA-Z0-9_\/.\-]*): (?P<level>[A-Z][a-z]*[0-9]*) (?P<pid>[0-9]+) (?P<source>[^:]*): *(?P<message>.*)$'
          output: move
        - type: file_input
          id: parser-dmesg
          include:
            - /var/log/dmesg
            - /var/log/secure
        - type: move
          from: attributes.message
          to: body
        - type: move
          from: attributes.ident
          to: attributes["k8s.ident"]
        - type: add
          field: attributes["k8s.cluster.name"]
          value: '${CLUSTER_NAME}'
    k8s_events:
      auth_type: "serviceAccount"
      namespaces: [default, kube-system, general, monitoring]
    k8s_cluster:
      auth_type: "serviceAccount"
      allocatable_types_to_report: [cpu, memory, storage, ephemeral-storage]
      node_conditions_to_report: [Ready, MemoryPressure]
    otlp:
      protocols:
        grpc:
          endpoint: ${MY_POD_IP}:4317
        http:
          endpoint: ${MY_POD_IP}:4318
  service:
    extensions:
    - zpages
    - pprof
    - health_check
    - memory_ballast
    telemetry:
      metrics:
        address: ${MY_POD_IP}:8888
    pipelines:
      logs:
        exporters:
          - coralogix
        processors:
          - batch
          - resourcedetection/env
          - transform/cw
        receivers:
          - awscloudwatch
      logs/eks:
        exporters:
          - coralogix
        processors:
          # - batch
          # - resourcedetection/env
          - k8sattributes
        receivers:
          - k8s_events
          - filelog
          - filelog/2
      metrics:
        exporters:
          - coralogix
        processors:
          - memory_limiter
          - resourcedetection/env
          - batch
        receivers:
          - otlp
          - k8s_cluster
      traces:
        exporters:
          - coralogix
        processors:
          - memory_limiter
          - spanmetrics
          - batch
          - resourcedetection/env
        receivers:
          - otlp
          - zipkin
tolerations: 
  - operator: Exists

extraVolumes:
  - name: varlog
    hostPath:
      path: /var/log
      type: ''
  - name: rootfs
    hostPath:
      path: /
  - name: varlibdocker
    hostPath:
      path: /var/lib/docker
  - name: containerdsock
    hostPath:
      path: /run/containerd/containerd.sock
  - name: sys
    hostPath:
      path: /sys
  - name: devdisk
    hostPath:
      path: /dev/disk/
extraVolumeMounts:
  - name: varlog
    readOnly: true
    mountPath: /var/log
  - name: rootfs
    mountPath: /rootfs
    readOnly: true
  - name: containerdsock
    mountPath: /run/containerd/containerd.sock
    readOnly: true
  - name: sys
    mountPath: /sys
    readOnly: true
  - name: devdisk
    mountPath: /dev/disk
    readOnly: true
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 1
    memory: 2G

ports:
  metrics:
    enabled: true
# podMonitor:
#   enabled: true
# prometheusRule:
#   enabled: true
#   defaultRules:
#     enabled: true

This is the current configuration.

resource.attributes.cx.application.name 
resource.attributes.cx.subsystem.name: 
resource.droppedAttributesCount:0
scope.name:
scope.version:
logRecord.body:time="2023-05-18T12:45:21.789Z" level=info duration="106.653µs" method=GET path=index.html size=473 status=0
logRecord.severityNumber:0
logRecord.attributes.k8s.cluster.name:
logRecord.attributes.k8s.container.name:argo-server
logRecord.attributes.k8s.namespace.name:argowf
logRecord.attributes.k8s.pod.name:argo-workflow-argo-workflows-server-544c8467d8-hrmc2
logRecord.attributes.k8s.pod.restart_count:0
logRecord.attributes.log.file.path:/var/log/pods/argowf_argo-workflow-argo-workflows-server-544c8467d8-hrmc2_5226c03f-ac02-4b0b-8246-03693780f345/argo-server/0.log
logRecord.attributes.logtag:F
logRecord.attributes.stream:stderr
logRecord.attributes.time:2023-05-18T12:45:21.789936237Z
logRecord.attributes.uid:5226c03f-ac02-4b0b-8246-03693780f345
18/05/2023 15:45:19.561 pm

As you can see, I added k8s.deployment.name in k8sattributes, but there is nothing in the log context. @swiatekm-sumo

@swiatekm
Contributor

This:

pod_association:
  - sources:
      - from: resource_attribute
        name: k8s.pod.start_time
      - from: resource_attribute
        name: k8s.deployment.name

should instead be:

pod_association:
  - sources:
      - from: resource_attribute
        name: k8s.pod.name
      - from: resource_attribute
        name: k8s.namespace.name

To be clear, these sources shouldn't be the same as the attributes you have specified under extract.metadata.

As an aside, you should be careful about using "global" receivers like k8s_events in a DaemonSet context. You're going to get the same events out of every collector Pod, whereas you only want them once per cluster. The same is true of the cluster receiver, and probably of the AWS CloudWatch receiver.
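
One common way to handle that (a sketch only, assuming a second release of the same Helm chart run in deployment mode with a single replica; the coralogix exporter configuration is omitted and would mirror the DaemonSet's):

mode: deployment
replicaCount: 1
config:
  receivers:
    k8s_events:
      auth_type: "serviceAccount"
    k8s_cluster:
      auth_type: "serviceAccount"
  service:
    pipelines:
      logs:
        receivers: [k8s_events]
        exporters: [coralogix]
      metrics:
        receivers: [k8s_cluster]
        exporters: [coralogix]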

@AndriySidliarskiy
Author

What's the difference between the two pod_association blocks in my case? I need the deployment name and also the labels, but it's not extracting metadata and labels from the pod.

@AndriySidliarskiy
Author

I'm sending k8s.pod.start_time only as an example; it's not the right choice. The main problem is that k8sattributes is not working.

@swiatekm
Contributor

pod_association is for telling the processor how to identify your Pod. The attributes in that section need to already be present on the resource. extract.metadata is where you specify what new attributes you want added. Does that make sense?
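
To illustrate with the data from this thread (the deployment name and label values below are hypothetical): before the processor runs, the resource only carries what the filelog receiver extracted from the file path:

resource.attributes.k8s.namespace.name:argowf
resource.attributes.k8s.pod.name:argo-workflow-argo-workflows-server-7cdb9788bb-dmrxc

After a successful pod association, the attributes listed under extract are appended:

resource.attributes.k8s.namespace.name:argowf
resource.attributes.k8s.pod.name:argo-workflow-argo-workflows-server-7cdb9788bb-dmrxc
resource.attributes.k8s.deployment.name:argo-workflow-argo-workflows-server
resource.attributes.pipeline.execution:<pod label value, if the Pod has that label>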

@AndriySidliarskiy
Author

Yes, but maybe you know why k8sattributes cannot extract the deployment name or other attributes. Maybe it conflicts with another processor.

@AndriySidliarskiy
Author

Are there any updates on why the processor cannot extract the pod label? @swiatekm-sumo

@swiatekm
Contributor

I'm honestly a bit lost as to the current state of your setup @AndriySidliarskiy. Can you be clearer about:

  • What your configuration is
  • What metadata you do see as a result
  • What metadata you expect to see, but don't

@AndriySidliarskiy
Author

  1. k8sattributes it`s not working in EKS 1.26 #22036 (comment)

  2. resource.attributes.cx.application.name
    resource.attributes.cx.subsystem.name:
    resource.droppedAttributesCount:0
    scope.name:
    scope.version:
    logRecord.body:time="2023-05-18T12:45:21.789Z" level=info duration="106.653µs" method=GET path=index.html size=473 status=0
    logRecord.severityNumber:0
    logRecord.attributes.k8s.cluster.name:
    logRecord.attributes.k8s.container.name:argo-server
    logRecord.attributes.k8s.namespace.name:argowf
    logRecord.attributes.k8s.pod.name:argo-workflow-argo-workflows-server-544c8467d8-hrmc2
    logRecord.attributes.k8s.pod.restart_count:0
    logRecord.attributes.log.file.path:/var/log/pods/argowf_argo-workflow-argo-workflows-server-544c8467d8-hrmc2_5226c03f-ac02-4b0b-8246-03693780f345/argo-server/0.log
    logRecord.attributes.logtag:F
    logRecord.attributes.stream:stderr
    logRecord.attributes.time:2023-05-18T12:45:21.789936237Z
    logRecord.attributes.uid:5226c03f-ac02-4b0b-8246-03693780f345
    18/05/2023 15:45:19.561 pm

  3. pipeline.execution - string inside log context in coralogix log

@swiatekm-sumo

@swiatekm
Contributor

I see the problem now: you have the identifying information in record attributes instead of resource attributes. They need to be at the resource level. In your filelog receiver configuration, change:

        - type: move
          from: attributes.namespace
          to: attributes["k8s.namespace.name"]
        - type: move
          from: attributes.restart_count
          to: attributes["k8s.pod.restart_count"]

        - type: move
          from: attributes.pod_name
          to: attributes["k8s.pod.name"]
        - type: move
          from: attributes.container_name
          to: attributes["k8s.container.name"]

to:

        - type: move
          from: attributes.namespace
          to: resource["k8s.namespace.name"]
        - type: move
          from: attributes.restart_count
          to: resource["k8s.pod.restart_count"]

        - type: move
          from: attributes.pod_name
          to: resource["k8s.pod.name"]
        - type: move
          from: attributes.container_name
          to: resource["k8s.container.name"]

@AndriySidliarskiy
Author

AndriySidliarskiy commented May 24, 2023

@swiatekm-sumo but the main problem is extracting the pod labels, and for me this solution is not working. I cannot extract the metadata and put it into the log.

k8sattributes:
      auth_type: "serviceAccount"
      passthrough: false
      filter:
        node_from_env_var: KUBE_NODE_NAME     
      extract:
        metadata:
          - k8s.pod.start_time
          - k8s.deployment.name
        labels:
          - tag_name: pipeline.execution
            key: pipeline.execution
            from: pod
          - tag_name: pipeline.project
            key: pipeline.project
            from: pod
        annotations:
          - tag_name: monitoring # extracts value of annotation from pods with key `annotation-one` and inserts it as a tag with key `a1`
            key: monitoring
            from: pod

@swiatekm
Contributor

So you do see k8s.pod.start_time and k8s.deployment.name in your resource attributes, but not the tags from Pod labels?

@AndriySidliarskiy
Author

And after this #22036 (comment) I have k8s.pod.name inside the resource, but the log doesn't have the deployment name and labels that should be extracted by k8sattributes. @swiatekm-sumo

resource.attributes.cx.application.name:
resource.attributes.cx.subsystem.name:
resource.attributes.k8s.container.name:argo-server
resource.attributes.k8s.namespace.name:argowf
resource.attributes.k8s.pod.name:argo-workflow-argo-workflows-server-7cdb9788bb-dmrxc
resource.attributes.k8s.pod.restart_count:0
resource.droppedAttributesCount:0
scope.name:
scope.version:
logRecord.body:time="2023-05-24T09:51:00.040Z" level=info duration="109.532µs" method=GET path=index.html size=473 status=0
logRecord.severityNumber:0
logRecord.attributes.k8s.cluster.name:cs-dev-eks-cluster
logRecord.attributes.log.file.path:/var/log/pods/argowf_argo-workflow-argo-workflows-server-7cdb9788bb-dmrxc_9a03c9b3-9af8-4aa1-bed5-5ee773fc928a/argo-server/0.log
logRecord.attributes.logtag:F
logRecord.attributes.stream:stderr
logRecord.attributes.time:2023-05-24T09:51:00.041367531Z
logRecord.attributes.uid:9a03c9b3-9af8-4aa1-bed5-5ee773fc928a

@swiatekm
Contributor

Have you also implemented the changes from #22036 (comment)?

@AndriySidliarskiy
Author

Yes. In my opinion k8sattributes is not working; in the OpenTelemetry logs it starts up, but it cannot extract metadata.

@swiatekm
Contributor

Yes, I can see it's not working, I'm trying to figure out what's wrong with your configuration that's causing it. Can you post your current configuration again? If you're looking at collector logs, can you post those as well?

@AndriySidliarskiy
Author

@swiatekm-sumo

global:
  domain: "coralogix.com"
  defaultApplicationName: ""
  defaultSubsystemName: ""

mode: daemonset
hostNetwork: false
fullnameOverride: otel-coralogix
clusterRole:
  create: true
  name: "otel-coralogix"
  rules:
    - apiGroups:
      - "*"
      resources:
      - events
      - namespaces
      - namespaces/status
      - nodes
      - nodes/spec
      - pods
      - pods/status
      - replicationcontrollers
      - replicationcontrollers/status
      - resourcequotas
      - services
      - endpoints
      - nodes/proxy
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - apps
      resources:
      - daemonsets
      - deployments
      - replicasets
      - statefulsets
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - extensions
      resources:
      - daemonsets
      - deployments
      - replicasets
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - batch
      resources:
      - jobs
      - cronjobs
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - "*"
      resources:
      - horizontalpodautoscalers
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - "*"
      resources:
      - nodes/stats
      - configmaps
      - events
      - leases
      verbs:
      - get
      - list
      - watch
      - create
      - update
  clusterRoleBinding:
    name: "otel-coralogix"
presets:
  logsCollection:
    enabled: false
    storeCheckpoints: true
  # kubernetesAttributes:
  #   enabled: true
  # hostMetrics:
  #   enabled: true
  # kubeletMetrics:
  #   enabled: true

extraEnvs:
- name: CORALOGIX_PRIVATE_KEY
  value:
- name: KUBE_NODE_NAME
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: spec.nodeName
- name: HOST_IP
  valueFrom:
    fieldRef:
      fieldPath: status.hostIP
- name: HOST_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
- name: K8S_NAMESPACE
  valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
- name: CLUSTER_NAME
  value: ""
config:
  extensions:
    zpages:
      endpoint: localhost:55679
    pprof:
      endpoint: localhost:1777
  exporters:
    coralogix:
      timeout: "1m"
      private_key: "${CORALOGIX_PRIVATE_KEY}"
      domain: "{{ .Values.global.domain }}"
      application_name_attributes:
      - "cloud.account.id"
      application_name: "{{.Values.global.defaultApplicationName }}"
      subsystem_name: "{{.Values.global.defaultSubsystemName }}"
  processors:
    # transform:
    #   error_mode: ignore
    #   log_statements:
    #     - context: resource
    #       statements:
    #         - keep_keys(attributes, ["cloud.region", "host.id", "host.type"])
    #     - context: log
    #       statements:
    #         - keep_keys(attributes, ["time", "log.file.name", "k8s.cluster.name", "k8s.namespace.name", "k8s.pod.name", "log.file.path", "k8s.ident", "message", "log", "k8s.pod.restart_count", "k8s.job.name"])
    transform/cw:
      error_mode: ignore
      log_statements:
        - context: resource
          statements:
            - keep_keys(attributes, ["cloud.region", "cloudwatch.log.group.name", "cloudwatch.log.stream"])
    k8sattributes:
      auth_type: "serviceAccount"
      passthrough: false
      filter:
        node_from_env_var: KUBE_NODE_NAME     
      extract:
        metadata:
          - k8s.pod.start_time
          - k8s.pod.name
          - k8s.deployment.name
          - k8s.namespace.name
        labels:
          - tag_name: c2i.pipeline.execution
            key: c2i.pipeline.execution
            from: pod
          - tag_name: c2i.pipeline.project
            key: c2i.pipeline.project
            from: pod
        annotations:
          - tag_name: monitoring # extracts value of annotation from pods with key `annotation-one` and inserts it as a tag with key `a1`
            key: monitoring
            from: pod
      pod_association:
        - sources:
            - from: resource_attribute
              name: k8s.pod.name
            - from: resource_attribute
              name: k8s.deployment.name
            - from: resource_attribute
              name: c2i.pipeline.project
    memory_limiter: null # Will get the k8s resource limits
    resourcedetection/env:
      detectors: ["env", "ec2"]
      timeout: 2s
      override: false
    spanmetrics:
      metrics_exporter: coralogix
      dimensions:
        - name: "k8s.pod.name"
        - name: "k8s.cronjob.name"
        - name: "k8s.job.name"
        - name: "k8s.node.name"
        - name: "k8s.namespace.name" 
  receivers:
    awscloudwatch:
      region: eu-west-1
      logs:
        poll_interval: "30s"
        groups:
          autodiscover:
            limit: 100
            prefix: /aws/vendedlogs/states/
            streams:
              prefixes: [states/]
    filelog:
      include: [/var/log/pods/*/*/*.log]
      include_file_name: false
      include_file_path: true
      operators:
        - type: router
          id: get-format
          routes:
            - output: parser-docker
              expr: 'body matches "^\\{"'
            - output: parser-containerd
              expr: 'body matches "^[^ Z]+Z"'
        - type: regex_parser
          id: parser-containerd
          regex: '^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<message>.*)$'
          output: extract_metadata_from_filepath
          timestamp:
            parse_from: attributes.time
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
        - type: json_parser
          id: parser-docker
          output: extract_metadata_from_filepath
          timestamp:
            parse_from: attributes.time
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
        - type: regex_parser
          id: extract_metadata_from_filepath
          regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]{36})\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
          parse_from: attributes["log.file.path"]
        - type: move
          from: attributes.namespace
          to: resource["k8s.namespace.name"]
        - type: move
          from: attributes.restart_count
          to: resource["k8s.pod.restart_count"]
        - type: move
          from: attributes.message
          to: body
        - type: move
          from: attributes.pod_name
          to: resource["k8s.pod.name"]
        - type: move
          from: attributes.container_name
          to: resource["k8s.container.name"]
        - type: add
          field: attributes["k8s.cluster.name"]
          value: '${CLUSTER_NAME}'
    filelog/2:
      include: [ /var/log/messages, /var/log/dmesg, /var/log/secure]
      include_file_name: false
      include_file_path: true
      operators:
        - type: router
          id: get-format
          routes:
            - output: parser-containerd
              expr: 'body matches " .* containerd: .*"'
            - output: parser-kubelet
              expr: 'body matches " .* kubelet: .*"'
            - output: parser-syslog
              expr: 'body matches " .* systemd: .*"'
            - output: parser-dhclient
              expr: 'body matches ".* dhclient[4608]: .*"'
        - type: regex_parser
          id: parser-dhclient
          regex: '^(?P<time>[^ ]* {1,2}[^ ]* [^ ]*) (?P<host>[^ ]*) (?P<ident>[^:]*): (?P<message>.*)$'
          output: move
        - type: syslog_parser
          id: parser-syslog
          protocol: rfc3164
        - type: regex_parser
          id: parser-containerd
          regex: '^(?P<time>^[^ ]* {1,2}[^ ]* [^ ]*) (?P<host>[^ ]*) (?P<indent>[a-zA-z0-9_\/\.\-]*): time=\".+\" (?P<level>level=[a-zA-Z]+) (?P<msg>msg=".*")'
          output: move
        - type: regex_parser
          id: parser-kubelet
          regex: '^(?P<time>[^ ]* {1,2}[^ ]* [^ ]*) (?P<host>[^ ]*) (?P<ident>[a-zA-Z0-9_\/.\-]*): (?P<level>[A-Z][a-z]*[0-9]*) (?P<pid>[0-9]+) (?P<source>[^:]*): *(?P<message>.*)$'
          output: move
        - type: file_input
          id: parser-dmesg
          include:
            - /var/log/dmesg
            - /var/log/secure
        - type: move
          from: attributes.message
          to: body
        - type: move
          from: attributes.ident
          to: attributes["k8s.ident"]
        - type: add
          field: attributes["k8s.cluster.name"]
          value: '${CLUSTER_NAME}'
    # k8s_events:
    #   auth_type: "serviceAccount"
    #   namespaces: [default, kube-system, general, monitoring]
    k8s_cluster:
      auth_type: "serviceAccount"
      allocatable_types_to_report: [cpu, memory, storage, ephemeral-storage]
      node_conditions_to_report: [Ready, MemoryPressure]
    otlp:
      protocols:
        grpc:
          endpoint: ${MY_POD_IP}:4317
        http:
          endpoint: ${MY_POD_IP}:4318
  service:
    extensions:
    - zpages
    - pprof
    - health_check
    - memory_ballast
    telemetry:
      metrics:
        address: ${MY_POD_IP}:8888
    pipelines:
      logs:
        exporters:
          - coralogix
        processors:
          - batch
          - resourcedetection/env
          - transform/cw
        receivers:
          - awscloudwatch
      logs/eks:
        exporters:
          - coralogix
        processors:
          - batch
          - resourcedetection/env
          - k8sattributes
        receivers:
          # - k8s_events
          - filelog
          - filelog/2
      metrics:
        exporters:
          - coralogix
        processors:
          - memory_limiter
          - resourcedetection/env
          - batch
        receivers:
          - otlp
          - k8s_cluster
      traces:
        exporters:
          - coralogix
        processors:
          - memory_limiter
          - spanmetrics
          - batch
          - resourcedetection/env
        receivers:
          - otlp
          - zipkin
tolerations: 
  - operator: Exists

extraVolumes:
  - name: varlog
    hostPath:
      path: /var/log
      type: ''
  - name: rootfs
    hostPath:
      path: /
  - name: varlibdocker
    hostPath:
      path: /var/lib/docker
  - name: containerdsock
    hostPath:
      path: /run/containerd/containerd.sock
  - name: sys
    hostPath:
      path: /sys
  - name: devdisk
    hostPath:
      path: /dev/disk/
extraVolumeMounts:
  - name: varlog
    readOnly: true
    mountPath: /var/log
  - name: rootfs
    mountPath: /rootfs
    readOnly: true
  - name: containerdsock
    mountPath: /run/containerd/containerd.sock
    readOnly: true
  - name: sys
    mountPath: /sys
    readOnly: true
  - name: devdisk
    mountPath: /dev/disk
    readOnly: true
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 1
    memory: 2G

ports:
  metrics:
    enabled: true
# podMonitor:
#   enabled: true
# prometheusRule:
#   enabled: true
#   defaultRules:
#     enabled: true
2023-05-24T10:22:51.268Z	info	service/telemetry.go:113	Setting up own telemetry...
2023-05-24T10:22:51.269Z	info	service/telemetry.go:136	Serving Prometheus metrics	{"address": "10.125.37.140:8888", "level": "Basic"}
2023-05-24T10:22:51.269Z	info	processor/processor.go:300	Deprecated component. Will be removed in future releases.	{"kind": "processor", "name": "spanmetrics", "pipeline": "traces"}
2023-05-24T10:22:51.269Z	info	[email protected]/processor.go:139	Building spanmetrics	{"kind": "processor", "name": "spanmetrics", "pipeline": "traces"}
2023-05-24T10:22:51.272Z	info	[email protected]/memorylimiter.go:149	Using percentage memory limiter	{"kind": "processor", "name": "memory_limiter", "pipeline": "traces", "total_memory_mib": 1907, "limit_percentage": 80, "spike_limit_percentage": 25}
2023-05-24T10:22:51.272Z	info	[email protected]/memorylimiter.go:113	Memory limiter configured	{"kind": "processor", "name": "memory_limiter", "pipeline": "traces", "limit_mib": 1525, "spike_limit_mib": 476, "check_interval": 5}
2023-05-24T10:22:51.272Z	info	kube/client.go:101	k8s filtering	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "labelSelector": "", "fieldSelector": "spec.nodeName=ip-10-125-46-141.eu-west-1.compute.internal"}
2023-05-24T10:22:51.277Z	info	service/service.go:141	Starting otelcol-contrib...	{"Version": "0.77.0", "NumCPU": 96}
2023-05-24T10:22:51.277Z	info	extensions/extensions.go:41	Starting extensions...
2023-05-24T10:22:51.277Z	info	extensions/extensions.go:44	Extension is starting...	{"kind": "extension", "name": "zpages"}
2023-05-24T10:22:51.277Z	info	[email protected]/zpagesextension.go:64	Registered zPages span processor on tracer provider	{"kind": "extension", "name": "zpages"}
2023-05-24T10:22:51.277Z	info	[email protected]/zpagesextension.go:74	Registered Host's zPages	{"kind": "extension", "name": "zpages"}
2023-05-24T10:22:51.277Z	info	[email protected]/zpagesextension.go:86	Starting zPages extension	{"kind": "extension", "name": "zpages", "config": {"TCPAddr":{"Endpoint":"localhost:55679"}}}
2023-05-24T10:22:51.277Z	info	extensions/extensions.go:48	Extension started.	{"kind": "extension", "name": "zpages"}
2023-05-24T10:22:51.277Z	info	extensions/extensions.go:44	Extension is starting...	{"kind": "extension", "name": "pprof"}
2023-05-24T10:22:51.278Z	info	[email protected]/pprofextension.go:71	Starting net/http/pprof server	{"kind": "extension", "name": "pprof", "config": {"TCPAddr":{"Endpoint":"localhost:1777"},"BlockProfileFraction":0,"MutexProfileFraction":0,"SaveToFile":""}}
2023-05-24T10:22:51.278Z	info	extensions/extensions.go:48	Extension started.	{"kind": "extension", "name": "pprof"}
2023-05-24T10:22:51.278Z	info	extensions/extensions.go:44	Extension is starting...	{"kind": "extension", "name": "health_check"}
2023-05-24T10:22:51.278Z	info	[email protected]/healthcheckextension.go:45	Starting health_check extension	{"kind": "extension", "name": "health_check", "config": {"Endpoint":"0.0.0.0:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2023-05-24T10:22:51.278Z	warn	internal/warning.go:51	Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks	{"kind": "extension", "name": "health_check", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2023-05-24T10:22:51.278Z	info	extensions/extensions.go:48	Extension started.	{"kind": "extension", "name": "health_check"}
2023-05-24T10:22:51.278Z	info	extensions/extensions.go:44	Extension is starting...	{"kind": "extension", "name": "memory_ballast"}
2023-05-24T10:22:51.295Z	info	[email protected]/memory_ballast.go:52	Setting memory ballast	{"kind": "extension", "name": "memory_ballast", "MiBs": 762}
2023-05-24T10:22:51.297Z	info	extensions/extensions.go:48	Extension started.	{"kind": "extension", "name": "memory_ballast"}
2023-05-24T10:22:51.298Z	info	internal/resourcedetection.go:136	began detecting resource information	{"kind": "processor", "name": "resourcedetection/env", "pipeline": "traces"}
2023-05-24T10:22:53.298Z	info	internal/resourcedetection.go:150	detected resource information	{"kind": "processor", "name": "resourcedetection/env", "pipeline": "traces", "resource": {}}
2023-05-24T10:22:53.299Z	info	adapter/receiver.go:56	Starting stanza receiver	{"kind": "receiver", "name": "filelog/2", "data_type": "logs"}
2023-05-24T10:22:53.299Z	info	[email protected]/processor.go:182	Starting spanmetricsprocessor	{"kind": "processor", "name": "spanmetrics", "pipeline": "traces"}
2023-05-24T10:22:53.299Z	info	[email protected]/processor.go:202	Found exporter	{"kind": "processor", "name": "spanmetrics", "pipeline": "traces", "spanmetrics-exporter": "coralogix"}
2023-05-24T10:22:53.320Z	info	[email protected]/otlp.go:94	Starting GRPC server	{"kind": "receiver", "name": "otlp", "data_type": "traces", "endpoint": "10.125.37.140:4317"}
2023-05-24T10:22:53.320Z	info	[email protected]/otlp.go:112	Starting HTTP server	{"kind": "receiver", "name": "otlp", "data_type": "traces", "endpoint": "10.125.37.140:4318"}
2023-05-24T10:22:53.320Z	info	adapter/receiver.go:56	Starting stanza receiver	{"kind": "receiver", "name": "filelog", "data_type": "logs"}
2023-05-24T10:22:53.320Z	info	[email protected]/receiver.go:60	Starting shared informers and wait for initial cache sync.	{"kind": "receiver", "name": "k8s_cluster", "data_type": "metrics"}

@AndriySidliarskiy
Author

@swiatekm-sumo I added this for test purposes and it's also not working.

@swiatekm
Contributor

Also, this:

      pod_association:
        - sources:
            - from: resource_attribute
              name: k8s.deployment.name

should have k8s.namespace.name instead of k8s.deployment.name.

@swiatekm
Contributor

If that doesn't help, please enable debug logging by setting:

service:
  telemetry:
    logs:
      level: DEBUG

and post the collector logs you see. There will probably be a lot of them, so it would help if you only posted the logs from the k8sattributes processor.

@AndriySidliarskiy
Author

@swiatekm-sumo hi. New logs with DEBUG enabled:

2023-05-25T07:43:16.274Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:16.332Z	error	fileconsumer/reader.go:62	Failed to seek	{"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "path": "/var/log/pods/monitoring_job-remover-7c46f7d89d-ssqb8_7c5fb391-88f2-471f-9a79-333534cdb41f/job-remover/747.log", "error": "seek /var/log/pods/monitoring_job-remover-7c46f7d89d-ssqb8_7c5fb391-88f2-471f-9a79-333534cdb41f/job-remover/747.log: file already closed"}
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*Reader).ReadToEnd
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/[email protected]/fileconsumer/reader.go:62
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*Manager).consume.func1
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/[email protected]/fileconsumer/file.go:148
2023-05-25T07:43:16.474Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:16.499Z	debug	fileconsumer/reader.go:161	Problem closing reader	{"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "path": "/var/log/pods/monitoring_job-remover-7c46f7d89d-ssqb8_7c5fb391-88f2-471f-9a79-333534cdb41f/job-remover/747.log", "error": "close /var/log/pods/monitoring_job-remover-7c46f7d89d-ssqb8_7c5fb391-88f2-471f-9a79-333534cdb41f/job-remover/747.log: file already closed"}
2023-05-25T07:43:16.676Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:16.876Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:17.077Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:17.098Z	debug	fileconsumer/file.go:129	Consuming files	{"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer"}
2023-05-25T07:43:17.277Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:17.296Z	debug	fileconsumer/file.go:129	Consuming files	{"kind": "receiver", "name": "filelog/2", "data_type": "logs", "component": "fileconsumer"}
2023-05-25T07:43:17.478Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:17.478Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:17.498Z	error	fileconsumer/reader.go:62	Failed to seek	{"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "path": "/var/log/pods/monitoring_job-remover-7c46f7d89d-ssqb8_7c5fb391-88f2-471f-9a79-333534cdb41f/job-remover/748.log", "error": "seek /var/log/pods/monitoring_job-remover-7c46f7d89d-ssqb8_7c5fb391-88f2-471f-9a79-333534cdb41f/job-remover/748.log: file already closed"}
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*Reader).ReadToEnd
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/[email protected]/fileconsumer/reader.go:62
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*detectLostFiles).readLostFiles.func1
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/[email protected]/fileconsumer/roller_other.go:40
2023-05-25T07:43:17.499Z	debug	fileconsumer/reader.go:161	Problem closing reader	{"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "path": "/var/log/pods/monitoring_job-remover-7c46f7d89d-ssqb8_7c5fb391-88f2-471f-9a79-333534cdb41f/job-remover/747.log", "error": "close /var/log/pods/monitoring_job-remover-7c46f7d89d-ssqb8_7c5fb391-88f2-471f-9a79-333534cdb41f/job-remover/747.log: file already closed"}
2023-05-25T07:43:17.678Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:17.679Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:17.879Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:17.879Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:18.080Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-05-25T07:43:18.280Z	debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}

@swiatekm
Contributor

Thanks! That confirms my hypothesis that the problem lies in identifying the Pod for the given resource. These log lines:

debug	[email protected]/processor.go:102	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]

mean that we can't find the Pod identifier.

Can you confirm the following facts for me:

  1. When you look at the logs in Coralogix, you see the following attributes:

     resource.attributes.k8s.namespace.name:argowf
     resource.attributes.k8s.pod.name:argo-workflow-argo-workflows-server-7cdb9788bb-dmrxc

  2. The pod_association section in your k8sattributes processor looks like this:

      pod_association:
        - sources:
            - from: resource_attribute
              name: k8s.pod.name
            - from: resource_attribute
              name: k8s.namespace.name

@AndriySidliarskiy
Author

I'm using the filelog receiver to extract the namespace name and pod name from the file path, but we can test this with k8s.deployment.name. The configuration now looks like this:

    k8sattributes:
      auth_type: "serviceAccount"
      passthrough: false
      filter:
        node_from_env_var: KUBE_NODE_NAME     
      extract:
        metadata:
          - k8s.pod.name
          - k8s.deployment.name
          - k8s.namespace.name
        labels:
          - tag_name: c2i.pipeline.execution
            key: c2i.pipeline.execution
            from: pod
          - tag_name: c2i.pipeline.project
            key: c2i.pipeline.project
            from: pod
        annotations:
          - tag_name: monitoring # extracts value of annotation from pods with key `annotation-one` and inserts it as a tag with key `a1`
            key: monitoring
            from: pod

@AndriySidliarskiy
Author

AndriySidliarskiy commented May 25, 2023

@swiatekm-sumo but also, when I commented out the part that moves k8s.pod.name etc. from attributes to resource, I get the same errors.

    filelog:
      include: [/var/log/pods/*/*/*.log]
      include_file_name: false
      include_file_path: true
      operators:
        - type: router
          id: get-format
          routes:
            - output: parser-docker
              expr: 'body matches "^\\{"'
            - output: parser-containerd
              expr: 'body matches "^[^ Z]+Z"'
        - type: regex_parser
          id: parser-containerd
          regex: '^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<message>.*)$'
          output: extract_metadata_from_filepath
          timestamp:
            parse_from: attributes.time
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
        - type: json_parser
          id: parser-docker
          output: extract_metadata_from_filepath
          timestamp:
            parse_from: attributes.time
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
        - type: regex_parser
          id: extract_metadata_from_filepath
          regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]{36})\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
          parse_from: attributes["log.file.path"]
        # - type: move
        #   from: attributes.namespace
        #   to: resource["k8s.namespace.name"]
        # - type: move
        #   from: attributes.restart_count
        #   to: resource["k8s.pod.restart_count"]
        # - type: move
        #   from: attributes.message
        #   to: body
        # - type: move
        #   from: attributes.pod_name
        #   to: resource["k8s.pod.name"]
        # - type: move
        #   from: attributes.container_name
        #   to: resource["k8s.container.name"]

@AndriySidliarskiy
Author

And with this configuration I also get the same error:

    k8sattributes:
      auth_type: "serviceAccount"
      passthrough: false
      filter:
        node_from_env_var: KUBE_NODE_NAME     
      extract:
        metadata:
          - k8s.pod.name
          - k8s.deployment.name
          - k8s.namespace.name
        labels:
          - tag_name: c2i.pipeline.execution
            key: c2i.pipeline.execution
            from: pod
          - tag_name: c2i.pipeline.project
            key: c2i.pipeline.project
            from: pod
        annotations:
          - tag_name: monitoring # extracts value of annotation from pods with key `annotation-one` and inserts it as a tag with key `a1`
            key: monitoring
            from: pod
      pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.name
          - from: resource_attribute
            name: k8s.namespace.name

@swiatekm
Copy link
Contributor

That should work. At the very least, the k8sattributes processor should compute the right identifier. Even with the above config, do you see the same logs?

@AndriySidliarskiy
Copy link
Author

yes

@AndriySidliarskiy
Copy link
Author

Could you find time to test this in an EKS environment? @swiatekm-sumo

@swiatekm
Copy link
Contributor

I don't think this has anything to do with the specific K8s distribution in play, but I will test your specific config in a KinD cluster.

@AndriySidliarskiy
Copy link
Author

@swiatekm-sumo Thanks, but how long might the testing take?

@swiatekm
Copy link
Contributor

Just to be clear, I'm not going to be committing to any timelines here, any assistance offered in this issue is on a best-effort basis.

With that said, I tested the following configurations:

  k8sattributes:
    auth_type: "serviceAccount"
    passthrough: false
    filter:
      node_from_env_var: KUBE_NODE_NAME     
    extract:
      metadata:
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.deployment.name
        - k8s.namespace.name
      labels:
        - tag_name: c2i.pipeline.execution
          key: c2i.pipeline.execution
          from: pod
        - tag_name: c2i.pipeline.project
          key: c2i.pipeline.project
          from: pod
      annotations:
        - tag_name: monitoring # extracts value of annotation from pods with key `annotation-one` and inserts it as a tag with key `a1`
          key: monitoring
          from: pod
    pod_association:
    - sources:
        - from: resource_attribute
          name: k8s.pod.name
        - from: resource_attribute
          name: k8s.namespace.name
    - id: extract-metadata-from-filepath
      parse_from: attributes["log.file.path"]
      regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<run_id>\d+)\.log$
      type: regex_parser
    - from: attributes.container_name
      to: resource["k8s.container.name"]
      type: move
    - from: attributes.namespace
      to: resource["k8s.namespace.name"]
      type: move
    - from: attributes.pod_name
      to: resource["k8s.pod.name"]
      type: move

and this worked as expected:

otelcol 2023-05-25T10:25:41.168Z    debug    [email protected]/processor.go:113    evaluating pod identifier    {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/containers", "value": [{"Source":{"From":"resource_attribute","Name":"k8s.pod.name"},"Value":"collection-sumologic-otelcol-logs-collector-htbp7"},{"Source":{"From":"resource_attribute","Name":"k8s.namespace.name"},"Value":"sumologic"},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}

So there must be something in your actual configuration that doesn't match what you've posted here.

@AndriySidliarskiy
Copy link
Author

@swiatekm-sumo Where did you launch it? Was it EKS, AKS, local Kubernetes, or something else?

@AndriySidliarskiy
Copy link
Author

AndriySidliarskiy commented May 25, 2023

And could you please clarify how k8sattributes extracts data? Does it call an endpoint, or how does it work?

@AndriySidliarskiy
Copy link
Author

And could you please provide the full configuration that you use? Thanks a lot.

@swiatekm
Copy link
Contributor

@swiatekm-sumo Where did you launch it? Was it EKS, AKS, local Kubernetes, or something else?

In a local KiND cluster.

And could you please clarify how k8sattributes extracts data? Does it call an endpoint, or how does it work?

Are you asking how it gets metadata from the K8s apiserver? It establishes a Watch for the necessary resources (mostly Pods) and maintains a local cache of them via the standard client-go mechanism of informers.

For the issue you're experiencing, though, the problem isn't that metadata; it's that the processor can't tell which Pod your log records come from. That's what the "evaluating pod identifier" log lines mean.
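
To make that concrete, here's a minimal sketch of the association rules in play (the attribute names are the ones your filelog operators produce; everything else about your pipeline is assumed). If no pod_association rules are configured, the processor, as far as I can tell, falls back to the sender's connection IP address, which for file-based logs is the collector itself rather than the original pod, so the lookup comes back empty:

    k8sattributes:
      # ... rest of the processor config as above
      pod_association:
        - sources:
            # All sources in a single rule must resolve for a match;
            # these two are the resource attributes the filelog `move`
            # operators set from the log file path.
            - from: resource_attribute
              name: k8s.pod.name
            - from: resource_attribute
              name: k8s.namespace.name

If either attribute is missing from the resource, the "evaluating pod identifier" entry ends up with empty Source/Value pairs.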

@AndriySidliarskiy
Copy link
Author

Yeah, I understand, but it's interesting why the processor cannot identify the pod.

@swiatekm
Copy link
Contributor

And could you please provide the full configuration that you use? Thanks a lot.

Here's a stripped-down manifest where the Pod is identified correctly:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otelcol-logs-collector
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: otelcol-logs-collector
  template:
    metadata:
      labels:
        app.kubernetes.io/name: otelcol-logs-collector
    spec:
      securityContext:
        fsGroup: 0
        runAsGroup: 0
        runAsUser: 0
      containers:
      - args:
        - --config=/etc/otelcol/config.yaml
        image: "otel/opentelemetry-collector-contrib:0.77.0"
        name: otelcol
        volumeMounts:
        - mountPath: /etc/otelcol
          name: otelcol-config
        - mountPath: /var/log/pods
          name: varlogpods
          readOnly: true
        env:
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: config.yaml
            path: config.yaml
          name: otelcol-logs-collector
        name: otelcol-config
      - hostPath:
          path: /var/log/pods
          type: ""
        name: varlogpods
---
# Source: sumologic/templates/logs/collector/otelcol/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: otelcol-logs-collector
  labels:
    app: otelcol-logs-collector
data:
  config.yaml: |
    exporters:
      logging:
    processors:
      k8sattributes:
        auth_type: serviceAccount
        extract:
          annotations:
          - from: pod
            key: monitoring
            tag_name: monitoring
          labels:
          - from: pod
            key: c2i.pipeline.execution
            tag_name: c2i.pipeline.execution
          - from: pod
            key: c2i.pipeline.project
            tag_name: c2i.pipeline.project
          metadata:
          - k8s.pod.name
          - k8s.pod.uid
          - k8s.deployment.name
          - k8s.namespace.name
        filter:
          node_from_env_var: KUBE_NODE_NAME
        passthrough: false
        pod_association:
        - sources:
          - from: resource_attribute
            name: k8s.pod.name
          - from: resource_attribute
            name: k8s.namespace.name
    receivers:
      filelog/containers:
        include:
        - /var/log/pods/*/*/*.log
        include_file_name: false
        include_file_path: true
        operators:
        - id: parser-containerd
          output: merge-cri-lines
          parse_to: body
          regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*)( |)(?P<log>.*)$
          timestamp:
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
            parse_from: body.time
          type: regex_parser
        - combine_field: body.log
          combine_with: ""
          id: merge-cri-lines
          is_last_entry: body.logtag == "F"
          overwrite_with: newest
          source_identifier: attributes["log.file.path"]
          type: recombine
        - id: extract-metadata-from-filepath
          parse_from: attributes["log.file.path"]
          parse_to: attributes
          regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<run_id>\d+)\.log$
          type: regex_parser
        - from: attributes.container_name
          to: resource["k8s.container.name"]
          type: move
        - from: attributes.namespace
          to: resource["k8s.namespace.name"]
          type: move
        - from: attributes.pod_name
          to: resource["k8s.pod.name"]
          type: move
        - field: attributes.run_id
          type: remove
        - field: attributes.uid
          type: remove
        - field: attributes["log.file.path"]
          type: remove
        - from: body.log
          to: body
          type: move
    service:
      pipelines:
        logs/containers:
          exporters:
          - logging
          processors:
          - k8sattributes
          receivers:
          - filelog/containers
      telemetry:
        logs:
          level: debug
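
To illustrate what the extract-metadata-from-filepath operator above does, here's roughly what it parses out of a container log path (the namespace, pod name, and UID below are invented for the example):

    # hypothetical input
    #   attributes["log.file.path"]: /var/log/pods/argowf_my-app-7d9f6c5b4-abcde_0f1e2d3c-4b5a-6789-abcd-ef0123456789/app/0.log
    # attributes produced by the regex (before the move/remove operators run)
    namespace: argowf
    pod_name: my-app-7d9f6c5b4-abcde
    uid: 0f1e2d3c-4b5a-6789-abcd-ef0123456789
    container_name: app
    run_id: "0"
    # the move operators then turn namespace, pod_name and container_name into
    # the k8s.* resource attributes that pod_association matches on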

Note that k8sattributes doesn't add metadata here, as it doesn't have the required RBAC. But it does identify Pods correctly, which you can confirm in the debug logs.
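
For completeness, a rough sketch of the RBAC the processor would need to actually attach that metadata (the names, namespace, and service account are placeholders; adjust them to your deployment):

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: otelcol-k8sattributes
    rules:
      # pods and namespaces are needed for basic pod metadata
      - apiGroups: [""]
        resources: ["pods", "namespaces"]
        verbs: ["get", "watch", "list"]
      # replicasets are needed to resolve k8s.deployment.name
      - apiGroups: ["apps"]
        resources: ["replicasets"]
        verbs: ["get", "watch", "list"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: otelcol-k8sattributes
    subjects:
      - kind: ServiceAccount
        name: default        # placeholder: the service account the DaemonSet runs as
        namespace: default   # placeholder: the namespace it is deployed in
    roleRef:
      kind: ClusterRole
      name: otelcol-k8sattributes
      apiGroup: rbac.authorization.k8s.io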

@AndriySidliarskiy
Copy link
Author

So I tried another configuration and it works, but now I have this error for the metrics pipeline: [email protected]/processor.go:102 evaluating pod identifier {"kind": "processor", "name": "k8sattributes", "pipeline": "logs/eks", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}

@swiatekm
Copy link
Contributor

@AndriySidliarskiy was your original problem fixed, then? If you have a different one, please close this issue and open a new one, with more information pertaining to the new problem with metrics.

@atoulme atoulme removed the needs triage New item requiring triage label Jun 27, 2023
@github-actions
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Aug 28, 2023
@github-actions
Copy link
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 27, 2023
@omri-shilton
Copy link

This issue is still happening to me in EKS.
