Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add golang auto-instrumentation #973

Closed

Conversation

pavolloffay
Copy link
Member

@pavolloffay pavolloffay commented Jul 12, 2022

Signed-off-by: Pavol Loffay [email protected]

This PR uses https://github.com/keyval-dev/opentelemetry-go-instrumentation to instrument golang applications. It instruments the app via eBPF and by .spec.shareProcessNamespace on the pod level https://kubernetes.io/docs/tasks/configure-pod-container/share-process-namespace/. It does not use init-container like other instrumentations, but it injects a sidecar container that performs the instrumentation.

Two annotations are added:

  • instrumentation.opentelemetry.io/inject-golang - to enable the instrumentation like for other langs
  • instrumentation.opentelemetry.io/golang-target-exec - to specify path to the app binary. @edeNFed I wonder if we can get rid of this annotation and get the path by scanning the processes. I also wonder if the instrumentation utility should consider multiple processes.

Resolves open-telemetry/community#908

Before this is merged I would prefer that https://github.com/keyval-dev/opentelemetry-go-instrumentation is part of the OTEL community.

Signed-off-by: Pavol Loffay <[email protected]>
@pavolloffay
Copy link
Member Author

Example of injection

apiVersion: v1
kind: Pod
metadata:
  annotations:
    instrumentation.opentelemetry.io/golang-target-exec: /usr/local/bin/emojivoto-voting-svc
    instrumentation.opentelemetry.io/inject-golang: "true"
  creationTimestamp: "2022-07-12T15:53:05Z"
  generateName: voting-7c9bdbb749-
  labels:
    app: voting-svc
    pod-template-hash: 7c9bdbb749
    version: v11
  name: voting-7c9bdbb749-46lpw
  namespace: emojivoto
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: voting-7c9bdbb749
    uid: 62894888-04b3-4ef6-b1ae-274ad9d6cf1a
  resourceVersion: "8975"
  uid: a85f1e35-8282-4c50-a914-36886081daf6
spec:
  containers:
  - env:
    - name: GRPC_PORT
      value: "8080"
    - name: PROM_PORT
      value: "8801"
    image: docker.l5d.io/buoyantio/emojivoto-voting-svc:v11
    imagePullPolicy: IfNotPresent
    name: voting-svc
    ports:
    - containerPort: 8080
      name: grpc
      protocol: TCP
    - containerPort: 8801
      name: prom
      protocol: TCP
    resources:
      requests:
        cpu: 100m
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-sbvqh
      readOnly: true
  - env:
    - name: OTEL_TARGET_EXE
      value: /usr/local/bin/emojivoto-voting-svc
    - name: OTEL_SERVICE_NAME
      value: voting
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: simplest-collector.default.svc:4317
    - name: OTEL_RESOURCE_ATTRIBUTES_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: OTEL_RESOURCE_ATTRIBUTES_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    - name: OTEL_PROPAGATORS
      value: tracecontext,baggage,b3
    - name: OTEL_TRACES_SAMPLER
      value: parentbased_traceidratio
    - name: OTEL_TRACES_SAMPLER_ARG
      value: "1"
    - name: OTEL_RESOURCE_ATTRIBUTES
      value: k8s.container.name=opentelemetry-auto-instrumentation,k8s.deployment.name=voting,k8s.namespace.name=emojivoto,k8s.node.name=$(OTEL_RESOURCE_ATTRIBUTES_NODE_NAME),k8s.pod.name=$(OTEL_RESOURCE_ATTRIBUTES_POD_NAME),k8s.replicaset.name=voting-7c9bdbb749
    image: keyval/otel-go-agent:v0.5.3
    imagePullPolicy: IfNotPresent
    name: opentelemetry-auto-instrumentation
    resources: {}
    securityContext:
      capabilities:
        add:
        - SYS_PTRACE
      privileged: true
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /sys/kernel/debug
      name: kernel-debug
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-sbvqh
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: kind-control-plane
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: voting
  serviceAccountName: voting
  shareProcessNamespace: true
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-sbvqh
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
  - hostPath:
      path: /sys/kernel/debug
      type: ""
    name: kernel-debug

@edeNFed
Copy link

edeNFed commented Jul 12, 2022

This is awesome!
I opened this issue for process discovery, hope to finish it in the next few days.
You can track the donation process to the OpenTelemetry project here

Copy link
Member

@frzifus frzifus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @pavolloffay @edeNFed !

return pod
}
zero := int64(0)
truee := bool(true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
truee := bool(true)
truee := true

i think thats enough, but what do you think about renaming it to something like shareProcessNSEnabled?

if pod.Spec.ShareProcessNamespace != nil && *pod.Spec.ShareProcessNamespace == false {
return pod
}
execPath, ok := pod.Annotations[annotationGolangExecPath]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we get this information from the registered v1.Container command itself? That might limit the number of containers to 1, but should be fine for now. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the command in the container spec can be empty if I am not mistaken - e.g. reply on default from the container image.

Even inspecting the container image CMD or entrypoint is not sufficient. Many containers run scripts before executing the final process.

@@ -297,7 +297,7 @@ spec:
- args:
- --metrics-addr=127.0.0.1:8080
- --enable-leader-election
image: ghcr.io/open-telemetry/opentelemetry-operator/opentelemetry-operator:0.55.0
image: docker.io/pavolloffay/opentelemetry-operator:300003
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a leftover from testing?

@TylerHelmuth
Copy link
Member

@pavolloffay @edeNFed @frzifus now that donation is complete is this PR ready to be discussed?

@pavolloffay
Copy link
Member Author

Let's move the discussion to #908

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants