OpenTelemetry webhook modifies orphaned pods in an invalid way #1514
Comments
We're running into a similar issue with Jobs when we auto-instrument them. We also tried manually removing the finalizer. Not sure what's happening here, but it seems like the webhook is preventing the Job Controller from removing the finalizer and finishing the Job.
Looks like the same issue(s) as #1541, #1417, and #1366. I'm having the same issue. One thing I noticed, which you can also see in #1541, is that the webhook attempts to re-inject variables that already exist; for example, k8s.pod.name is already present but gets added again. My guess is that the check for whether the pod was already injected is failing to match for some reason and thus injection runs again. What I don't understand, however, is what is trying to add these variables in the first place, and why?
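To illustrate, here is a purely hypothetical sketch (names and values are mine, not taken from an actual pod) of the kind of duplication a repeated injection could produce:

# Hypothetical container env entry (names and values are assumptions), showing
# the kind of duplication re-injection could produce: k8s.pod.name is already
# present in the value but gets appended again.
env:
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: k8s.pod.name=my-app-7d9f8,k8s.namespace.name=default,k8s.pod.name=my-app-7d9f8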
I feel like I am getting close to hitting this with the debugger using Telepresence. If it helps anyone else look into this, I made the following changes to hook into the webhook and was able to hit some breakpoints in it:

diff --git a/Makefile b/Makefile
index febd731..fad5ec6 100644
--- a/Makefile
+++ b/Makefile
@@ -13,6 +13,7 @@ AUTO_INSTRUMENTATION_DOTNET_VERSION ?= "$(shell grep -v '\#' versions.txt | grep
AUTO_INSTRUMENTATION_APACHE_HTTPD_VERSION ?= "$(shell grep -v '\#' versions.txt | grep autoinstrumentation-apache-httpd | awk -F= '{print $$2}')"
LD_FLAGS ?= "-X ${VERSION_PKG}.version=${VERSION} -X ${VERSION_PKG}.buildDate=${VERSION_DATE} -X ${VERSION_PKG}.otelCol=${OTELCOL_VERSION} -X ${VERSION_PKG}.targetAllocator=${TARGETALLOCATOR_VERSION} -X ${VERSION_PKG}.operatorOpAMPBridge=${OPERATOR_OPAMP_BRIDGE_VERSION} -X ${VERSION_PKG}.autoInstrumentationJava=${AUTO_INSTRUMENTATION_JAVA_VERSION} -X ${VERSION_PKG}.autoInstrumentationNodeJS=${AUTO_INSTRUMENTATION_NODEJS_VERSION} -X ${VERSION_PKG}.autoInstrumentationPython=${AUTO_INSTRUMENTATION_PYTHON_VERSION} -X ${VERSION_PKG}.autoInstrumentationDotNet=${AUTO_INSTRUMENTATION_DOTNET_VERSION}"
ARCH ?= $(shell go env GOARCH)
+SHELL := /bin/bash
# Image URL to use all building/pushing image targets
IMG_PREFIX ?= ghcr.io/${USER}/opentelemetry-operator
@@ -45,7 +46,7 @@ GOBIN=$(shell go env GOBIN)
endif
# by default, do not run the manager with webhooks enabled. This only affects local runs, not the build or in-cluster deployments.
-ENABLE_WEBHOOKS ?= false
+ENABLE_WEBHOOKS ?= true
@@ -101,10 +102,24 @@ test: generate fmt vet ensure-generate-is-noop envtest
manager: generate fmt vet
go build -o bin/manager main.go
+.PHONY: copy-certs
+copy-certs:
+ @[ -d /tmp/k8s-webhook-server/serving-certs/ ] || { \
+ mkdir -p /tmp/k8s-webhook-server/serving-certs/;\
+ kubectl get secret monitoring-stack-aaaa-opentelemetry-operator-controller-manager-service-cert -o yaml | yq '.data["tls.crt"]' | base64 --decode > /tmp/k8s-webhook-server/serving-certs/tls.crt;\
+ kubectl get secret monitoring-stack-aaaa-opentelemetry-operator-controller-manager-service-cert -o yaml | yq '.data["tls.key"]' | base64 --decode > /tmp/k8s-webhook-server/serving-certs/tls.key;\
+ kubectl get secret monitoring-stack-aaaa-opentelemetry-operator-controller-manager-service-cert -o yaml | yq '.data["ca.crt"]' | base64 --decode > /tmp/k8s-webhook-server/serving-certs/ca.crt;\
+ }
+
# Run against the configured Kubernetes cluster in ~/.kube/config
.PHONY: run
-run: generate fmt vet manifests
- ENABLE_WEBHOOKS=$(ENABLE_WEBHOOKS) go run -ldflags ${LD_FLAGS} ./main.go --zap-devel
+run: generate fmt vet manifests copy-certs
+ go build -o bin/otel-operator -gcflags='all=-N -l' -ldflags ${LD_FLAGS} ./main.go
+	ENABLE_WEBHOOKS=$(ENABLE_WEBHOOKS) telepresence intercept monitoring-stack-aaaa-opentelemetry-operator --port 9443:443 -- dlv exec bin/otel-operator --headless --listen=:2345 --log --api-version=2 --accept-multiclient --continue --only-same-user=false -- --zap-devel --webhook-port=9443 --metrics-addr=0.0.0.0:8080 --enable-leader-election --health-probe-addr=:8081 --collector-image=otel/opentelemetry-collector-contrib:0.71.0

Also had to add this to main.go (put whatever namespace the otel operator is in):

+	LeaderElectionNamespace: "monitoring",

Also added this launch config to VS Code's launch.json:

{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Debug",
"type": "go",
"debugAdapter": "dlv-dap",
"request": "attach",
"mode": "remote",
"port": 2345,
"host": "127.0.0.1"
}
]
}

Started debugging with:

telepresence helm install # install telepresence into the cluster
kubectl config set-context --current --namespace {otel-namespace}
make run # start local instance with debugger and connect telepresence
# run the "Debug" launch configuration from the Debug tab

Note that when intercepting with Telepresence you need to use the name of the Deployment, not the Service.
(Feel free to remove this if it's deemed too out of scope or off topic for this issue.) Have any related upstream issues been found? Although I'm not using this operator, I do have an Open Policy Agent mutating webhook which exhibits the same behavior described here. Even if you try to change something the API says you should be able to change, it complains. Not sure if this is necessarily an upstream issue, but it's definitely not unique to this operator.
Seems like the mutating webhook doesn't account for the life-cycle of the resource it receives. The modifications to the Pod definition should be done on the initial Pod CREATE, not by trying to modify anything after that. Based on this, my take would be that the MutatingWebhookConfiguration should only trigger on CREATE operations for Pod resources, and not on both CREATE and UPDATE as it currently does.
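For illustration, a minimal sketch of that restriction, assuming a webhook configuration roughly shaped like the operator's (all names and paths here are placeholders, not the chart's actual manifest):

# Hypothetical MutatingWebhookConfiguration fragment; object, webhook, and
# service names are placeholders. The key change is listing only CREATE
# under operations so existing pods are never rewritten.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: opentelemetry-operator-mutating-webhook-configuration
webhooks:
  - name: mpod.kb.io
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Ignore
    clientConfig:
      service:
        name: opentelemetry-operator-webhook-service
        namespace: opentelemetry-operator-system
        path: /mutate-v1-pod
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
        operations: ["CREATE"]   # only CREATE; UPDATE is deliberately omitted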
Something like this?
Using the opentelemetry operator helm chart version 0.18.3.
This will be a bit difficult to describe or reproduce, but I wanted to open this to describe what I found in the meantime.
Yesterday, we saw issues where a huge number of pods were stuck in the Terminating phase. The cause of that is still somewhat unknown, but we figure one of two things happened:
We have finalizers on Pods for record-keeping purposes that our own operator removes once its work is complete. However, our operator was unable to remove the finalizer. It reported errors like:
We also tried to manually remove the finalizer, both via kubectl patch, which had the same error, and with kubectl edit, which had an error saying the pod was invalid for the same reason. So here's where I think there could be an issue with the opentelemetry webhook:
Our pod has an init container already, so I tried to remove that as part of the kubectl edit, and was met with a different 'invalid pod' error. In this case, there was an init container with multiple duplicate keys. Like, there were two name fields, one with the name of our init container and one that appeared to be injected by the opentelemetry webhook. Seeing this, I modified the webhook service to point to a bogus port to effectively disable it. After that, I was able to successfully remove the finalizer.
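In case it helps anyone reproduce the workaround, this is roughly the kind of edit I mean; the service name, namespace, selector, and ports below are placeholders for whatever your install uses, and note this only unblocks API writes if the webhook's failurePolicy is Ignore (an unreachable webhook with failurePolicy Fail would instead reject them):

# Hypothetical webhook Service edit (all names/ports are placeholders).
# Pointing targetPort at a port nothing listens on makes the webhook
# unreachable, so its mutations stop being applied.
apiVersion: v1
kind: Service
metadata:
  name: opentelemetry-operator-webhook-service
  namespace: opentelemetry-operator-system
spec:
  selector:
    app.kubernetes.io/name: opentelemetry-operator
  ports:
    - port: 443
      targetPort: 19443   # bogus port; the operator normally serves on 9443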
Some other notes:
- kubectl delete pod --force --grace-period=0 was hanging until the finalizer was removed.
- We changed the helm chart's nameOverride to include an aaaa prefix, so it runs before linkerd's webhook (see the values sketch after these notes). This is because in cases where I want to inject something with autoinstrumentation, it was modifying linkerd's proxy, not our main app container. By adding this prefix, the opentelemetry webhook ran first.

I'll try to add additional notes as I find them, but this is the gist of it.
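For reference, the override itself is just a chart value; a minimal sketch of what that change looks like (the exact value below is an assumption, ours is only described here as carrying an aaaa prefix):

# Hypothetical values.yaml fragment for the opentelemetry-operator helm chart.
# The aaaa prefix makes the generated webhook resources sort ahead of
# linkerd's, which is what made the opentelemetry webhook run first for us.
nameOverride: aaaa-opentelemetry-operator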