Watch namespace is still trying to change Pod from other namespaces #2668

Closed
janario opened this issue Feb 26, 2024 · 5 comments
Labels: bug (Something isn't working), needs triage


janario (Contributor) commented Feb 26, 2024

Component(s)

No response

What happened?

Description

We have the operator installed and restricted to only a few namespaces via the WATCH_NAMESPACE env var.

In our scenario we have an app chart with the injection annotations already in place. The app chart is expected to be used across many different namespaces, especially in development environments.

But we don't want the operator to inject into all of them, since it is already restricted to a few namespaces.

The problem is that the operator still tries to act on namespaces that are not in that list.

Initial PR: #2666

Steps to Reproduce

Our operator is deployed like:

helm upgrade otel-operator --install --atomic -n monitoring \
   -f ./values.yaml \
   --version 0.47.0 open-telemetry/opentelemetry-operator 

with the following values.yaml:

manager:
  podAnnotations:
    sidecar.istio.io/inject: "false"
  env:
    WATCH_NAMESPACE: development,integration,monitoring
  resources:
    limits:
      memory: 256Mi

And our app chart Deployment looks like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app 
spec:
  ...
  template:
    metadata:
      annotations:
        instrumentation.opentelemetry.io/container-names: 'app'
        instrumentation.opentelemetry.io/inject-java: monitoring/default

Expected Result

The operator shouldn't try to inject the OTel configuration into Pods in namespaces that are not listed in WATCH_NAMESPACE.

Actual Result

The operator receives the admission event, tries to mutate the Pod, and logs an error because no cache was created for that namespace. The Pod still gets mutated, but a few things on it look wrong (for example the service name), as if the mutation was only partially applied.

Stack trace:

{"level":"error","ts":"2024-02-23T07:19:40Z","msg":"failed to get replicaset","replicaset":"...","namespace":"..","error":"unable to get:.../... because of unknown namespace for the cache","stacktrace":"github.com/open-telemetry/opentelemetry-operator/pkg/instrumentation.(*sdkInjector).addParentResourceLabels
pkg/instrumentation/sdk.go:481
github.com/open-telemetry/opentelemetry-operator/pkg/instrumentation.(*sdkInjector).createResourceMap
pkg/instrumentation/sdk.go:448
github.com/open-telemetry/opentelemetry-operator/pkg/instrumentation.(*sdkInjector).injectCommonSDKConfig
pkg/instrumentation/sdk.go:255
github.com/open-telemetry/opentelemetry-operator/pkg/instrumentation.(*sdkInjector).inject
pkg/instrumentation/sdk.go:74
github.com/open-telemetry/opentelemetry-operator/pkg/instrumentation.(*instPodMutator).Mutate
pkg/instrumentation/podmutator.go:360
github.com/open-telemetry/opentelemetry-operator/internal/webhook/podmutation.(*podMutationWebhook).Handle
internal/webhook/podmutation/webhookhandler.go:92
sigs.k8s.io/controller-runtime/pkg/webhook/admission.(*Webhook).Handle
/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/webhook/admission/webhook.go:169
sigs.k8s.io/controller-runtime/pkg/webhook/admission.(*Webhook).ServeHTTP
/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/webhook/admission/http.go:119
sigs.k8s.io/controller-runtime/pkg/webhook/internal/metrics.InstrumentedHook.InstrumentHandlerInFlight.func1
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/promhttp/instrument_server.go:60
net/http.HandlerFunc.ServeHTTP
/opt/hostedtoolcache/go/1.21.6/x64/src/net/http/server.go:2136
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/promhttp/instrument_server.go:147
net/http.HandlerFunc.ServeHTTP
/opt/hostedtoolcache/go/1.21.6/x64/src/net/http/server.go:2136
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2
/home/runner/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/promhttp/instrument_server.go:109
net/http.HandlerFunc.ServeHTTP
/opt/hostedtoolcache/go/1.21.6/x64/src/net/http/server.go:2136
net/http.(*ServeMux).ServeHTTP
/opt/hostedtoolcache/go/1.21.6/x64/src/net/http/server.go:2514
net/http.serverHandler.ServeHTTP
/opt/hostedtoolcache/go/1.21.6/x64/src/net/http/server.go:2938
net/http.(*conn).serve
/opt/hostedtoolcache/go/1.21.6/x64/src/net/http/server.go:2009"}
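For context, the error above comes from the controller-runtime cache: a cache-backed client refuses to read objects from a namespace it was not configured to watch. A minimal sketch of that behavior (illustrative only, not the operator's actual code; the namespace and ReplicaSet name are made up):

package main

import (
	"context"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func main() {
	cfg := ctrl.GetConfigOrDie()

	// Cache limited to the "monitoring" namespace only,
	// mirroring a WATCH_NAMESPACE restriction.
	c, err := cache.New(cfg, cache.Options{
		DefaultNamespaces: map[string]cache.Config{"monitoring": {}},
	})
	if err != nil {
		panic(err)
	}
	ctx := context.Background()
	go c.Start(ctx)
	c.WaitForCacheSync(ctx)

	// A client whose reads go through that cache.
	cl, err := client.New(cfg, client.Options{
		Cache: &client.CacheOptions{Reader: c},
	})
	if err != nil {
		panic(err)
	}

	// Getting a ReplicaSet from an unwatched namespace fails with
	// "unable to get: ... because of unknown namespace for the cache",
	// which is the same error addParentResourceLabels hits above.
	// "my-app-7d4b9c" is a hypothetical ReplicaSet name.
	var rs appsv1.ReplicaSet
	err = cl.Get(ctx, types.NamespacedName{Namespace: "development", Name: "my-app-7d4b9c"}, &rs)
	fmt.Println(err)
}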

Kubernetes Version

1.25

Operator version

0.93.0 (0.47.0 chart)

Collector version

0.93.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler (if manually compiled): (e.g., "go 14.2")

Log output

No response

Additional context

No response

yuriolisa (Contributor) commented:
Could you please try to use the operator on version 0.94? It seems this issue got fixed

janario (Contributor, Author) commented Feb 26, 2024

Could you please try to use the operator on version 0.94? It seems this issue got fixed

I tried it, but the issue is still there: the webhook fails when trying to mutate the Pod because of the missing cache for that namespace, as shown in #2668 (comment).

janario (Contributor, Author) commented Feb 26, 2024

But that's interesting 🤔

There should be other, cleaner ways to do this.

https://sdk.operatorframework.io/docs/building-operators/golang/operator-scope/

https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/manager#example-New-LimitToNamespaces

I'll try some other approaches in the PR and get back here.

janario (Contributor, Author) commented Feb 27, 2024

I've tried it as shown at https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/manager#example-New-LimitToNamespaces

With NewCache, but no luck: it is still picking up unwatched namespaces for some reason.
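A minimal sketch of that approach (assuming a controller-runtime version where the cache is limited via cache.Options.DefaultNamespaces; this is illustrative, not the operator's actual wiring):

package main

import (
	"os"
	"strings"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
)

func main() {
	// Build the per-namespace cache config from the comma-separated
	// WATCH_NAMESPACE value, e.g. "development,integration,monitoring".
	namespaces := map[string]cache.Config{}
	for _, ns := range strings.Split(os.Getenv("WATCH_NAMESPACE"), ",") {
		if ns = strings.TrimSpace(ns); ns != "" {
			namespaces[ns] = cache.Config{}
		}
	}

	// Limit the manager's cache (and therefore its cached client reads)
	// to the watched namespaces only.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Cache: cache.Options{DefaultNamespaces: namespaces},
	})
	if err != nil {
		panic(err)
	}

	// ... register controllers and webhooks on mgr here ...

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}

Even with the cache limited like this, the mutating webhook still receives admission requests for Pods in the other namespaces, which is presumably why the Pod ends up partially mutated.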

Any ideas that I can try? 🤔

janario (Contributor, Author) commented May 10, 2024

Not an issue; we got into the details in PR #2666.

janario closed this as completed May 10, 2024