Operator Panic and CrashLoop for invalid prometheus exporter endpoint Collector #2628

Closed
Starefossen opened this issue Feb 15, 2024 · 5 comments
Labels
area:collector Issues for deploying collector bug Something isn't working help wanted Extra attention is needed needs triage

Comments

@Starefossen
Contributor

Component(s)

No response

What happened?

Description

The operator exits with the following panic and enters CrashLoopBackOff after applying an OpenTelemetryCollector (see below), and keeps crashing until the OpenTelemetryCollector is deleted from the cluster. It is not possible to edit the OpenTelemetryCollector because the webhook is failing.

Steps to Reproduce

Apply the following OpenTelemetryCollector resource:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: opentelemetry-collector-management-internet
  namespace: my-system
spec:
  config: |
    receivers:
      otlp:
        protocols:
          http:
            endpoint: "http://localhost:4318/"
    processors:
      batch: {}
      memory_limiter:
        check_interval: 5s
        limit_mib: 4000
        spike_limit_mib: 500
      attributes:
        actions:
          - key: source
            value: internet
            action: insert
    exporters:
      prometheus:
        endpoint: prometheus
      otlp:
        endpoint: http://tempo:4317
        tls:
          insecure: true
      loki:
        endpoint: http://loki:3100/loki/api/v1/push
        tls:
          insecure: true
    service:
      pipelines:
        metrics:
          receivers: [ otlp ]
          processors: [ batch, memory_limiter ]
          exporters: [ prometheus ]
        traces:
          receivers: [ otlp ]
          processors: [ batch, memory_limiter ]
          exporters: [ otlp ]
        logs:
          receivers: [ otlp ]
          processors: [ batch, memory_limiter ]
          exporters: [ loki ]
  deploymentUpdateStrategy: {}
  ingress:
    route: {}
  managementState: managed
  mode: deployment
  observability:
    metrics: {}
  podDisruptionBudget:
    maxUnavailable: 1
  podSecurityContext:
    fsGroup: 65532
    runAsGroup: 65532
    runAsNonRoot: true
    runAsUser: 65532
    seccompProfile:
      type: RuntimeDefault
  ports:
  - name: otlphttp
    port: 4318
    protocol: TCP
    targetPort: 4318
  replicas: 1
  resources: {}
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
      - ALL
    readOnlyRootFilesystem: true
    runAsNonRoot: true
    runAsUser: 65532
    seccompProfile:
      type: RuntimeDefault
  targetAllocator:
    allocationStrategy: consistent-hashing
    filterStrategy: relabel-config
    observability:
      metrics: {}
    prometheusCR:
      scrapeInterval: 30s
    resources: {}
  updateStrategy: {}
  upgradeStrategy: automatic

Expected Result

Operator should not panic and instead create the OpenTelemetry Collector.

Actual Result

{"level":"info","ts":"2024-02-15T16:12:21Z","msg":"Starting the OpenTelemetry Operator","opentelemetry-operator":"0.93.0","opentelemetry-collector":"otel/opentelemetry-collector-contrib:0.93.0","opentelemetry-targetallocator":"ghcr.io/open-telemetry/opentelemetry-operator/target-allocator:0.93.0","operator-opamp-bridge":"ghcr.io/open-telemetry/opentelemetry-operator/operator-opamp-bridge:0.93.0","auto-instrumentation-java":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:1.32.0","auto-instrumentation-nodejs":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:0.46.0","auto-instrumentation-python":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.43b0","auto-instrumentation-dotnet":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:1.2.0","auto-instrumentation-go":"ghcr.io/open-telemetry/opentelemetry-go-instrumentation/autoinstrumentation-go:v0.10.1-alpha","auto-instrumentation-apache-httpd":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd:1.0.4","auto-instrumentation-nginx":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd:1.0.4","feature-gates":"operator.autoinstrumentation.apache-httpd,operator.autoinstrumentation.dotnet,-operator.autoinstrumentation.go,operator.autoinstrumentation.java,-operator.autoinstrumentation.multi-instrumentation,-operator.autoinstrumentation.nginx,operator.autoinstrumentation.nodejs,operator.autoinstrumentation.python,operator.collector.rewritetargetallocator,-operator.observability.prometheus","build-date":"2024-02-02T17:52:35Z","go-version":"go1.21.6","go-arch":"amd64","go-os":"linux","labels-filter":[]}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"setup","msg":"the env var WATCH_NAMESPACE isn't set, watching all namespaces"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a mutating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=OpenTelemetryCollector","path":"/mutate-opentelemetry-io-v1alpha1-opentelemetrycollector"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/mutate-opentelemetry-io-v1alpha1-opentelemetrycollector"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a validating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=OpenTelemetryCollector","path":"/validate-opentelemetry-io-v1alpha1-opentelemetrycollector"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/validate-opentelemetry-io-v1alpha1-opentelemetrycollector"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a mutating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=Instrumentation","path":"/mutate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/mutate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a validating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=Instrumentation","path":"/validate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/validate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/mutate-v1-pod"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a mutating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=OpAMPBridge","path":"/mutate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/mutate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a validating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=OpAMPBridge","path":"/validate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/validate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"setup","msg":"starting manager"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","ts":"2024-02-15T16:12:21Z","msg":"starting server","kind":"health probe","addr":"[::]:8081"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":"0.0.0.0:8080","secure":false}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Starting webhook server"}
I0215 16:12:21.427966       1 leaderelection.go:250] attempting to acquire leader lease my-system/9f7554c3.opentelemetry.io...
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.certwatcher","msg":"Updated current TLS certificate"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Serving webhook server","host":"","port":9443}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.certwatcher","msg":"Starting certificate watcher"}
I0215 16:15:12.024007       1 leaderelection.go:260] successfully acquired lease my-system/9f7554c3.opentelemetry.io
{"level":"info","ts":"2024-02-15T16:15:12Z","logger":"collector-upgrade","msg":"looking for managed instances to upgrade"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1alpha1.OpenTelemetryCollector"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.ConfigMap"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.ServiceAccount"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.Service"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.Deployment"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.DaemonSet"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.StatefulSet"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v2.HorizontalPodAutoscaler"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.PodDisruptionBudget"}
{"level":"info","ts":"2024-02-15T16:15:12Z","logger":"instrumentation-upgrade","msg":"looking for managed Instrumentation instances to upgrade"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting Controller","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1alpha1.OpAMPBridge"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.ConfigMap"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.ServiceAccount"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.Service"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.Deployment"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting Controller","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge"}
{"level":"info","ts":"2024-02-15T16:15:12Z","logger":"collector-upgrade","msg":"no instances to upgrade"}
{"level":"info","ts":"2024-02-15T16:15:12Z","logger":"instrumentation-upgrade","msg":"no instances to upgrade"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting workers","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","worker count":1}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting workers","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","worker count":1}
{"level":"error","ts":"2024-02-15T16:15:12Z","logger":"controllers.OpenTelemetryCollector","msg":"couldn't parse the endpoint's port","endpoint":"prometheus","error":"port should not be empty","stacktrace":"github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/parser/exporter.singlePortFromConfigEndpoint\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/parser/exporter/exporter.go:73\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/parser/exporter.(*PrometheusExporterParser).Ports\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/parser/exporter/exporter_prometheus.go:63\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/adapters.ConfigToComponentPorts\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/adapters/config_to_ports.go:107\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/adapters.ConfigToPorts\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/adapters/config_to_ports.go:132\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.getConfigContainerPorts\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/container.go:172\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Container\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/container.go:46\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Deployment\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/deployment.go:56\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Build.FactoryWithoutError[...].func1\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/builder.go:31\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Build\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/collector.go:71\ngithub.com/open-telemetry/opentelemetry-operator/controllers.BuildCollector\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/controllers/common.go:54\ngithub.com/open-telemetry/opentelemetry-operator/controllers.(*OpenTelemetryCollectorReconciler).Reconcile\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/controllers/opentelemetrycollector_controller.go:124\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","OpenTelemetryCollector":{"name":"opentelemetry-collector-management-internet","namespace":"my-system"},"namespace":"my-system","name":"opentelemetry-collector-management-internet","reconcileID":"283b9ac7-3665-4299-a9d8-67ef904ced7b"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1751325]

goroutine 412 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
        /home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116 +0x1e5
panic({0x2c27860?, 0x52f65a0?})
        /opt/hostedtoolcache/go/1.21.6/x64/src/runtime/panic.go:914 +0x21f
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/parser/exporter.(*PrometheusExporterParser).Ports(0x3999788?)
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/parser/exporter/exporter_prometheus.go:63 +0x45
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/adapters.ConfigToComponentPorts({{0x3999788?, 0xc000642ae0?}, 0xc0012d91a0?}, 0x1, 0xc0009f5400?)
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/adapters/config_to_ports.go:107 +0x708
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/adapters.ConfigToPorts({{0x3999788?, 0xc000642ae0?}, 0xc001582bb8?}, 0x267ac17?)
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/adapters/config_to_ports.go:132 +0x90
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.getConfigContainerPorts({{0x3999788?, 0xc000642ae0?}, 0x3?}, {0xc0004afc00, 0x374})
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/container.go:172 +0xa5
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Container({{0x3968240, 0xc00088cb90}, {{0x3999788, 0xc0007bd2c0}, 0x0}, {0xc0007b01e0, 0x45}, {0xc0007b0230, 0x4a}, {0xc0007b0320, ...}, ...}, ...)
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/container.go:46 +0xb6
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Deployment({{0x39a5d00, 0xc00040f5f0}, {0x39924e0, 0xc0006d1980}, 0xc0001c6230, {{0x3999788, 0xc000642ae0}, 0x0}, {{{0x27bed39, 0x16}, ...}, ...}, ...})
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/deployment.go:56 +0x2a6
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Build.FactoryWithoutError[...].func1()
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/builder.go:31 +0x44
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Build({{0x39a5d00, 0xc00040f5f0}, {0x39924e0, 0xc0006d1980}, 0xc0001c6230, {{0x3999788, 0xc000642ae0}, 0x0}, {{{0x27bed39, 0x16}, ...}, ...}, ...})
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/collector.go:71 +0xa3d
github.com/open-telemetry/opentelemetry-operator/controllers.BuildCollector({{0x39a5d00, 0xc00040f5f0}, {0x39924e0, 0xc0006d1980}, 0xc0001c6230, {{0x3999788, 0xc000642ae0}, 0x0}, {{{0x27bed39, 0x16}, ...}, ...}, ...})
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/controllers/common.go:54 +0x10a
github.com/open-telemetry/opentelemetry-operator/controllers.(*OpenTelemetryCollectorReconciler).Reconcile(0xc00067ba20, {0x3994780, 0xc0012e8390}, {{{0xc000c56810, 0xb}, {0xc000db2cc0, 0x2b}}})
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/controllers/opentelemetrycollector_controller.go:124 +0x44b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x3999788?, {0x3994780?, 0xc0012e8390?}, {{{0xc000c56810?, 0xb?}, {0xc000db2cc0?, 0x0?}}})
        /home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000aaefa0, {0x39947b8, 0xc0005caeb0}, {0x2e0da00?, 0xc0006d5700?})
        /home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316 +0x3cc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000aaefa0, {0x39947b8, 0xc0005caeb0})
        /home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266 +0x1af
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
        /home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 199
        /home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:223 +0x565

Kubernetes Version

v1.28.3-gke.1286000

Operator version

0.93.0

Collector version

n/a

Environment information

Environment

Kubernetes: GKE

Log output

No response

Additional context

No response

@Starefossen Starefossen added bug Something isn't working needs triage labels Feb 15, 2024
@pavolloffay
Member

  exporters:
      prometheus:
        endpoint: prometheus

The endpoint in the CR is not valid.
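
To illustrate why the value `prometheus` is rejected, here is a minimal Go sketch of extracting a port from an endpoint string. It is not the operator's code: `portFromEndpoint` is a hypothetical helper, and the address `0.0.0.0:8889` is only an example of a `host:port` value. The error text for an empty port mirrors the message in the log above.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// portFromEndpoint is a hypothetical helper (not part of the operator).
// It accepts "host:port", optionally with a scheme and a path, and
// returns the numeric port or an error.
func portFromEndpoint(endpoint string) (int32, error) {
	// Drop an optional scheme such as "http://".
	if i := strings.Index(endpoint, "://"); i >= 0 {
		endpoint = endpoint[i+3:]
	}
	// Drop an optional path such as "/loki/api/v1/push".
	if i := strings.Index(endpoint, "/"); i >= 0 {
		endpoint = endpoint[:i]
	}
	// SplitHostPort fails when there is no ":port" part at all.
	_, portStr, err := net.SplitHostPort(endpoint)
	if err != nil {
		return 0, fmt.Errorf("endpoint %q has no usable port: %w", endpoint, err)
	}
	if portStr == "" {
		return 0, errors.New("port should not be empty")
	}
	port, err := strconv.ParseInt(portStr, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("port %q is not a number: %w", portStr, err)
	}
	if port < 1 || port > 65535 {
		return 0, fmt.Errorf("port %d is out of range", port)
	}
	return int32(port), nil
}

func main() {
	// "prometheus" is the value from the reported CR; the others are examples.
	for _, e := range []string{"prometheus", "0.0.0.0:8889", "http://tempo:4317"} {
		port, err := portFromEndpoint(e)
		fmt.Printf("endpoint=%q port=%d err=%v\n", e, port, err)
	}
}
```

Under these assumptions, a bare host name yields an error rather than a port, so the caller has to handle that case explicitly.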

@pavolloffay
Member

We should also fix the operator to avoid panic.

@pavolloffay pavolloffay added area:collector Issues for deploying collector help wanted Extra attention is needed labels Feb 16, 2024
@dexter0195
Contributor

Hey!
I would love to help on this. I was having a look at those files:

I see the problem could come from the fact that if singlePortFromConfigEndpoint is not able to parse the URL, it returns nil. What behaviour would you recommend when the configuration is not correct?

@CLIN42

CLIN42 commented Feb 19, 2024

@dexter0195 To avoid the panic, it makes sense to check here whether the return value of singlePortFromConfigEndpoint is nil, and only append it if it is not:

    prometheusPort := singlePortFromConfigEndpoint(o.logger, o.name, o.config)
    if prometheusPort != nil {
        ports = append(ports, *prometheusPort)
    }
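
As a self-contained illustration of the nil guard suggested above, the following Go sketch may help. The names `servicePort`, `parsePort`, and `collectPorts` are hypothetical stand-ins, not the operator's types or functions; `parsePort` only imitates the behaviour described in this thread, returning nil when the endpoint has no parsable port.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// servicePort is a hypothetical stand-in for the port type the operator builds.
type servicePort struct {
	Name string
	Port int32
}

// parsePort imitates the behaviour discussed in the thread: it returns nil
// when the endpoint cannot be parsed. It is not the operator's function.
func parsePort(name, endpoint string) *servicePort {
	_, portStr, err := net.SplitHostPort(endpoint)
	if err != nil {
		return nil
	}
	port, err := strconv.ParseInt(portStr, 10, 32)
	if err != nil {
		return nil
	}
	return &servicePort{Name: name, Port: int32(port)}
}

// collectPorts appends the parsed port only when it is non-nil.
// Dereferencing the result without this check is what panics on a nil pointer.
func collectPorts(name, endpoint string) []servicePort {
	out := []servicePort{}
	if p := parsePort(name, endpoint); p != nil {
		out = append(out, *p)
	}
	return out
}

func main() {
	// An endpoint without a port yields no ports instead of a panic.
	fmt.Println(len(collectPorts("prometheus", "prometheus"))) // 0
	// An example host:port endpoint yields exactly one port.
	fmt.Println(len(collectPorts("prometheus", "0.0.0.0:8889"))) // 1
}
```

With the guard in place, an invalid endpoint results in an empty port list; whether the operator should additionally reject such a CR in the webhook is a separate design question.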

@Starefossen Starefossen changed the title from "Operator Panic and CrashLoop" to "Operator Panic and CrashLoop for invalid prometheus exporter endpoint Collector" Feb 21, 2024
@jaronoff97
Copy link
Contributor

This has been closed by #2653. Thanks for reporting!
