Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Renaming metrics with prometheus receiver produces unknown-type metrics #5001

Closed
dashpole opened this issue Dec 16, 2020 · 18 comments · Fixed by #25888
Closed

Renaming metrics with prometheus receiver produces unknown-type metrics #5001

dashpole opened this issue Dec 16, 2020 · 18 comments · Fixed by #25888
Labels
comp:prometheus Prometheus related issues receiver/prometheus Prometheus receiver

Comments

@dashpole
Copy link
Contributor

Describe the bug
If the __name__ label is changed using prometheus relabel configs, the prometheus receiver fails to populate metric metadata.

The root cause is that the prometheus receiver looks up metric metadata using the final metric name. However, the prometheus server updates the metadata cache before applying relabel rules, meaning the metadata cache stores metadata based on the initial metric name. The metadata lookup fails, and it is left empty.

Steps to reproduce
Run a collector with the config below.

The collector produces the logs:

2020-12-16T21:04:44.840Z	info	internal/metrics_adjuster.go:357	Adjust - skipping unexpected point	{"component_kind": "receiver", "component_type": "prometheus", "component_name": "prometheus", "type": "UNSPECIFIED"}
2020-12-16T21:04:45.035Z	INFO	loggingexporter/logging_exporter.go:361	MetricsExporter	{"#metrics": 1}
2020-12-16T21:04:45.035Z	DEBUG	loggingexporter/logging_exporter.go:388	ResourceMetrics #0
Resource labels:
     -> service.name: STRING(prometheusreceiver)
     -> host.name: STRING(localhost)
     -> port: STRING(8888)
     -> scheme: STRING(http)
InstrumentationLibraryMetrics #0
InstrumentationLibrary
Metric #0
Descriptor:
     -> Name:
     -> Description:
     -> Unit:
     -> DataType: None

Note the empty Name, Description, Unit, and DataType.

What did you expect to see?
I expected metrics to be renamed, and otherwise emitted normally.

What did you see instead?
Metrics are dropped during conversion from OpenCensus format to OpenTelemetry format because the metric descriptor type is Unknown.

What version did you use?
Version: otel/opentelemetry-collector-contrib-dev@sha256:0dbc61590cc04678997173fb378c875e2733ff2e443d75a7957a340d4b2bb9ef (latest)

What config did you use?
Config: (e.g. the yaml config file)

receivers:
  prometheus:
    config:
      global:
        scrape_interval: 10s
      scrape_configs:
      - job_name: 'prometheusreceiver'
        static_configs:
        - targets: [localhost:8888]
        metric_relabel_configs:
        # filter out all metrics except otelcol_process_cpu_seconds so it is easier to read logs
        - source_labels: [ __name__ ]
          regex: "otelcol_process_cpu_seconds"
          action: keep
        # rename otelcol_process_cpu_seconds to process_cpu_seconds by replacing the __name__ label
        - source_labels: [ __name__ ]
          regex: "otelcol_(.*)"
          action: replace
          target_label: __name__
exporters:
  logging:
    logLevel: debug
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [logging]

Additional context
I was able to determine the root cause above by adding additional debug statements and rebuilding the collector.

@dashpole
Copy link
Contributor Author

dashpole commented Jun 4, 2021

Given the metadata cache (and prometheus' target metadata endpoint) is intended to be used to attach metadata to samples, I opened an issue with upstream prometheus to see if they are interested in changing the name that is included in metadata. In the mean time, we should probably validate the prometheus configuration to ensure users don't set __name__ as a target label, since that won't work.

@bogdandrutu bogdandrutu transferred this issue from open-telemetry/opentelemetry-collector Aug 30, 2021
@alolita alolita added the comp:prometheus Prometheus related issues label Sep 2, 2021
@dashpole
Copy link
Contributor Author

Re-linking open-telemetry/opentelemetry-collector#3410, which is related, and the issue transfer seems to have dropped the link

@dashpole
Copy link
Contributor Author

Discussed this at the 9/22 meeting:

The prometheus server will also produce untyped metrics if you rename the metric, so we aren't necessarily out of compliance for producing untyped metrics when a metric was renamed.

Given that, to "fix" this, we just need to:

  1. Make sure unknown typed metrics produced by the receiver can be correctly sent through the collector to the remote write exporter
    1. In the description, I was seeing "Metrics are dropped during conversion from OpenCensus format to OpenTelemetry format because the metric descriptor type is Unknown". We need to make sure this is fixed.
  2. Remove the validation added in Disallow renaming metrics using the prometheus receiver opentelemetry-collector#3410

@dashpole
Copy link
Contributor Author

I believe step 1 would be fixed by #4743.

@Mario-Hofstaetter
Copy link

@dashpole Is there any progress on this?
I just hit the metric renaming using metric_relabel_configs is disallowed error introduced by #3410 while trying to use the feature.

@dashpole
Copy link
Contributor Author

I don't think we have proper support for untyped metrics right now. There is some discussion here about how best to accomplish that: open-telemetry/prometheus-interoperability-spec#69

hex1848 pushed a commit to hex1848/opentelemetry-collector-contrib that referenced this issue Jun 2, 2022
…emetry#5001)

* Remove "Attribute" part from common pdata collections names

All of the pdata wrappers for collections include Attribute part in its name because the fields used to be part of the attributes fields only. But it's not the case anymore since `pdata.LogRecord.Body` uses AnyValue that can contain any common collection. This change renames AttributeMap and AttributeValueSlice collections by removing Attribute part from their names and making them consistent.

* Change NewMapFromRaw to take interface{}

Co-authored-by: Bogdan Drutu <[email protected]>
@github-actions
Copy link
Contributor

github-actions bot commented Nov 8, 2022

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Nov 8, 2022
@Mario-Hofstaetter
Copy link

Mario-Hofstaetter commented Nov 8, 2022

Push, Edit: push again?

@dashpole dashpole removed the Stale label Dec 7, 2022
@KovtunV
Copy link

KovtunV commented Dec 9, 2022

Hi, is any progress?

@github-actions
Copy link
Contributor

github-actions bot commented Feb 8, 2023

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Feb 8, 2023
@Mario-Hofstaetter
Copy link

Push

@dashpole dashpole removed the Stale label Feb 8, 2023
@github-actions
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Apr 10, 2023
@Mario-Hofstaetter
Copy link

@dashpole should we continue to push this or let it go stale? Seems to be low prio.

Too bad there is no __type__ metadata label that could be utilized to just set the type again after renaming. Is there any other workaround?

@github-actions github-actions bot removed the Stale label Apr 12, 2023
@github-actions
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Jun 12, 2023
@Mario-Hofstaetter
Copy link

Bump

@github-actions github-actions bot removed the Stale label Jun 13, 2023
@github-actions
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@dashpole
Copy link
Contributor Author

I don't think we can fix the underlying issue. But we can at least not fail when users try and do this. We convert unknown-typed metrics to gauges (without unit or description) today, which is better than failing entirely.

@crobert-1 crobert-1 added the receiver/prometheus Prometheus receiver label Oct 9, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Oct 9, 2023

Pinging code owners for receiver/prometheus: @Aneurysm9 @dashpole. See Adding Labels via Comments if you do not have permissions to add labels yourself.

mx-psi pushed a commit that referenced this issue Oct 20, 2023
…nfigs (#25888)

**Description:**

Fixes
#5001

When that issue was filed, we dropped unknown-typed metrics. Now that we
convert them to gauges, users may want to use metric_relabel_configs to
rename metrics, even if they lose metadata. In the future, we have other
enhancements planned to better-support unknown-typed metrics.

Change the error to a warning. It is still likely not a best practice to
use relabel configs for this purpose, but for users copy-pasting
prometheus configs, it is best not to fail if we can help it.
sigilioso pushed a commit to carlossscastro/opentelemetry-collector-contrib that referenced this issue Oct 27, 2023
…nfigs (open-telemetry#25888)

**Description:**

Fixes
open-telemetry#5001

When that issue was filed, we dropped unknown-typed metrics. Now that we
convert them to gauges, users may want to use metric_relabel_configs to
rename metrics, even if they lose metadata. In the future, we have other
enhancements planned to better-support unknown-typed metrics.

Change the error to a warning. It is still likely not a best practice to
use relabel configs for this purpose, but for users copy-pasting
prometheus configs, it is best not to fail if we can help it.
jmsnll pushed a commit to jmsnll/opentelemetry-collector-contrib that referenced this issue Nov 12, 2023
…nfigs (open-telemetry#25888)

**Description:**

Fixes
open-telemetry#5001

When that issue was filed, we dropped unknown-typed metrics. Now that we
convert them to gauges, users may want to use metric_relabel_configs to
rename metrics, even if they lose metadata. In the future, we have other
enhancements planned to better-support unknown-typed metrics.

Change the error to a warning. It is still likely not a best practice to
use relabel configs for this purpose, but for users copy-pasting
prometheus configs, it is best not to fail if we can help it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:prometheus Prometheus related issues receiver/prometheus Prometheus receiver
Projects
None yet
5 participants