Error: "http.method" is not a valid label name for metric #2772

Closed
albertteoh opened this issue Mar 19, 2021 · 5 comments
Labels: bug Something isn't working

Comments

albertteoh (Contributor) commented Mar 19, 2021

Describe the bug

Running the OTel collector from v0.22.0 with the spanmetrics processor enabled in a traces pipeline results in errors and no metrics being exported.

Steps to reproduce
Run the OTel collector with the spanmetrics processor in a traces pipeline, then send traces containing span tags with periods (.) in them, such as span.kind or http.method.

What did you expect to see?
No error.

What did you see instead?

2021-03-19T21:37:24.397+1100	error	prometheusexporter/collector.go:259	failed to convert metric calls: "http.method" is not a valid label name for metric "calls"	{"component_kind": "exporter", "component_type": "prometheus", "component_name": "prometheus"}
go.opentelemetry.io/collector/exporter/prometheusexporter.(*collector).Collect
	/Users/albertteoh/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/prometheusexporter/collector.go:259
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1
	/Users/albertteoh/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/registry.go:448

What version did you use?
Version: v0.22.0

What config did you use?
Config:

 processors:
   batch:
   spanmetrics:
     metrics_exporter: prometheus
     dimensions:
     - name: http.method
       default: GET
     - name: http.status_code

 service:
   pipelines:
     traces:
       receivers: [jaeger]
       processors: [spanmetrics, batch]
       exporters: [jaeger]
     metrics/spanmetrics:
       receivers: [otlp/spanmetrics]
       exporters: [prometheus]
     metrics:
       receivers: [prometheus]
       exporters: [prometheusremotewrite]

Environment
OS: macOS
Compiler (if manually compiled): go 1.16

Additional context

I believe the root cause is this line, which checks that label names contain only alphanumeric characters or _, triggering errors for labels that contain . such as span.kind and http.method.

That line of code is 5 years old, but there was a recent rewrite of the prometheus exporter that calls prometheus' client_golang library, which I think is why we're seeing this error from v0.22.0.

Given that the . character is invalid in Prometheus label names, I propose fixing this in the spanmetrics processor by replacing all non-alphanumeric characters with _.
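
A minimal sketch of the kind of sanitization I have in mind (illustrative only, not the actual processor code; the function and variable names are made up):

package main

import (
	"fmt"
	"regexp"
)

// invalidLabelChars matches every character that is not allowed in a
// Prometheus label name (valid names match [a-zA-Z_][a-zA-Z0-9_]*).
var invalidLabelChars = regexp.MustCompile(`[^a-zA-Z0-9_]`)

// sanitizeLabel rewrites a span tag key such as "http.method" into a valid
// Prometheus label name such as "http_method"; a leading digit is also
// invalid, so it gets an underscore prefix.
func sanitizeLabel(name string) string {
	s := invalidLabelChars.ReplaceAllString(name, "_")
	if s != "" && s[0] >= '0' && s[0] <= '9' {
		s = "_" + s
	}
	return s
}

func main() {
	for _, tag := range []string{"span.kind", "http.method", "http.status_code"} {
		fmt.Printf("%s -> %s\n", tag, sanitizeLabel(tag))
	}
}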

albertteoh added the bug label on Mar 19, 2021
yeya24 (Contributor) commented Mar 23, 2021

I haven't tested it, but has this issue been fixed by open-telemetry/opentelemetry-collector#2707?

Update: this version works for me, but it introduces a possible duplicate-labels problem.

For example, service.name is a default reserved label name, and it is sanitized to service_name when exporting Prometheus metrics.
But if I manually add another service_name label to the dimensions, this additional dimension is still accepted and is later sanitized to service_name as well, causing a duplicate-labels error.

Do you think this is a bug or something we can fix? I'm happy to open a PR for this.

albertteoh (Contributor, Author) commented

@yeya24 Thanks for the heads up on open-telemetry/opentelemetry-collector#2707 and good catch on the duplicate labels issue.

I think we should check for the default reserved sanitized labels (e.g. service_name, span_kind, status_code) on startup and report an error if any of them are configured, warning the user that these are reserved labels and preventing the collector from starting.
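
Roughly, the startup check I'm imagining could look something like this (an illustrative sketch only; the reserved-label set and identifiers below are assumptions, not actual processor code):

package main

import "fmt"

// defaultReservedLabels holds the sanitized forms of the labels the processor
// always adds; user-configured dimensions must not collide with them.
var defaultReservedLabels = map[string]struct{}{
	"service_name": {},
	"span_kind":    {},
	"status_code":  {},
}

// sanitizeLabel stands in for the real sanitizer (see the sketch above) so
// this example is self-contained.
func sanitizeLabel(name string) string { return name }

// validateDimensions would run at collector startup and fail fast if a
// configured dimension sanitizes to a reserved label name.
func validateDimensions(dimensions []string) error {
	for _, d := range dimensions {
		if _, reserved := defaultReservedLabels[sanitizeLabel(d)]; reserved {
			return fmt.Errorf("dimension %q is a reserved label and cannot be configured", d)
		}
	}
	return nil
}

func main() {
	if err := validateDimensions([]string{"service_name", "http.method"}); err != nil {
		fmt.Println("startup error:", err) // the collector would refuse to start here
	}
}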

If "overlapping" non-reserved optional dimensions are defined like http.status_code and http_status_code, I can think of two options:

  • pre-sanitize the label and overwrite the existing metric value
  • leave it to the prometheusexporter to post-sanitize, resulting in the duplicate-label error, so the user will need to exclude that label from their optional dimensions config.

I prefer the latter because it's explicit and avoids surprises.

What do you think?

yeya24 (Contributor) commented Mar 24, 2021

Yes, this makes sense to me. But for overlapping non-reserved dimensions, I think we should check them on startup and throw an error as well?

albertteoh (Contributor, Author) commented

But for overlapping non-reserved dimensions, I think we should check them on startup and throw error as well?

@yeya24 yes, good idea. Did you still want to work on this PR?

yeya24 (Contributor) commented Mar 25, 2021

@albertteoh Yes, PR #2844 is open. Please take a look when you have time.

punya pushed a commit to punya/opentelemetry-collector-contrib that referenced this issue Jul 21, 2021
alexperez52 referenced this issue in open-o11y/opentelemetry-collector-contrib Aug 18, 2021