
[processor/spanmetricsprocessor] Allow set metrics namespace #18199

Closed
wants to merge 7 commits

Conversation

Cluas
Contributor

@Cluas Cluas commented Jan 31, 2023

Description:

The currently generated metric names differ from what the Tempo APM table documentation describes, e.g. calls_total vs traces_spanmetrics_calls_total. Why don't we simply open up the namespace setting on the processor side, so that I can easily integrate with the Tempo APM table? With that, we could just do this:

processors:
  batch:
  spanmetrics:
    metrics_exporter: otlp/spanmetrics
    namespace: traces_spanmetrics
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
    dimensions_cache_size: 1000
    aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"  

Link to tracking Issue:

Testing:

Documentation:

@Cluas Cluas requested review from a team and bogdandrutu January 31, 2023 15:18
@linux-foundation-easycla

linux-foundation-easycla bot commented Jan 31, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: Cluas / name: Cluas (83a9736)

@github-actions github-actions bot added the processor/spanmetrics Span Metrics processor label Jan 31, 2023
Contributor

@albertteoh albertteoh left a comment


Thanks for the contribution; it looks good. Just one comment about the test assertion.

@Cluas Cluas requested review from albertteoh and removed request for bogdandrutu February 3, 2023 16:45
Contributor

@albertteoh albertteoh left a comment


👍🏼

@devrimdemiroz

I am looking forward to this change as well 🥇

Thanks @Cluas !

@devrimdemiroz

devrimdemiroz commented Feb 10, 2023

@Cluas

I hope it resolves with no side effects. I noticed some other ugly things, like it using the same export config as spanmetrics; I documented it here.

For now, the metricstransform processor eased my headache at least.

@Cluas
Contributor Author

Cluas commented Feb 10, 2023

metricstransform:
  transforms:
    - include: calls_total
      match_type: strict
      action: update
      new_name: traces_spanmetrics_calls_total
      operations:
        - action: update_label
          label: operation
          new_label: span.name
        - action: update_label
          label: service.name
          new_label: service
    - include: latency
      match_type: strict
      action: update
      new_name: traces_spanmetrics_latency

@Cluas
Contributor Author

Cluas commented Feb 10, 2023

@dashpole PTAL

@kovrus
Member

kovrus commented Feb 10, 2023

Thanks for addressing this issue!

However, it seems that calls_total mixes the Prometheus naming convention into the metric name. There was some significant work done in this area; see #8950 and the attached PRs, and also the prometheus-metric-points-to-otlp spec. IMHO, the spanmetrics processor should generate a valid OTel metric name, which can then be converted to the Prometheus metric name format by Prometheus exporters or transformed with certain processors if needed.

I am wondering whether we should provide the ability to set the histogram and counter names explicitly via configuration instead of only providing a prefix. cc: @dashpole @gouthamve @albertteoh
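For the sake of discussion, a purely hypothetical sketch of the two options (the calls_name and latency_name keys below do not exist; they only illustrate the idea of explicit names):

    # Option A (hypothetical): explicit per-metric names
    spanmetrics:
      calls_name: traces.spanmetrics.calls
      latency_name: traces.spanmetrics.latency

    # Option B (this PR): a single prefix/namespace
    spanmetrics:
      namespace: traces_spanmetrics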

@gouthamve
Member

I agree that the OTLP metric should be calls. When it's exported to Prometheus, the exporter will automatically add _total.

I am wondering whether we should provide the ability to set the histogram and counter names explicitly via configuration instead of only providing a prefix.

I like the idea, but I am also a little conflicted. Does this mean that if we generate 5 different metrics, we'll need to set 5 different names? A prefix makes it easier. However, if we know that we will only export a couple of metrics from this processor, maybe explicit naming makes sense.
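To make that concrete (a sketch, assuming the collector's Prometheus exporter handles the suffix), the processor would emit a monotonic cumulative sum simply named calls, and the exporter would add the suffix on export:

    OTLP metric name:       calls
    Prometheus exposition:  calls_total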

@kovrus
Member

kovrus commented Feb 10, 2023

I agree that the OTLP metric should be calls. When it's exported to Prometheus, the exporter will automatically add _total.

Yes, we should drop _total.

I am wondering whether we should provide the ability to set the histogram and counter names explicitly via configuration instead of only providing a prefix.

I like the idea, but I am also a little conflicted. Does this mean that if we generate 5 different metrics, we'll need to set 5 different names? A prefix makes it easier. However, if we know that we will only export a couple of metrics from this processor, maybe explicit naming makes sense.

For now, there are only two. Right, maybe the prefix makes more sense. Also, shall we use . instead of _ to compose the OTel metric names?

@gouthamve
Member

I would use . yep.
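Putting both suggestions together (a sketch only, assuming the namespace is joined to the metric name with a dot and that the Prometheus exporter's name normalization is enabled), the name would flow roughly like this:

    spanmetrics output (OTel name):   traces.spanmetrics.calls
    after Prometheus normalization:   traces_spanmetrics_calls_total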

@Cluas
Contributor Author

Cluas commented Feb 11, 2023

@gouthamve @kovrus Thank you both for your review suggestions!

@kovrus
Member

kovrus commented Feb 11, 2023

@Cluas thanks! Do you mind opening a separate PR with the breaking renaming change (calls_total to calls)?

@Cluas
Contributor Author

Cluas commented Feb 11, 2023

@Cluas thanks! Do you mind opening a separate PR with the breaking renaming change (calls_total to calls)?

OK, I'll remove this change first and submit it as a separate PR.

@albertteoh
Contributor

I like the idea of dropping _total (and it's a good idea to move it to a separate PR) to keep to OTel standards and stay agnostic to the metrics storage backend.

Re: joining on ., it's not clear to me what the consensus was from this discussion: open-telemetry/opentelemetry-specification#2740. Either way, I think it'll be okay since, IIUC, the Prometheus exporter will normalise the names and replace . with _.

When we do make these changes, I think we'd need to add clear instructions to include the --feature-gates=pkg.translator.prometheus.NormalizeName feature flag when running the OTel Collector.
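For example (a sketch; the binary name and config path below are placeholders), that would mean starting the collector with something like:

    otelcol-contrib --config=config.yaml --feature-gates=pkg.translator.prometheus.NormalizeName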

@runforesight

runforesight bot commented Feb 12, 2023

Foresight Summary

Major Impacts

build-and-test duration (33 minutes 45 seconds) has decreased by 31 minutes 53 seconds compared to the main branch average (1 hour 5 minutes 38 seconds).

Workflow results (all finished on 14th Feb, 2023):

  • ⭕ build-and-test-windows finished in 6 seconds (40 minutes 44 seconds less than main branch avg.).
  • ✅ check-links finished in 1 minute 21 seconds (1 minute 14 seconds less than main branch avg.).
  • ✅ telemetrygen finished in 1 minute 49 seconds (1 minute 18 seconds less than main branch avg.).
  • ✅ changelog finished in 2 minutes 20 seconds (1 minute 23 seconds less than main branch avg.).
  • ✅ prometheus-compliance-tests finished in 12 minutes 21 seconds (⚠️ 3 minutes 32 seconds more than main branch avg.); 21 tests passed.
  • ❌ build-and-test finished in 33 minutes 45 seconds (31 minutes 53 seconds less than main branch avg.); 3 jobs failed, with 6 test failures, all in unittest-matrix (1.18, processor).
  • ✅ e2e-tests finished in 15 minutes 10 seconds.
  • ❌ load-tests finished in 25 minutes 50 seconds (⚠️ 8 minutes 48 seconds more than main branch avg.); 1 job failed, with 2 test failures in loadtest (TestBallastMemory|TestLog10kDPS).

@kovrus
Member

kovrus commented Feb 14, 2023

@Cluas thanks! Do you mind opening a separate PR with the breaking renaming change (calls_total to calls)?

OK, I'll remove this change first and submit it as a separate PR.

Let's wait until #18535 is merged and then open a PR that drops _total against the new spanmetrics connector component, so we don't break the spanmetrics processor.

@Cluas Cluas closed this Feb 15, 2023
@Cluas
Contributor Author

Cluas commented Feb 15, 2023

So let's do that on the new spanmetrics connector component and avoid a breaking change.

Labels: processor/spanmetrics (Span Metrics processor)

6 participants