Define messaging metrics and add error.type attribute to spans #163

Merged: 8 commits, Nov 30, 2023

Conversation

lmolkova
Contributor

@lmolkova lmolkova commented Jul 5, 2023

Contributor

@pyohannes pyohannes left a comment


Thanks for putting a stake in the ground here, this is a great start.

@lmolkova lmolkova force-pushed the messaging-metrics branch 3 times, most recently from 11af3e7 to 4dcbade Compare July 26, 2023 20:02
@cyrille-leclerc
Member

cyrille-leclerc commented Aug 7, 2023

I love the duration metrics.
FYI, related to this: I'm brainstorming with @jpkrohling and others about adopting OTel Semantic Conventions metrics in the OTel Collector Service Graph Connector, and we are looking at standard metrics for client/producer and server/consumer durations.

Did we consider stronger consistency with the existing http.server.duration, http.client.duration, and rpc.{client, server}.duration metrics, which are aligned with SpanKind={server, client, producer, consumer, internal}, and look at messaging metrics like:

  • Preferring name messaging.producer.duration over messaging.publish.duration for the definition "Measures the duration of publish operation."
  • Introducing messaging.consumer.duration with the definition "Measures the duration to consume messages."
    • This would be consistent with http.server.duration: Measures the duration of inbound HTTP requests.
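The naming pattern suggested in this comment can be sketched as a small lookup. This is only an illustration: the helper `duration_metric_name` and the table `SPAN_KINDS_BY_DOMAIN` are hypothetical and not part of any OpenTelemetry API; the resulting metric names are the ones mentioned above.

```python
# Hypothetical sketch: build a duration metric name from a domain and a span kind,
# following the <domain>.<span kind>.duration pattern discussed in this comment.
SPAN_KINDS_BY_DOMAIN = {
    "http": {"server", "client"},
    "rpc": {"server", "client"},
    "messaging": {"producer", "consumer"},  # the names proposed in this comment
}


def duration_metric_name(domain: str, span_kind: str) -> str:
    """Return the duration metric name for a domain/span-kind pair."""
    kind = span_kind.lower()
    if kind not in SPAN_KINDS_BY_DOMAIN.get(domain, set()):
        raise ValueError(
            f"no duration metric sketched for {domain!r} with span kind {span_kind!r}"
        )
    return f"{domain}.{kind}.duration"


if __name__ == "__main__":
    print(duration_metric_name("messaging", "PRODUCER"))
    print(duration_metric_name("http", "SERVER"))
```

Under this pattern a publish operation would be reported as `messaging.producer.duration` rather than `messaging.publish.duration`.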

@pyohannes
Contributor

  • Preferring name messaging.producer.duration over messaging.publish.duration for the definition "Measures the duration of publish operation."
  • Introducing messaging.consumer.duration with the definition "Measures the duration to consume messages."
    • This would be consistent with http.server.duration: Measures the duration of inbound HTTP requests.

On the messaging side, the current metric names follow the messaging-specific operation names. For the consumer side, I definitely see value in having separate consumer metrics for pull-based (receive) and push-based (deliver) scenarios. Pull and push durations aren't semantically consistent: the push duration usually also includes the time spent processing the message, whereas the pull duration doesn't. We shouldn't mix both in a single metric.
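A minimal sketch of why the two durations differ, using made-up timings rather than a real messaging client. The function `simulate` and the metric names `messaging.receive.duration` and `messaging.deliver.duration` are assumptions derived from the operation names mentioned above; the dictionary stands in for histogram instruments.

```python
from collections import defaultdict


def simulate(fetch_seconds: float, dispatch_seconds: float, handler_seconds: float) -> dict:
    """Record one pull-based receive and one push-based deliver with given timings."""
    recorded = defaultdict(list)  # hypothetical stand-in for histograms

    # Pull-based (receive): the measured operation ends when the messages are
    # returned to the caller; processing happens afterwards and is not included.
    recorded["messaging.receive.duration"].append(fetch_seconds)

    # Push-based (deliver): the client library invokes the application handler,
    # so the measured operation usually includes the processing time as well.
    recorded["messaging.deliver.duration"].append(dispatch_seconds + handler_seconds)

    return dict(recorded)


if __name__ == "__main__":
    for name, values in simulate(0.25, 0.25, 0.5).items():
        print(name, values)
```

With identical transport timings, the deliver measurement is larger purely because it contains handler time, which is why averaging the two in one metric would be misleading.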

@lmolkova lmolkova marked this pull request as ready for review September 18, 2023 05:15
@lmolkova lmolkova requested review from a team September 18, 2023 05:15
@lmolkova lmolkova changed the title Messaging metrics Define messaging metrics and add error.type attribute to spans Nov 9, 2023
@pyohannes
Contributor

For spans, messaging systems add system-specific attributes to the spans.

Do we want to treat messaging system specific metric dimensions the same way, so that messaging systems extend existing metrics? Or do we require them to send different system-specific metrics?

The first approach makes cardinality hard to control (imagine a single service using two different messaging systems); the second one will likely duplicate information.
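The cardinality concern can be illustrated by counting distinct time series, where each unique combination of metric name and attribute values is one series. This is a sketch under assumptions: `series_count` is a hypothetical helper, and the attribute keys prefixed with `example.` are invented for illustration and are not real semantic convention attributes.

```python
def series_count(data_points) -> int:
    """Count distinct time series: one per (metric name, attribute set)."""
    return len({(name, frozenset(attrs.items())) for name, attrs in data_points})


METRIC = "messaging.publish.duration"

# Generic metric with only the shared attribute.
generic = [
    (METRIC, {"messaging.system": "kafka"}),
    (METRIC, {"messaging.system": "rabbitmq"}),
]

# The same metric extended with system-specific (hypothetical) dimensions.
extended = [
    (METRIC, {"messaging.system": "kafka", "example.kafka.partition": p})
    for p in range(3)
] + [
    (METRIC, {"messaging.system": "rabbitmq", "example.rabbitmq.routing_key": k})
    for k in ("orders", "payments")
]

if __name__ == "__main__":
    print("generic series:", series_count(generic))
    print("extended series:", series_count(extended))
```

Each system-specific dimension multiplies the series of the shared metric independently, so the total is driven by whichever systems a service happens to use.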

@lmolkova
Contributor Author

lmolkova commented Nov 22, 2023

Do we want to treat messaging system specific metric dimensions the same way, so that messaging systems extend existing metrics? Or do we require them to send different system-specific metrics?

I think there are a couple of options here:

  1. Allow messaging systems to extend generic metrics.

    • I've created update circleci to use go 1.14 opentelemetry-specification#553 to follow up and will prototype what it looks like for one of the Azure SDKs.
    • The only downside of this approach is that applications that see metrics from multiple messaging systems will need to have slightly different dashboards/alerts/queries for them.
  2. Make every system come up with its own set of additional metrics that would sometimes overlap with the generic ones.

    • We should probably change this PR to just describe how to create custom messaging metrics semconv and not attempt to define generic metrics.
    • The downside is that every system would need to define something custom (even if it's just a metric name).

I suggest starting with Option 1. As messaging semconv progresses towards stability, we should get more feedback from messaging systems and instrumentation prototypes, and we can change the approach if needed.
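The dashboard and query downside of Option 1 can be sketched with a tiny group-by over recorded points. Everything here is illustrative: `total_by` is a hypothetical helper, the values are made up, and the attribute keys prefixed with `example.` are invented stand-ins for system-specific dimensions.

```python
from collections import defaultdict


def total_by(points, key):
    """Sum recorded values grouped by one attribute; points lacking it fall under None."""
    totals = defaultdict(float)
    for attrs, value in points:
        totals[attrs.get(key)] += value
    return dict(totals)


# Data points of one generic metric, extended per system (Option 1).
points = [
    ({"messaging.system": "kafka", "example.kafka.partition": 0}, 1.0),
    ({"messaging.system": "kafka", "example.kafka.partition": 1}, 2.0),
    ({"messaging.system": "rabbitmq", "example.rabbitmq.routing_key": "orders"}, 4.0),
]

if __name__ == "__main__":
    # A query on the shared attribute works for every system.
    print(total_by(points, "messaging.system"))
    # A query on a system-specific attribute only makes sense for that system.
    print(total_by(points, "example.kafka.partition"))
```

Grouping by the shared attribute covers both systems, while grouping by a system-specific key leaves the other system's points in an unlabeled bucket, so each system needs its own breakdown query.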

Contributor

@pyohannes pyohannes left a comment


There are still some details to clarify, but this provides a great starting point.

Member

@joaopgrassi joaopgrassi left a comment


Overall, this looks like a good start! I left some non-blocking comments.

@lmolkova
Contributor Author

lmolkova commented Nov 27, 2023

@open-telemetry/specs-semconv-approvers this PR is approved by the messaging WG members, please take a look

@joaopgrassi joaopgrassi merged commit f51df2f into open-telemetry:main Nov 30, 2023
9 checks passed
pyohannes pushed a commit to pyohannes/semantic-conventions that referenced this pull request Jan 17, 2024
Successfully merging this pull request may close these issues.

Define metric semantic conventions for messaging systems
9 participants