Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define span attributes for messaging systems #418

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
332ea52
Carve out messaging from semantic conventions draft
arminru Jan 22, 2020
6e59559
Clarify message destination name
arminru Jan 22, 2020
1f0e5d1
Rename msg to messaging and avoid other abbrevs
arminru Jan 22, 2020
93bd69b
Remove WIP marker
arminru Jan 22, 2020
2a1791a
dst -> destination
arminru Jan 23, 2020
c8d2c96
Require the messaging system's IP or hostname
arminru Jan 23, 2020
0975ab2
typo
arminru Jan 23, 2020
8d1788d
Add messaging to list of span conventions
arminru Jan 23, 2020
8b41bc7
Fix op -> operation
arminru Jan 23, 2020
b97bf67
Remove peek operation
arminru Jan 23, 2020
9491f14
Formatting
arminru Jan 23, 2020
80727c4
Merge suggestions from #395
arminru Jan 23, 2020
d976443
Fix samples for messaging.system
arminru Jan 24, 2020
9a7254a
Remove `component` attribute as per #271
arminru Jan 29, 2020
d30a56d
Apply suggestions from code review
arminru Feb 18, 2020
de0afba
Define message_id as string
arminru Feb 18, 2020
a8b6e1b
Typo
arminru Feb 18, 2020
e8931de
Add guidance for span names for temp destinations
arminru Feb 18, 2020
5da7c84
Add section on RabbitMQ
arminru Feb 18, 2020
f2ea69e
Add note that messaging.destination might be equal to span name
arminru Feb 18, 2020
cc4fb06
Add note on receive/process timing
arminru Feb 18, 2020
daf1874
Remove colon 👨‍⚕️
arminru Feb 18, 2020
a9b8209
Add examples
arminru Feb 19, 2020
bb968b7
Make markdownlint happy
arminru Feb 19, 2020
4613bd1
Clarify processing span kind
arminru Feb 20, 2020
e00d1e8
Fix table in example
arminru Feb 24, 2020
d73aa5e
Make destination_kind optional
arminru Feb 24, 2020
0be1d95
Add batch processor example
arminru Feb 24, 2020
6355871
Apply suggestions from code review
arminru Feb 25, 2020
713b78c
Rename spans in examples
arminru Feb 25, 2020
feeb2c5
Declare messaging.protocol to be without version
arminru Mar 17, 2020
c639012
Add `messaging.protocol_version`
arminru Mar 17, 2020
92748ab
Recommend setting `net.transport`
arminru Apr 1, 2020
1cf72e5
Move file after directories were restructured upstream
arminru Apr 7, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions specification/trace/semantic_conventions/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,4 +16,5 @@ The following semantic conventions for spans are defined:
* [HTTP](http.md): Spans for HTTP client and server.
* [Database](database.md): Spans for SQL and NoSQL client calls.
* [RPC/RMI](rpc.md): Spans for remote procedure calls (e.g., gRPC).
* [Messaging](messaging.md): Spans for interaction with messaging systems (queues, publish/subscribe, etc.).
* [General](span-general.md): General semantic attributes that may be used in describing different kinds of operations.
182 changes: 182 additions & 0 deletions specification/trace/semantic_conventions/messaging.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,182 @@
# Messaging systems

<!-- Re-generate TOC with `markdown-toc --no-first-h1 -i` -->

<!-- toc -->

- [Definitions](#definitions)
- [Conventions](#conventions)
- [Messaging attributes](#messaging-attributes)
- [Examples](#examples)

<!-- tocstop -->

## Definitions

Although messaging systems are not as standardized as, e.g., HTTP, it is assumed that the following definitions are applicable to most of them that have similar concepts at all (names borrowed mostly from JMS):

A *message* usually consists of headers (or properties, or meta information) and an optional body. It is sent by a single message *producer* to:

* Physically: some message *broker* (which can be e.g., a single server, or a cluster, or a local process reached via IPC). The broker handles the actual routing, delivery, re-delivery, persistence, etc. In some messaging systems the broker may be identical or co-located with (some) message consumers.
* Logically: some particular message *destination*.

A destination is usually identified by some name unique within the messaging system instance, which might look like an URL or a simple one-word identifier.
Two kinds of destinations are distinguished: *topic*s and *queue*s.
A message that is sent (the send-operation is often called "*publish*" in this context) to a *topic* is broadcasted to all *subscribers* of the topic.
A message submitted to a queue is processed by a message *consumer* (usually exactly once although some message systems support a more performant at-least-once mode for messages with [idempotent][] processing).

The consumption of a message can happen in multiple steps.
First, the lower-level receiving of a message at a consumer, and then the logical processing of the message.
Often, the waiting for a message is not particularly interesting and hidden away in a framework that only invokes some handler function to process a message once one is received
(in the same way that the listening on a TCP port for an incoming HTTP message is not particularly interesting).
However, in a synchronous conversation, the wait time for a message is important.

In some messaging systems, a message can receive a reply message that answers a particular other message that was sent earlier. All messages that are grouped together by such a reply-relationship are called a *conversation*. The grouping usually happens through some sort of "In-Reply-To:" meta information or an explicit conversation ID. Sometimes a conversation can span multiple message destinations (e.g. initiated via a topic, continued on a temporary one-to-one queue).

Some messaging systems support the concept of *temporary destination* (often only temporary queues) that are established just for a particular set of communication partners (often one to one) or conversation. Often such destinations are unnamed or have an auto-generated name.

[idempotent]: https://en.wikipedia.org/wiki/Idempotence

## Conventions

Given these definitions, the remainder of this section describes the semantic conventions that shall be followed for Spans describing interactions with messaging systems.

**Span name:** The span name should usually be set to the message destination name.
The conversation ID should be used instead when it is expected to have lower cardinality.
In particular, the conversation ID must be used if the message destination is unnamed or temporary unless multiple conversations can be combined to a logical destination of lower cardinality.

**Span kind:** A producer of a message should set the span kind to `PRODUCER` unless it synchronously waits for a response: then it should use `CLIENT`.
The processor of the message should set the kind to `CONSUMER`, unless it always sends back a reply that is directed to the producer of the message
(as opposed to e.g., a queue on which the producer happens to listen): then it should use `SERVER`.

## Messaging attributes

| Attribute name | Notes and examples | Required? |
| -------------- | ---------------------------------------------------------------------- | --------- |
| `messaging.system` | A string identifying the messaging system vendor such as `kafka`, `rabbitmq` or `activemq`. | Yes |
| `messaging.destination` | The message destination name, e.g. `MyQueue` or `MyTopic`. This might be equal to the span name but is required nevertheless. | Yes |
| `messaging.destination_kind` | The kind of message destination: Either `queue` or `topic`. | Yes, if either of them applies. |
| `messaging.temp_destination` | A boolean that is `true` if the message destination is temporary. | If temporary (assumed to be `false` if missing). |
| `messaging.protocol` | The name of the transport protocol such as `AMQP` or `MQTT`. | No |
| `messaging.protocol_version` | The version of the transport protocol such as `0.9.1`. | No |
| `messaging.url` | Connection string such as `tibjmsnaming://localhost:7222` or `https://queue.amazonaws.com/80398EXAMPLE/MyQueue`. | No |
| `messaging.message_id` | A value used by the messaging system as an identifier for the message, represented as a string. | No |
| `messaging.conversation_id` | A value identifying the conversation to which the message belongs, represented as a string. Sometimes called "Correlation ID". | No |

Additionally at least one of `net.peer.name` or `net.peer.ip` from the [network attributes][] is required and `net.peer.port` is recommended.
Furthermore, it is strongly recommended to add the [`net.transport`][] attribute and follow its guidelines, especially for in-process queueing systems (like [Hangfire][], for example).
These attributes should be set to the broker to which the message is sent/from which it is received.

[network attributes]: data-span-general.md#general-network-connection-attributes
[`net.transport`]: data-span-general.md#nettransport-attribute
[Hangfire]: https://www.hangfire.io/

For message consumers, the following additional attributes may be set:

| Attribute name | Notes and examples | Required? |
| -------------- | ---------------------------------------------------------------------- | --------- |
| `messaging.operation` | A string identifying which part and kind of message consumption this span describes: either `receive` or `process`. (If the operation is `send`, this attribute must not be set: the operation can be inferred from the span kind in that case.) | No |

The _receive_ span is be used to track the time used for receiving the message(s), whereas the _process_ span(s) track the time for processing the message(s).
Note that one or multiple Spans with `messaging.operation` = `process` may often be the children of a Span with `messaging.operation` = `receive`.
Even though in that case one might think that the processing span's kind should be `INTERNAL`, that kind MUST NOT be used.
Instead span kind should be set to either `CONSUMER` or `SERVER` according to the rules defined above.

### Attributes specific to certain messaging systems

#### RabbitMQ

In RabbitMQ, the destination is defined by an _exchange_ and a _routing key_.
`messaging.destination` MUST be set to the name of the exchange. This will be an empty string if the default exchange is used.
The routing key MUST be provided to the attribute `messaging.rabbitmq.routing_key`, unless it is empty.

## Examples

### Topic with multiple consumers

Given is a process P, that publishes a message to a topic T on messaging system MS, and two processes CA and CB, which both receive the message and process it.

```
Process P: | Span Prod1 |
--
Process CA: | Span CA1 |
--
Process CB: | Span CB1 |
```

| Field or Attribute | Span Prod1 | Span CA1 | Span CB1 |
|-|-|-|-|
| Name | `"T"` | `"T"` | `"T"` |
| Parent | | Span Prod1 | Span Prod1 |
| Links | | | |
| SpanKind | `PRODUCER` | `CONSUMER` | `CONSUMER` |
| Status | `Ok` | `Ok` | `Ok` |
| `net.peer.name` | `"ms"` | `"ms"` | `"ms"` |
| `net.peer.port` | `1234` | `1234` | `1234` |
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` |
| `messaging.destination` | `"T"` | `"T"` | `"T"` |
| `messaging.destination_kind` | `"topic"` | `"topic"` | `"topic"` |
| `messaging.operation` | | `"process"` | `"process"` |
| `messaging.message_id` | `"a1"` | `"a1"`| `"a1"` |

### Batch receiving

Given is a process P, that sends two messages to a queue Q on messaging system MS, and a process C, which receives both of them in one batch (Span Recv1) and processes each message separately (Spans Proc1 and Proc2).

Since a span can only have one parent and the propagated trace and span IDs are not known when the receiving span is started, the receiving span will have no parent and the processing spans are correlated with the producing spans using links.

```
Process P: | Span Prod1 | Span Prod2 |
--
Process C: | Span Recv1 |
| Span Proc1 |
| Span Proc2 |
```

| Field or Attribute | Span Prod1 | Span Prod2 | Span Recv1 | Span Proc1 | Span Proc2 |
|-|-|-|-|-|-|
| Name | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
| Parent | | | | Span Recv1 | Span Recv1 |
| Links | | | | Span Prod1 | Span Prod2 |
| SpanKind | `PRODUCER` | `PRODUCER` | `CONSUMER` | `CONSUMER` | `CONSUMER` |
| Status | `Ok` | `Ok` | `Ok` | `Ok` | `Ok` |
| `net.peer.name` | `"ms"` | `"ms"` | `"ms"` | `"ms"` | `"ms"` |
| `net.peer.port` | `1234` | `1234` | `1234` | `1234` | `1234` |
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` |
| `messaging.destination` | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
| `messaging.destination_kind` | `"queue"` | `"queue"` | `"queue"` | `"queue"` | `"queue"` |
| `messaging.operation` | | | `"receive"` | `"process"` | `"process"` |
| `messaging.message_id` | `"a1"` | `"a2"` | | `"a1"` | `"a2"` |

### Batch processing

Given is a process P, that sends two messages to a queue Q on messaging system MS, and a process C, which receives both of them separately (Span Recv1 and Recv2) and processes both messages in one batch (Span Proc1).

Since each span can only have one parent, C3 should not choose a random parent out of C1 and C2, but rather rely on the implicitly selected parent as defined by the [tracing API spec](api-tracing.md).
Similarly, only one value can be set as `message_id`, so C3 cannot report both `a1` and `a2` and therefore attribute is left out.
Depending on the implementation, the producing spans might still be available in the meta data of the messages and should be added to C3 as links.
The client library or application could also add the receiver span's span context to the data structure it returns for each message. In this case, C3 could also add links to the receiver spans C1 and C2.

The status of the batch processing span is selected by the application. Depending on the semantics of the operation. A span status `Ok` could, for example, be set only if all messages or if just at least one were properly processed.

```
Process P: | Span Prod1 | Span Prod2 |
--
Process C: | Span Recv1 | Span Recv2 |
| Span Proc1 |
```

| Field or Attribute | Span Prod1 | Span Prod2 | Span Recv1 | Span Recv2 | Span Proc1 |
|-|-|-|-|-|-|
| Name | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
| Parent | | | Span Prod1 | Span Prod2 | |
| Links | | | | | Span Prod1 + Prod2 |
| SpanKind | `PRODUCER` | `PRODUCER` | `CONSUMER` | `CONSUMER` | `CONSUMER` |
| Status | `Ok` | `Ok` | `Ok` | `Ok` | `Ok` |
| `net.peer.name` | `"ms"` | `"ms"` | `"ms"` | `"ms"` | `"ms"` |
| `net.peer.port` | `1234` | `1234` | `1234` | `1234` | `1234` |
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` |
| `messaging.destination` | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
| `messaging.destination_kind` | `"queue"` | `"queue"` | `"queue"` | `"queue"` | `"queue"` |
| `messaging.operation` | | | `"receive"` | `"receive"` | `"process"` |
| `messaging.message_id` | `"a1"` | `"a2"` | `"a1"` | `"a2"` | |