-
Notifications
You must be signed in to change notification settings - Fork 894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Common event attribute names #397
Common event attribute names #397
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I'm a but worried that we're rushing into semantic conventions for a logging system, here. I'd like to keep it as restricted as possible until there's more attention to give this broad topic.
specification/data-events.md
Outdated
| :------------- | :------------------------------------- | --------- | | ||
| `error.kind` | The type or "kind" of an error. E.g., `"Exception"`, `"OSError"` | Yes | | ||
| `error.message` | A concise, human-readable, one-line message explaining the event. E.g., `"Could not connect to backend"`, `"Cache invalidation succeeded"` | Yes | | ||
| `error.object` | For languages that support such a thing (e.g., Java, Python), the actual Throwable/Exception/Error object instance itself. E.g., A `java.lang.UnsupportedOperationException` instance, a python `exceptions.NameError` instance | No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest error.type
here. It's not clear how to set error.kind
when you have a definite type name, though. Do we need both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine with error.type
. These came from the OpenTracing spec.
specification/data-events.md
Outdated
| `error.kind` | The type or "kind" of an error. E.g., `"Exception"`, `"OSError"` | Yes | | ||
| `error.message` | A concise, human-readable, one-line message explaining the event. E.g., `"Could not connect to backend"`, `"Cache invalidation succeeded"` | Yes | | ||
| `error.object` | For languages that support such a thing (e.g., Java, Python), the actual Throwable/Exception/Error object instance itself. E.g., A `java.lang.UnsupportedOperationException` instance, a python `exceptions.NameError` instance | No | | ||
| `error.stack` | A stack trace in platform-conventional format; may or may not pertain to an error. E.g., `"File \"example.py\", line 7, in \<module\>\ncaller()\nFile \"example.py\", line 5, in caller\ncallee()\nFile \"example.py\", line 2, in callee\nraise Exception(\"Yikes\")\n"` | No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you allow this to include the caused-by fragments of a Java exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you ever want to record the location where the error is being recorded as distinct from the location where an exception occurred? I think of these as the location of the log statement vs the location of the throw statement, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have submitted an OTEP open-telemetry/oteps#69 addressing what the value of error.object
should be. It would include all of the data items mentioned above.
specification/data-events.md
Outdated
| Attribute name | Notes and examples | Required? | | ||
| :------------- | :------------------------------------- | --------- | | ||
| `error.kind` | The type or "kind" of an error. E.g., `"Exception"`, `"OSError"` | Yes | | ||
| `error.message` | A concise, human-readable, one-line message explaining the event. E.g., `"Could not connect to backend"`, `"Cache invalidation succeeded"` | Yes | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's going to be a desire to record information events that are not errors. It would be nice to have a "message" attribute and a reserved event name for those.
|
||
Event `"name"` MUST be `"message"`. | ||
|
||
Message events MUST be associated with a tracing span. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel there needs to be some guidance as to when we add these attributes on an event vs on a span. It's fine to say that they should always be an event, even when the span only have one "event" such as an http request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we should always use them as event attributes even though we know only one event can occur per Span.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Messages, errors, DBs, and other semantic conventions are developing type systems. I feel like we need to review these type systems; right now there is no guidance as to what types we support, and what the semantics for that particular type may be.
For example, If my error type is Exception
, how should I fill in the remaining fields? Does it matter?
specification/data-events.md
Outdated
|
||
| Attribute name | Notes and examples | Required? | | ||
| :------------- | :------------------------------------- | --------- | | ||
| `error.kind` | The type or "kind" of an error. E.g., `"Exception"`, `"OSError"` | Yes | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some places we use kind
and in other places we use type
to describe the same thing. So now we have message.type
and error.kind
, along with span.kind
and db.type
. We should standardize on a single convention for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW in the Metrics API spec I've avoided using "type" except when it refers to an actual programming language type, not a more abstract term. So, Instruments have kinds. The values they produce have types.
I don't quite understand what the proposed message event is about. When would I use this one and when would I create a (child) span with the attributes defined in #395? |
| `message.type` | Either `"SENT"` or `"RECEIVED"`. | Yes | | ||
| `message.id` | Incremented integer value within the parent span starting with `1`. | Yes | | ||
| `message.compressed_size` | Compressed size in bytes. | No | | ||
| `message.uncompressed_size` | Uncompressed size in bytes. | No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just as an idea, maybe we should think about a bit shorter names :) if you have any suggestion I would be happy to hear :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't really need the word "uncompressed" in my opinion.
message.size
- uncompressed sizemessage.compressed
ormessage.compressed_size
- compressed size
specification/data-events.md
Outdated
However, logging-only exporters will likely want to log it as this information | ||
is highly useful during the early stages of developing a new application. | ||
|
||
## Error event attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we split this PR into two (message and errors). I think we should consider if for errors we want to have a more specific helper API than just attributes, also the correlation with the Status.
So to make progress we can probably soon merge the message part and keep discussing the errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on splitting it up :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will take care of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
Event `"name"` MUST be `"message"`. | ||
|
||
Message events MUST be associated with a tracing span. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we should always use them as event attributes even though we know only one event can occur per Span.
specification/data-events.md
Outdated
| `message.content` | The body or main contents of the message. If binary, then message should be Base64 encoded. | No | | ||
|
||
The `message.id` MUST be calculated as two different counters starting from `1` | ||
one for sent messages and one for received message. This way we guarantee that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Three different uses of one/1 in this sentence, which refer to message id and types of message ids are confusing. I would go with something along the lines of:
The
message.id
MUST be calculated as two different counters starting from1
, the first for sent messages and the second for received messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
|
||
## Message event attributes | ||
|
||
Each message sent/received within a span should be recorded as an event. In the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm inclined to say that messages "MAY" be recorded as events. For a gRPC stream, in particular, I'm not sure that message events are always desired. I'd like the semantic convention to apply when the decision is made to record message events, but not to specify when you SHOULD record message events.
specification/data-events.md
Outdated
| `message.uncompressed_size` | Uncompressed size in bytes. | No | | ||
| `message.content` | The body or main contents of the message. If binary, then message should be Base64 encoded. | No | | ||
|
||
The `message.id` MUST be calculated as two different counters starting from `1`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be called a sequence number? It's not clear that message-ids must be sequential integers, nor who is responsible for calculating them. I'd be more comfortable calling them message-ids and not mentioning counters. Some implementations may use counters, some may not.
specification/data-events.md
Outdated
In case of unary calls only one sent and one received message will be recorded | ||
for both client and server spans. | ||
|
||
Most exporters will likely drop the `message.content` attribute if present. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tangentially, it makes me wonder why we don't have an API to declare keys with metadata like a description, some kind of priority as discussed here, potentially type information, as well as a clearly stated namespace.
I'm confused about the use case for when you would use this |
@kbrockhoff when would you use |
Message span events are for spans where multiple messages are sent in the same span. For example, gRPC client-side streaming, server-side streaming, bidirectional streaming variants. Another example is HTTP server-sent events. I also see this used even for RPC-type spans where only one message is sent and received. It would be the consistent place where message size and content is recorded rather than in span attributes. |
Most of these event attributes are already mentioned in the RPC semantic conventions. https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-rpc.md This PR just separates them out to emphasize they are applicable to more than gRPC. The only addition is 'message.content' which is important for OpenTelemetry adoption by applications that currently don't have any observability instrumentation. |
I would think you should simply create multiple spans in that case. I can see the benefit of having to specify the messaging destination only once though (maybe related: #274). So are these events supposed to only occur on messaging spans, or any span? If any span, are they related to "messaging"/pubsub at all? Or are these for any network communication? |
@kbrockhoff I would also expect that one would just create multiple (child) spans with the messaging attributes (#418) set on them. With the events as well we would have two competing ways of reporting messages without a clear distinction on when to use what. @bogdandrutu You introduced the events defined in https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-rpc.md#events in PR #7. Do you think we could settle for reporting messages as spans or are there reasons for representing them as events? |
It does not make sense to treat each message as a separate span in some streaming cases. This is real code backed by a process that takes 10-15 minutes: @GetMapping(path = "/awards-processes/{id}/events", produces = MediaType.APPLICATION_STREAM_JSON_VALUE)
public Flux<AwardEvent> getAwardProcessEventStream(@PathVariable("id") String id) {
AwardsProcessor processor = processorMap.get(id);
if (processor == null ||
"COMPLETED".equals(processor.getCurrentStatus()) || "FAILED".equals(processor.getCurrentStatus())) {
return Flux.fromStream(() -> processAwardsService.getEventsForAwardProcess(id).stream());
} else {
return processor.getEventStream();
}
} |
@kbrockhoff I assume you would want to trace every single message in the event stream on the span here? Maybe we could solve this in/after #418 by amending it with wording such as
This sounds in fact useful for any semantic convention, so maybe we can find a generic wording to add to https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-semantic-conventions.md#span-conventions. |
## Message event attributes | ||
|
||
Each message sent/received within a span should be recorded as an event. In the | ||
case of synchronous RPC calls there will be one sent and one received event per |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should the recommended limit on number of messages associated with the span defined?
| :------------- | :------------------------------------- | --------- | | ||
| `message.type` | Either `"SENT"` or `"RECEIVED"`. | Yes | | ||
| `message.id` | Unique identifier within a span and `message.type` for the individual message. Further specifications for common protocols are discussed below. | No | | ||
| `message.compressed_size` | Compressed size in bytes. | No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it always known that the message was compressed? Perhaps size
will be size of a message, and than two separate sizes for compressed and uncompressed be specified.
| `message.id` | Unique identifier within a span and `message.type` for the individual message. Further specifications for common protocols are discussed below. | No | | ||
| `message.compressed_size` | Compressed size in bytes. | No | | ||
| `message.uncompressed_size` | Uncompressed size in bytes. | No | | ||
| `message.content` | The body or main contents of the message. If binary, then message should be Base64 encoded. | No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should there be an attribute for the metadata/headers? May be more valuable than the content of the message and cheaper to collect.
|
||
To conserve bandwith and/or storage, exporters MAY drop the `message.content` | ||
attribute if present. Logging-only exporters bundled with default OpenTelemetry | ||
SDKs SHOULD provide affordances for logging this information as it is highly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of instrumentation adapters passing an additional configuration may sounds ok. However if messaging SDK is pre-instrumented with OpenTelemetry - it will be a setting on messaging SDK itself. Perhaps this paragraph can be rephrased to express this
| Attribute name | Notes and examples | Required? | | ||
| :------------- | :------------------------------------- | --------- | | ||
| `message.type` | Either `"SENT"` or `"RECEIVED"`. | Yes | | ||
| `message.id` | Unique identifier within a span and `message.type` for the individual message. Further specifications for common protocols are discussed below. | No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
specify the attribute type. string
?
|
||
| Attribute name | Notes and examples | Required? | | ||
| :------------- | :------------------------------------- | --------- | | ||
| `message.type` | Either `"SENT"` or `"RECEIVED"`. | Yes | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be boolean or more types are expected?
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
This was merged in a different PR. Available in https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/messaging.md |
Standard attributes for message and error events.