Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Common event attribute names #397

Closed

Conversation

kbrockhoff
Copy link
Member

Standard attributes for message and error events.

Copy link
Contributor

@jmacd jmacd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I'm a but worried that we're rushing into semantic conventions for a logging system, here. I'd like to keep it as restricted as possible until there's more attention to give this broad topic.

| :------------- | :------------------------------------- | --------- |
| `error.kind` | The type or "kind" of an error. E.g., `"Exception"`, `"OSError"` | Yes |
| `error.message` | A concise, human-readable, one-line message explaining the event. E.g., `"Could not connect to backend"`, `"Cache invalidation succeeded"` | Yes |
| `error.object` | For languages that support such a thing (e.g., Java, Python), the actual Throwable/Exception/Error object instance itself. E.g., A `java.lang.UnsupportedOperationException` instance, a python `exceptions.NameError` instance | No |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest error.type here. It's not clear how to set error.kind when you have a definite type name, though. Do we need both?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with error.type. These came from the OpenTracing spec.

| `error.kind` | The type or "kind" of an error. E.g., `"Exception"`, `"OSError"` | Yes |
| `error.message` | A concise, human-readable, one-line message explaining the event. E.g., `"Could not connect to backend"`, `"Cache invalidation succeeded"` | Yes |
| `error.object` | For languages that support such a thing (e.g., Java, Python), the actual Throwable/Exception/Error object instance itself. E.g., A `java.lang.UnsupportedOperationException` instance, a python `exceptions.NameError` instance | No |
| `error.stack` | A stack trace in platform-conventional format; may or may not pertain to an error. E.g., `"File \"example.py\", line 7, in \<module\>\ncaller()\nFile \"example.py\", line 5, in caller\ncallee()\nFile \"example.py\", line 2, in callee\nraise Exception(\"Yikes\")\n"` | No |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you allow this to include the caused-by fragments of a Java exception?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you ever want to record the location where the error is being recorded as distinct from the location where an exception occurred? I think of these as the location of the log statement vs the location of the throw statement, for example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have submitted an OTEP open-telemetry/oteps#69 addressing what the value of error.object should be. It would include all of the data items mentioned above.

| Attribute name | Notes and examples | Required? |
| :------------- | :------------------------------------- | --------- |
| `error.kind` | The type or "kind" of an error. E.g., `"Exception"`, `"OSError"` | Yes |
| `error.message` | A concise, human-readable, one-line message explaining the event. E.g., `"Could not connect to backend"`, `"Cache invalidation succeeded"` | Yes |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's going to be a desire to record information events that are not errors. It would be nice to have a "message" attribute and a reserved event name for those.


Event `"name"` MUST be `"message"`.

Message events MUST be associated with a tracing span.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel there needs to be some guidance as to when we add these attributes on an event vs on a span. It's fine to say that they should always be an event, even when the span only have one "event" such as an http request.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we should always use them as event attributes even though we know only one event can occur per Span.

Copy link
Contributor

@tedsuo tedsuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Messages, errors, DBs, and other semantic conventions are developing type systems. I feel like we need to review these type systems; right now there is no guidance as to what types we support, and what the semantics for that particular type may be.

For example, If my error type is Exception, how should I fill in the remaining fields? Does it matter?


| Attribute name | Notes and examples | Required? |
| :------------- | :------------------------------------- | --------- |
| `error.kind` | The type or "kind" of an error. E.g., `"Exception"`, `"OSError"` | Yes |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some places we use kind and in other places we use type to describe the same thing. So now we have message.type and error.kind, along with span.kind and db.type. We should standardize on a single convention for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW in the Metrics API spec I've avoided using "type" except when it refers to an actual programming language type, not a more abstract term. So, Instruments have kinds. The values they produce have types.

@arminru
Copy link
Member

arminru commented Dec 19, 2019

I don't quite understand what the proposed message event is about. When would I use this one and when would I create a (child) span with the attributes defined in #395?

| `message.type` | Either `"SENT"` or `"RECEIVED"`. | Yes |
| `message.id` | Incremented integer value within the parent span starting with `1`. | Yes |
| `message.compressed_size` | Compressed size in bytes. | No |
| `message.uncompressed_size` | Uncompressed size in bytes. | No |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just as an idea, maybe we should think about a bit shorter names :) if you have any suggestion I would be happy to hear :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't really need the word "uncompressed" in my opinion.

  • message.size - uncompressed size
  • message.compressed or message.compressed_size - compressed size

However, logging-only exporters will likely want to log it as this information
is highly useful during the early stages of developing a new application.

## Error event attributes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we split this PR into two (message and errors). I think we should consider if for errors we want to have a more specific helper API than just attributes, also the correlation with the Status.

So to make progress we can probably soon merge the message part and keep discussing the errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on splitting it up :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will take care of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


Event `"name"` MUST be `"message"`.

Message events MUST be associated with a tracing span.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we should always use them as event attributes even though we know only one event can occur per Span.

| `message.content` | The body or main contents of the message. If binary, then message should be Base64 encoded. | No |

The `message.id` MUST be calculated as two different counters starting from `1`
one for sent messages and one for received message. This way we guarantee that
Copy link
Member

@dyladan dyladan Jan 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Three different uses of one/1 in this sentence, which refer to message id and types of message ids are confusing. I would go with something along the lines of:

The message.id MUST be calculated as two different counters starting from 1, the first for sent messages and the second for received messages.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


## Message event attributes

Each message sent/received within a span should be recorded as an event. In the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to say that messages "MAY" be recorded as events. For a gRPC stream, in particular, I'm not sure that message events are always desired. I'd like the semantic convention to apply when the decision is made to record message events, but not to specify when you SHOULD record message events.

| `message.uncompressed_size` | Uncompressed size in bytes. | No |
| `message.content` | The body or main contents of the message. If binary, then message should be Base64 encoded. | No |

The `message.id` MUST be calculated as two different counters starting from `1`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be called a sequence number? It's not clear that message-ids must be sequential integers, nor who is responsible for calculating them. I'd be more comfortable calling them message-ids and not mentioning counters. Some implementations may use counters, some may not.

In case of unary calls only one sent and one received message will be recorded
for both client and server spans.

Most exporters will likely drop the `message.content` attribute if present.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tangentially, it makes me wonder why we don't have an API to declare keys with metadata like a description, some kind of priority as discussed here, potentially type information, as well as a clearly stated namespace.

@dyladan
Copy link
Member

dyladan commented Jan 28, 2020

I'm confused about the use case for when you would use this message event instead of a messaging type span?

@dyladan
Copy link
Member

dyladan commented Feb 4, 2020

@kbrockhoff when would you use message span events rather than a messaging type span?

@kbrockhoff
Copy link
Member Author

Message span events are for spans where multiple messages are sent in the same span. For example, gRPC client-side streaming, server-side streaming, bidirectional streaming variants. Another example is HTTP server-sent events. I also see this used even for RPC-type spans where only one message is sent and received. It would be the consistent place where message size and content is recorded rather than in span attributes.

@kbrockhoff
Copy link
Member Author

Most of these event attributes are already mentioned in the RPC semantic conventions. https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-rpc.md This PR just separates them out to emphasize they are applicable to more than gRPC.

The only addition is 'message.content' which is important for OpenTelemetry adoption by applications that currently don't have any observability instrumentation.

@Oberon00
Copy link
Member

Oberon00 commented Feb 5, 2020

I would think you should simply create multiple spans in that case. I can see the benefit of having to specify the messaging destination only once though (maybe related: #274). So are these events supposed to only occur on messaging spans, or any span? If any span, are they related to "messaging"/pubsub at all? Or are these for any network communication?

@arminru
Copy link
Member

arminru commented Feb 13, 2020

@kbrockhoff I would also expect that one would just create multiple (child) spans with the messaging attributes (#418) set on them. With the events as well we would have two competing ways of reporting messages without a clear distinction on when to use what.

@bogdandrutu You introduced the events defined in https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-rpc.md#events in PR #7. Do you think we could settle for reporting messages as spans or are there reasons for representing them as events?

@kbrockhoff
Copy link
Member Author

It does not make sense to treat each message as a separate span in some streaming cases. This is real code backed by a process that takes 10-15 minutes:

    @GetMapping(path = "/awards-processes/{id}/events", produces = MediaType.APPLICATION_STREAM_JSON_VALUE)
    public Flux<AwardEvent> getAwardProcessEventStream(@PathVariable("id") String id) {
        AwardsProcessor processor = processorMap.get(id);
        if (processor == null ||
                "COMPLETED".equals(processor.getCurrentStatus()) || "FAILED".equals(processor.getCurrentStatus())) {
            return Flux.fromStream(() -> processAwardsService.getEventsForAwardProcess(id).stream());
        } else {
            return processor.getEventStream();
        }
    }

@Oberon00
Copy link
Member

Oberon00 commented Feb 14, 2020

@kbrockhoff I assume you would want to trace every single message in the event stream on the span here?

Maybe we could solve this in/after #418 by amending it with wording such as

Any subset of these attributes MAY be set on events with the name "messaging" on the span. In that case, attributes that are always the same for all messaging events SHOULD be set on the span. This might even result in emtpy "messaging" events. If there is nothing in common between the messaging events, a full Span SHOULD be created for each message instead of using events.

This sounds in fact useful for any semantic convention, so maybe we can find a generic wording to add to https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-semantic-conventions.md#span-conventions.

## Message event attributes

Each message sent/received within a span should be recorded as an event. In the
case of synchronous RPC calls there will be one sent and one received event per
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the recommended limit on number of messages associated with the span defined?

| :------------- | :------------------------------------- | --------- |
| `message.type` | Either `"SENT"` or `"RECEIVED"`. | Yes |
| `message.id` | Unique identifier within a span and `message.type` for the individual message. Further specifications for common protocols are discussed below. | No |
| `message.compressed_size` | Compressed size in bytes. | No |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it always known that the message was compressed? Perhaps size will be size of a message, and than two separate sizes for compressed and uncompressed be specified.

| `message.id` | Unique identifier within a span and `message.type` for the individual message. Further specifications for common protocols are discussed below. | No |
| `message.compressed_size` | Compressed size in bytes. | No |
| `message.uncompressed_size` | Uncompressed size in bytes. | No |
| `message.content` | The body or main contents of the message. If binary, then message should be Base64 encoded. | No |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should there be an attribute for the metadata/headers? May be more valuable than the content of the message and cheaper to collect.


To conserve bandwith and/or storage, exporters MAY drop the `message.content`
attribute if present. Logging-only exporters bundled with default OpenTelemetry
SDKs SHOULD provide affordances for logging this information as it is highly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of instrumentation adapters passing an additional configuration may sounds ok. However if messaging SDK is pre-instrumented with OpenTelemetry - it will be a setting on messaging SDK itself. Perhaps this paragraph can be rephrased to express this

| Attribute name | Notes and examples | Required? |
| :------------- | :------------------------------------- | --------- |
| `message.type` | Either `"SENT"` or `"RECEIVED"`. | Yes |
| `message.id` | Unique identifier within a span and `message.type` for the individual message. Further specifications for common protocols are discussed below. | No |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

specify the attribute type. string?


| Attribute name | Notes and examples | Required? |
| :------------- | :------------------------------------- | --------- |
| `message.type` | Either `"SENT"` or `"RECEIVED"`. | Yes |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be boolean or more types are expected?

@carlosalberto carlosalberto added the area:semantic-conventions Related to semantic conventions label Jun 12, 2020
@reyang reyang added release:required-for-ga Must be resolved before GA release, or nice to have before GA release:after-ga Not required before GA release, and not going to work on before GA and removed release:required-for-ga Must be resolved before GA release, or nice to have before GA labels Jul 6, 2020
@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 12, 2020
@bogdandrutu
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions release:after-ga Not required before GA release, and not going to work on before GA Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants