-
Notifications
You must be signed in to change notification settings - Fork 182
Conversation
rfc/trace_identifiers.md
Outdated
# Risk Assessment | ||
Because this proposal includes the exposure of new information, and adds entirely new concepts to the interface, some risks exist. | ||
|
||
Existing tracers may not be able to support this feature, as their internal model does not include and client-side trace identifiers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
include any client-side trace identifiers
rfc/trace_identifiers.md
Outdated
|
||
The `Trace-Parent` header contains the following fields: `version`, `trace-id`, `span-id`, and `trace-options`. | ||
|
||
`Trace-id` is the ID of the whole trace forest. It is represented as a 16-bytes array, e.g.,4bf92f3577b34da6a3ce929d0e0e4736. All bytes 0 is considered invalid. Implementation may decide to completely ignore the Trace-Parent if the trace-id is invalid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: make these bullet points, for easier reading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, consider a table:
field | format | example | description |
---|---|---|---|
trace-id |
16-byte array | 4bf92f3577b34da6a3ce929d0e0e4736 | The ID of the whole trace forest. If all bytes are 0, the Trace-Parent may be ignored. |
span-id |
8-byte array | 00f067aa0ba902b7 | The ID of the caller span (parent). If all bytes are 0, the Trace-Parent may be ignored. |
(here's the raw markdown)
| field | format | example | description |
| :--- | :--- | :--- | :--- |
| `trace-id` | 16-byte array | 4bf92f3577b34da6a3ce929d0e0e4736 | The ID of the whole trace forest. If all bytes are 0, the `Trace-Parent` may be ignored. |
| `span-id` | 8-byte array | 00f067aa0ba902b7 | The ID of the caller span (parent). If all bytes are 0, the `Trace-Parent` may be ignored. |
rfc/trace_identifiers.md
Outdated
## B3 HTTP Headers | ||
The [B3 HTTP headers](https://github.com/openzipkin/b3-propagation) are widely adopted, mostly by Zipkin-like tracing systems. The B3 protocol includes `X-B3-TraceId` and `X-B3-SpanId` as required headers. | ||
|
||
`TraceId` is 64 or 128-bit in length and indicates the overall ID of the trace. Every span in a trace shares this ID. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bullet points
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, consider a table:
| field | format | description |
| :--- | :--- | :--- |
| `TraceId` | 64 or 128-bit | The ID of the trace. Every span in a trace shares this ID. |
| `SpanId` | 64-bit | Indicates the position of the current operation in the trace tree. The value may or may not be derived from the value of the `traceId`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the table approach!
rfc/trace_identifiers.md
Outdated
* Strongly supported across many languages, and commonly used for transferring data between independent subsystems. | ||
|
||
## Alternate Formats | ||
In some cases, additional formats may be appropriate, if a language supports multiple common transport formats. Exposing accessors in other formats should be done to prevent double allocations while formating the identifiers. For example, converting from a tracer’s native format to a string may trigger an allocation. If there are many systems which want to consume the identifier in a format which requires an allocation when converting from a string, a second allocation could occur. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments about memory allocations might be better moved to the Risks section, as a performance consideration. It could be relevant even in cases when we don't introduce additional exposition formats, e.g. when the internal representation that the tracer uses is not a string. Many existing tracers represent span id as uint64, so to support the new API change efficiently they might want to cache the string representation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, makes sense.
rfc/trace_identifiers.md
Outdated
|
||
The `Trace-Parent` header contains the following fields: `version`, `trace-id`, `span-id`, and `trace-options`. | ||
|
||
`Trace-id` is the ID of the whole trace forest. It is represented as a 16-bytes array, e.g.,4bf92f3577b34da6a3ce929d0e0e4736. All bytes 0 is considered invalid. Implementation may decide to completely ignore the Trace-Parent if the trace-id is invalid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, consider a table:
field | format | example | description |
---|---|---|---|
trace-id |
16-byte array | 4bf92f3577b34da6a3ce929d0e0e4736 | The ID of the whole trace forest. If all bytes are 0, the Trace-Parent may be ignored. |
span-id |
8-byte array | 00f067aa0ba902b7 | The ID of the caller span (parent). If all bytes are 0, the Trace-Parent may be ignored. |
(here's the raw markdown)
| field | format | example | description |
| :--- | :--- | :--- | :--- |
| `trace-id` | 16-byte array | 4bf92f3577b34da6a3ce929d0e0e4736 | The ID of the whole trace forest. If all bytes are 0, the `Trace-Parent` may be ignored. |
| `span-id` | 8-byte array | 00f067aa0ba902b7 | The ID of the caller span (parent). If all bytes are 0, the `Trace-Parent` may be ignored. |
rfc/trace_identifiers.md
Outdated
## B3 HTTP Headers | ||
The [B3 HTTP headers](https://github.com/openzipkin/b3-propagation) are widely adopted, mostly by Zipkin-like tracing systems. The B3 protocol includes `X-B3-TraceId` and `X-B3-SpanId` as required headers. | ||
|
||
`TraceId` is 64 or 128-bit in length and indicates the overall ID of the trace. Every span in a trace shares this ID. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, consider a table:
| field | format | description |
| :--- | :--- | :--- |
| `TraceId` | 64 or 128-bit | The ID of the trace. Every span in a trace shares this ID. |
| `SpanId` | 64-bit | Indicates the position of the current operation in the trace tree. The value may or may not be derived from the value of the `traceId`. |
rfc/trace_identifiers.md
Outdated
|
||
The `Trace-Parent` header contains the following fields: `version`, `trace-id`, `span-id`, and `trace-options`. | ||
|
||
`Trace-id` is the ID of the whole trace forest. It is represented as a 16-bytes array, e.g.,4bf92f3577b34da6a3ce929d0e0e4736. All bytes 0 is considered invalid. Implementation may decide to completely ignore the Trace-Parent if the trace-id is invalid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the Trace-Context spec changes, these format specifications will become stale. Perhaps we could include the spec version described here as a signal to make sure you check the Trace-Context spec as the source of truth. (Same with the B3 protocol below.) Alternatively, just add a warning that these formats are subject to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
## Trace-Context HTTP Headers | ||
[Trace-Context HTTP headers](https://github.com/w3c/distributed-tracing) are in the process of being standardized via the w3c. The tracing community has voiced strong support in implementing these headers for use in tracing interop. | ||
|
||
The `Trace-Parent` header contains the following fields: `version`, `trace-id`, `span-id`, and `trace-options`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence nicely orients you to the context of trace-id
and span-id
within the header. A similar sentence under the B3 Headers section would also be nice.
rfc/trace_identifiers.md
Outdated
**Current State:** Draft | ||
**Author:** [tedsuo](https://github.com/tedsuo) | ||
|
||
The Opentracing model of computation specifies two primary object types, `Spans` and `Traces`, but does not specify identifiers for these objects. The lack of identifiers makes it very difficult to correlate tracing data with data in other systems. This complicates a number of important tasks, and prevents the creation of reusable trace observers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Opentracing/OpenTracing. In fact, I might reword the opening with a more positive bent and explain the change first, then the why, and link to the use cases you explain further down.
The OpenTracing
SpanContext
interface extends to includeSpanID
andTraceID
accessors. Identifiers for the two primary object types make it easier to correlate tracing data with data in other systems, simplify important tasks, and allow the creation of reusable trace observers. Some use cases are detailed below.
## Backwards Compatibility and Optional Support | ||
The OpenTracing specification does not currently require trace and span identifiers. To continue support for existing tracers, the empty string value can be returned when no ID has been set. | ||
|
||
# Use Cases |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to use cases! At the top you mention that the lack of identifiers "complicates a number of important tasks" in addition to preventing use in systems like logging and reusable tracer observers (which are each called out in separate sections). What are these other important tasks that IDs facilitate?
rfc/trace_identifiers.md
Outdated
# Use Cases | ||
|
||
## Log Correlation | ||
The primary expected consumer for Trace-Context identifiers are logging systems which run independently from the tracing system. Request level log indexing has become a common practice in logging, and the tracing system contains the mechanism for propagating the relevant identifiers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this is missing the "therefore" at the end, i.e.
- Logging systems are primary consumers (sources?) of trace data but they can't directly access the data
- Including request identifiers is a common logging practice
- Tracing systems know these identifiers, therefore it makes sense for them to provide accessors
rfc/trace_identifiers.md
Outdated
* Correlating logs, as mentioned above. | ||
|
||
# Risk Assessment | ||
Because this proposal includes the exposure of new information, and adds entirely new concepts to the interface, some risks exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably be more specific about these risks and implications. On the last call, Chris raised the issue that the ID returned might not be the one you're expecting (for example, the implementation's concept of a trace ID rather than the one propagated in the header).
Is there also concern about being too closely tied to protocol specifics? Should we be explicit the decision that trace and span IDs are "special cases" and worth the risk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there's general agreement that this is useful for secondary systems, and since most tracers support these identifiers (and many are looking to standardize on Trace-Context) this is a good time to add them.
Part of the spec is that you should not assume anything about the value returned, other than it's uniqueness. Traces are globally unique, spans are unique within a trace. These are the only two properties a caller should depend on if they want to write observers which are vendor-neutral.
Happy to add more language making this clearer.
rfc/trace_identifiers.md
Outdated
## Trace Observers | ||
The OpenTracing community would like to develop secondary observation systems which utilize the tracing runtime, but are tracer independent. Examples include: | ||
|
||
* Generating metrics from tracing data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick that bullet points typically do not include periods. Also, it might be nice to add a "therefore" to the introduction, like
Trace and span identifiers would allow these tracer observers to create trace metadata without having to pay attention to the request headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Get it? Point? Like bullet point?
Sorry.
rfc/trace_identifiers.md
Outdated
* `TraceID`, accessible as a **string**. | ||
* `SpanID`, accessible as a **string**. | ||
|
||
**String** values are used for identifiers. In this context, a string is defined as an immutable, variable length sequence of unicode characters. A string is preferred over other formats for the following reasons: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps call out here that an empty string is a valid return type.
* Improved Risks and Use Cases * Tables for API descritions * Nits
@erabug @yurishkuro thanks for the feedback! I've made changes based on your suggestions. One pedantic note: I like the way the tables look. But, technically, tables are not part of basic markdown, only GitHub. We've discussed avoiding whatever-flavored markdown in our docs, since it complicated which tool you use to render them. Is this a red flag for RFCs as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 👍
Just curious: Has it been discussed if using Object
instead of String
would be an option to prevent allocations? I guess it doesn' make much sense as most users of these IDs need a human-readable format, so having e.g. a byte array wouldn't be of much help anyway, right?
rfc/trace_identifiers.md
Outdated
|
||
The OpenTracing SpanContext interface is extended to include `SpanID` and `TraceID` accessors. | ||
|
||
The Opentracing model of computation specifies two primary object types, `Spans` and `Traces`, but does not specify identifiers for these objects. Identifiers for the two primary object types make it easier to correlate tracing data with data in other systems, simplify important tasks, and allow the creation of reusable trace observers. Some use cases are detailed below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "OpenTracing" instead of "Opentracing"
rfc/trace_identifiers.md
Outdated
## Extra Allocations and Overhead | ||
Internally, tracers do not always use strings to represent their identifiers. So there is a conversion cost when using these accessors. | ||
|
||
While a single allocation may be inevitable, exposing accessors in additional formats could be done to prevent double allocations while formating the identifiers. For example, converting from a tracer’s native format to a string may trigger an allocation. If there are many systems which want to consume the identifier in a format which requires an allocation when converting from a string, a second allocation could occur. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
formating
should be formatting
rfc/trace_identifiers.md
Outdated
## Extra Allocations and Overhead | ||
Internally, tracers do not always use strings to represent their identifiers. So there is a conversion cost when using these accessors. | ||
|
||
While a single allocation may be inevitable, exposing accessors in additional formats could be done to prevent double allocations while formating the identifiers. For example, converting from a tracer’s native format to a string may trigger an allocation. If there are many systems which want to consume the identifier in a format which requires an allocation when converting from a string, a second allocation could occur. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I picture for a C API would not incur allocation in the library. Just ask the user for a buffer.
#define TRACE_ID_STR_LEN 16
int span_context_trace_id(const custom_span_context* ctx, void* buf, int len)
{
return snprintf(buf, len, "%" PRIx64 "%016" PRIx64, span_context->trace_id.high, span_context->trace_id.low);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good approach! In many languages, this option will probably not be available, as the APIs you are trying to put the data into often will not support buffer types (or the language may not have them at all), but for C this is good.
@cwe1ss if the only issue were how to represent the array, this at least could be done statically as a fixed-size array. I think the bigger issue here is why expose only a string interface and not also a raw byte interface? |
@isaachier @cwe1ss regarding the use of |
@tedsuo can we set a cap on the maximum size the string can be. This would simplify the C API considerably. |
@isaachier a cap is a good idea. There is surely a reasonable upper bound, possibly 16-bytes. |
@isaachier I added a section on length and formatting under Risks: Would you be willing to move forwards with testing a |
rfc/trace_identifiers.md
Outdated
@@ -60,12 +60,12 @@ The primary expected consumer for Trace-Context identifiers are logging systems | |||
Log indexing has become a common practice, often by including a request identifier in the log. In the past, this has involved manually propagating these identifiers as headers. However, systems which using OpenTracing automatically propagate these identifiers via the Inject/Extract interface. Some of these identifiers are user-generated, and contained in Baggage. However, the most relevant identifiers for log indexing are the Trace and Span IDs. Therefore, exposing these values would be immensley valuable. | |||
|
|||
## Trace Observers | |||
The OpenTracing community would like to develop secondary observation systems which utilize the tracing runtime, but are tracer independent. Examples include: | |||
The OpenTracing community would like to develop secondary observation systems which utilize the tracing runtime, but are tracer independent. Trace and span identifiers would allow these observers to correlate tracing data without having knowledge of the wire protocol or tracing implemnetation. Examples include: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "implemnetation"
Ya no worries I'm fine with this approach. |
rfc/trace_identifiers.md
Outdated
|
||
| field | format | description | | ||
| :--- | :--- | :--- | | ||
| `TraceId` | 64 or 128-bit | The ID of the trace. Every span in a trace shares this ID. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be consistent in describing the length in bits vs bytes (in TraceContext, you wrote the length is 16 bytes) . Bits is more commonly used I feel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I normalized on bits, and included a note about character space (HEXDIG vs opaque).
* bits instead of bytes * include description of character space
Proposal
https://github.com/opentracing/specification/blob/tedsuo/trace-identifiers/rfc/trace_identifiers.md
The Opentracing model of computation specifies two primary object types, Spans and Traces, but does not specify identifiers for these objects. The lack of identifiers makes it very difficult to correlate tracing data with data in other systems. This complicates a number of important tasks, and prevents the creation of reusable trace observers.
To address these difficulties, the OpenTracing SpanContext interface is extended to include SpanID and TraceID accessors.
Prior Proposals
#107 Trace-Parent Accessors