Add HTTP Correlation protocol #18887
Could you please share some information about the planned process for this standard? My main questions are:

Since this involves interoperability with other systems/languages, I'd like to ping some people who are involved in open source tracers and the OpenTracing project; maybe someone could share their opinion about this proposed standard. @adriancole (Zipkin) @basvanbeek (Zipkin) @bhs (Lightstep) @pavolloffay (Hawkular) @wu-sheng (Sky-Walking) @yurishkuro (Jaeger) (feel free to ping other people)

PS: As always, I'm excited about the intention to get distributed tracing features into .NET. My main concern here is that a) this standard will not be adopted by anyone outside of Microsoft and b) hard-coding this standard will make it very hard to integrate .NET apps with existing tracers. Also, any changes during standardization will lead to more legacy code in the .NET framework. I'd prefer to see all of this shipped in a later version of .NET but with a well-discussed standard.
This document provide guidance for implementations of [HTTP Correlation Protocol](HttpCorrelationProtocol.md) without [Hierarchical Request-Id](HierarchicalRequestId.md) support or interoperability with services that do not support it.

We strongly recommend every implementation to support [Hierarchical Request-Id](HierarchicalRequestId.md) wherever possible. If implementation do not support it, it still MUST ensure essential requirements are met:
* `Request-Id` uniquely identifies every HTTP request involved in operation processing and MUST be generated for every incoming and outgoing request
Shouldn't this include parent ids? (They are noted in the docs above and also later here in the examples.)
A Request-Id is generated for every outgoing request and becomes the parent for the incoming request on the downstream service.
Message | Component Name | Context |
---|---|---|
user starts request to service-a | user | |
incoming request | service-a | Request-Id=abc; Parent-Request-Id=; Id=123 |
request to service-b | service-a | Request-Id=def; Parent-Request-Id=abc; Id=123 |
incoming request | service-b | Request-Id=ghi; Parent-Request-Id=def; Id=123 |
response | service-b | Request-Id=ghi; Parent-Request-Id=def; Id=123 |
response from service-b | service-a | Request-Id=def; Parent-Request-Id=abc; Id=123 |
response | service-a | Request-Id=abc; Parent-Request-Id=; Id=123 |
response from service-a | user | |
So the logging system logs the Request-Id from the incoming request headers as Parent-Request-Id and logs a new unique Request-Id.
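The parent/child bookkeeping in the table above could be sketched roughly as follows. This is an illustration in Python, not the .NET implementation: the helper names `new_request_id`, `on_incoming` and `on_outgoing` are hypothetical, `uuid4` is just one possible way to get unique ids, and the shared `Id=123` column is left out because the comment does not say how it is propagated.

```python
import uuid


def new_request_id():
    # Assumption: any sufficiently unique generator would do; uuid4 is used for brevity.
    return uuid.uuid4().hex


def on_incoming(headers):
    """Log context for an incoming request: the received Request-Id becomes the parent."""
    return {
        "Request-Id": new_request_id(),
        "Parent-Request-Id": headers.get("Request-Id", ""),
    }


def on_outgoing(current):
    """Log context and headers for an outgoing request made while handling `current`."""
    context = {
        "Request-Id": new_request_id(),
        "Parent-Request-Id": current["Request-Id"],
    }
    return context, {"Request-Id": context["Request-Id"]}


# service-a receives the user's request (no Request-Id header), then calls service-b.
a_in = on_incoming({})
a_out, headers_to_b = on_outgoing(a_in)
b_in = on_incoming(headers_to_b)

for name, ctx in [("service-a in", a_in), ("service-a out", a_out), ("service-b in", b_in)]:
    print(name, ctx)
```

Each of the three log contexts gets its own Request-Id, and each Parent-Request-Id points at the id one step upstream, which matches the rows of the table.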
- [HTTP Header encoding RFC5987](https://tools.ietf.org/html/rfc5987)
- De-facto overall HTTP headers size is limited to several kilobytes (depending on a web server)

# Industry standards
For some definition of standard, there's a context propagation format you can look at (if nothing else, as another example): w3c/trace-context#1. There are other issues on the same repo as well, involving many parties who are exploring it: w3c/trace-context#4. cc @bogdandrutu
Thanks! This is really good to know. So this is a standardization of the Zipkin format, correct?
@lmolkova it's part of a broader proposal that a couple parties including Zipkin and Google are working on for distributed trace propagation and submission. We're planning on reaching out more widely once we have more documentation and code samples available, but I'd be glad to lay out the plan and longer-term goals if you're interested.
Implementations SHOULD support hierarchical structure for the Request-Id, described in [Hierarchical Request-Id document](HierarchicalRequestId.md).
See [Flat Request-Id](FlatRequestId.md) for non-hierarchical Request-Id requirements.

## Correlation-Context
So this is created once, then passed read-only afterwards. Interesting... What is interesting in general is that your formats seem to include, but are not limited to, tracing.
cc @nbmorgan, who recently asked about an agnostic place to do exactly what's described here.
This is mostly meant to control logging: a service may pass sampling rate/flags there that could be used by downstream services. It's also valid to propagate other tracing-related information: feature flags, some Ids that could be useful to group traces by.
We discourage anyone from using it for non-tracing purposes. Headers have significant size limitations, and parsing, propagating and just storing Correlation-Context in memory involve performance overhead.
A service in general does not know which web server is used somewhere downstream, and exceeding the header size limit may create a problem that is hard to find.
Also, depending on the implementation, a service may have little or no control over where Correlation-Context is propagated to: it could be external systems, cloud storage, etc. Even if a service trusts its immediate callee, the callee will propagate it further. So it also creates security concerns.
And finally, there are better ways and cleaner designs to pass non-tracing data between the services :)
What morgan said :) Incidentally, we did reverse-document the format used primarily in Zipkin a while back as well: https://github.com/openzipkin/b3-propagation
On 27 Apr 2017 4:45 am, Liudmila Molkova commented on src/System.Diagnostics.DiagnosticSource/src/HttpCorrelationProtocol.md:
> +`Correlation-Context` is represented as comma separated list of key value pairs, where each pair is represented in key=value format:
+
+`Correlation-Context: key1=value1, key2=value2`
+
+Neither keys nor values MUST NOT contain "=" (equals) or "," (comma) characters.
+
+Overall Correlation-Context length MUST NOT exceed 1024 bytes, key and value length should stay well under the combined limit of 1024 bytes.
+
+Note that uniqueness of the key within the Correlation-Context is not guaranteed. Context received from upstream service is read-only and implementation MUST not remove or aggregate duplicated keys.
+
+# HTTP Guidelines and Limitations
+- [HTTP 1.1 RFC2616](https://tools.ietf.org/html/rfc2616)
+- [HTTP Header encoding RFC5987](https://tools.ietf.org/html/rfc5987)
+- De-facto overall HTTP headers size is limited to several kilobytes (depending on a web server)
+
+# Industry standards
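The `Correlation-Context` rules in the hunk quoted above (comma-separated `key=value` pairs, a 1024-byte cap, duplicate keys left untouched) could be sketched roughly as follows. This is a Python illustration only, not the .NET implementation; the function name `parse_correlation_context`, the decision to raise on malformed input, and the trimming of whitespace around pairs are all assumptions of the sketch.

```python
MAX_CONTEXT_BYTES = 1024  # overall limit stated in the quoted text


def parse_correlation_context(header_value):
    """Parse a Correlation-Context header value into an ordered list of (key, value) pairs.

    A list is used instead of a dict so duplicate keys survive unchanged,
    since the quoted text says duplicates must not be removed or aggregated.
    """
    if len(header_value.encode("utf-8")) > MAX_CONTEXT_BYTES:
        raise ValueError("Correlation-Context exceeds 1024 bytes")

    pairs = []
    for item in header_value.split(","):
        item = item.strip()  # assumption: surrounding whitespace is not significant
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Keys and values must not contain "=", so exactly one "=" is expected per pair.
        if not sep or "=" in value:
            raise ValueError(f"malformed pair: {item!r}")
        pairs.append((key.strip(), value.strip()))
    return pairs


print(parse_correlation_context("key1=value1, key2=value2"))
# [('key1', 'value1'), ('key2', 'value2')]
```

Returning pairs in received order keeps the upstream context effectively read-only: a caller can append its own entries without rewriting what it received.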
Great. I think I meant to say the parent id should be defined in the file with the other terms (currently it is not).
On 27 Apr 2017 4:41 am, Liudmila Molkova commented on src/System.Diagnostics.DiagnosticSource/src/FlatRequestId.md:
> @@ -0,0 +1,99 @@
+# Flat Request-Ids
+This document provide guidance for implementations of [HTTP Correlation Protocol](HttpCorrelationProtocol.md) without [Hierarchical Request-Id](HierarchicalRequestId.md) support or interoperability with services that do not support it.
+
+We strongly recommend every implementation to support [Hierarchical Request-Id](HierarchicalRequestId.md) wherever possible. If implementation do not support it, it still MUST ensure essential requirements are met:
+* `Request-Id` uniquely identifies every HTTP request involved in operation processing and MUST be generated for every incoming and outgoing request
:) So I might put some version of this advisory in the text, because some folks will otherwise hijack this field to carry random things like auth tokens or feature flags, to your chagrin!
On 27 Apr 2017 5:03 am, Liudmila Molkova commented on src/System.Diagnostics.DiagnosticSource/src/HttpCorrelationProtocol.md:
> +When Request-Id is provided by upstream service, there is no guarantee that it is unique within the entire system.
+Implementation SHOULD make it unique by adding small suffix to incoming Request-Id to represent internal activity and use it for outgoing requests, see more details in [Hierarchical Request-Id document](HierarchicalRequestId.md).
+
+`Request-Id` is required field, which means that every instrumented request MUST have it. If implementation does not find `Request-Id` in the incoming request headers, it should consider it as non-traced and MAY not look for `Correlation-Context`.
+
+It is essential that 'incoming' and 'outgoing' Request-Ids are included in the telemetry events, so implementation of this protocol MUST provide read access to Request-Id for logging systems.
+
+### Request-Id Format
+`Request-Id` is a string up to 1024 bytes length. It contains only [Base64](https://en.wikipedia.org/wiki/Base64) and "-" (hyphen), "|" (vertical bar), "." (dot), and "_" (underscore) characters.
+
+Vertical bar, dot and underscore are reserved characters that used to mark and delimit hierarchical Request-Id, and must not be present in the nodes. Hyphen may be used in the nodes.
+
+Implementations SHOULD support hierarchical structure for the Request-Id, described in [Hierarchical Request-Id document](HierarchicalRequestId.md).
+See [Flat Request-Id](FlatRequestId.md) for non-hierarchical Request-Id requirements.
+
+## Correlation-Context
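The "Request-Id Format" paragraph in the hunk quoted above (at most 1024 bytes; only Base64 characters plus hyphen, vertical bar, dot and underscore) could be checked with something like the sketch below. It is a Python illustration under assumptions: the name `is_valid_request_id` is hypothetical, and "Base64 characters" is read here as `A-Z`, `a-z`, `0-9`, `+`, `/` and `=`, which the thread does not spell out.

```python
import re

MAX_REQUEST_ID_BYTES = 1024

# Assumed Base64 alphabet (including "=" padding) plus the four extra characters
# named in the quoted text: "-", "|", "." and "_".
_ALLOWED = re.compile(r"[A-Za-z0-9+/=|._-]+")


def is_valid_request_id(value):
    """Return True if `value` fits the length and character limits of the quoted format."""
    if len(value.encode("utf-8")) > MAX_REQUEST_ID_BYTES:
        return False
    return _ALLOWED.fullmatch(value) is not None


print(is_valid_request_id("|9e74f0e5-efc4-41b5-86d1-3524a43bd891.bcec871c_1."))  # True
print(is_valid_request_id("abc def"))  # False (space is not an allowed character)
```

A check like this would suit ids an implementation generates itself; whether to apply it to incoming ids is a separate design question that the thread leaves open.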
@cwe1ss Thanks for sharing it! I believe we already touched on this topic in #16393: there is no worldwide standard (at least for now), which is why we are creating this one. And in any case we want to support interoperability with other tracing systems in .NET. Regarding the plans for the standard: we are considering publishing it as an Informational RFC first, in the next few months.
Coming here from OpenTracing. Glad to see .Net and MSFT interested in this. What types of interoperability are expected to be solved by this standard? I would like to have more context so I can attune my feedback correctly. For example, is this expected to enable blind interoperability between tracing systems? What does interoperability mean in this case? While these headers look like a reasonable implementation for an "indexed log" style of tracer, hierarchical request ids and uncompressed baggage have severe limitations. How are tracing systems with more effective implementations expected to interact with this standard?
FYI: @JonathanMace could you see Tracing Plane working with these headers? Or are they separate?
@tedsuo @adriancole and everyone else, we really appreciate your feedback and want to address any concerns you have. Unfortunately I'm quite busy with some other things now. I'll get back to you in the next few days; sorry for the delay.
@lmolkova is this part of the 2.0 release, or not part of that?
@danmosemsft it is a part of 2.0; however, it is a pure documentation update and could be merged any time after the 2.0 freeze. @tedsuo @adriancole Sorry, it is taking longer than I expected to answer your questions. I did not forget about you.
PS: if you want, you can discuss this at our next tracing workshop: https://docs.google.com/document/d/12erSNS8gUIH6t2x4Om-HC4wUH20RrfyxMN2dJ0ILf6U/edit#
@lmolkova any update on this? Can you please move the PR forward towards merge?
80e8a8d to bab520c
This standard does not intend to solve interoperability issues. We made it quite flexible, so we allow almost ANY format on the incoming Request-Id. This protocol and its implementation are the first iteration and the minimum required to correlate telemetry. We may change things based on your and other community feedback. We will also review TraceContext and the proposal Zipkin and Google are working on (as @mtwo mentioned) and see what we can do to interoperate with it or support it.
I added more details on 'parent' Id logging and Correlation-Context limitations/misuse, thanks! It would be great to discuss it at the workshop; unfortunately I will be traveling during it, probably without internet access. I will try to attend remote sessions when possible. Thanks for the information anyway!
Finally moving it towards merge. I'm fine with having docs in master (2.1) only.
Ping?
@lmolkova thanks for your response. Personally, I like many things about this, but I'm not totally sold on the HierarchicalRequestId approach. It seems like you need to either have short ids, few hops between services, or hit the header limit. Most systems will work by indexing on the trace id, request id and parent id, so the rest of the hierarchy will probably not be necessary in practice. Another issue is accounting for joins, when codepaths have multiple parents. I don't need answers for any of these questions, just food for thought when you work on future iterations. Cheers,
@lmolkova ping? If you're not ready yet, let's close the PR and reopen it when you are ready to finish it off ...
@karelz Liudmila is on vacation. Can this PR wait till after the weekend?
Sure. If you won't have time to push it further in a week or two, please close it and reopen when you are ready. Thanks!
Hierarchical Request-Id look like `|<root-id>.<local-id1>.<local-id2>.` (e.g. `|9e74f0e5-efc4-41b5-86d1-3524a43bd891.bcec871c_1.`) and holds all information needed to trace whole operation and particular request.
Root-id serves as common identifier for all requests involved in operation processing and local-ids represent internal activities (and requests) done within scope of this operation.

[CorrelationVector](https://osgwiki.com/wiki/CorrelationVector) is valid hierarchical Request-Id, except it does not start with "|". Implementation SHOULD allow other schemes for incoming request identifiers.
I don't think this link is available from outside of MS. Please remove or copy the content publicly.
CC: @jacpull
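For the `|<root-id>.<local-id1>.<local-id2>.` shape quoted in the diff above, pulling out the root-id (the identifier shared by every request in an operation) might look like the sketch below. This is a Python illustration with assumptions: `split_hierarchical_id` is a hypothetical name, the id is assumed to follow the template exactly (leading `|`, trailing `.`), and segments are split on `.` only, because the role of `_` inside `bcec871c_1` is not explained in this thread.

```python
def split_hierarchical_id(request_id):
    """Split '|<root-id>.<local-id1>.<local-id2>.' into (root_id, [local segments])."""
    if not (request_id.startswith("|") and request_id.endswith(".")):
        raise ValueError("not in the hierarchical Request-Id shape")
    # Drop the leading "|" and the trailing ".", then split on the "." delimiter.
    root, *local = request_id[1:-1].split(".")
    if not root:
        raise ValueError("empty root-id")
    return root, local


root, local = split_hierarchical_id("|9e74f0e5-efc4-41b5-86d1-3524a43bd891.bcec871c_1.")
print(root)   # 9e74f0e5-efc4-41b5-86d1-3524a43bd891
print(local)  # ['bcec871c_1']
```

Identifiers that do not match the template are rejected here only to keep the sketch small; the quoted text asks implementations to allow other schemes on incoming requests, so a real handler would more likely fall back to treating such a value as an opaque id.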
bab520c to 6569f0a
@SergeyKanzhelev private link is removed @karelz
@lmolkova in that case please work with area owners to finalize the review and squash/merge the PR. Thanks!
@vancem are you ok to merge it?
…relationProtocol Add HTTP Correlation protocol Commit migrated from dotnet/corefx@e1c2b4d
We created the HTTP Correlation protocol as a part of the correlation feature: it describes Id format and generation (that is implemented in `System.Diagnostics.Activity`) and HTTP headers that carry `Activity.Id` and `Activity.Baggage` (used in System.Net.Http and Asp.Net Core).

The protocol will be submitted to IETF, but until it's officially published as RFC, we want to store it in corefx repo unless someone has a better suggestion.
/cc @vancem @SergeyKanzhelev