Updates to http_protocol_proposal_v1.md #14

Open: wants to merge 1 commit into base: master
18 changes: 9 additions & 9 deletions http_protocol_proposal_v1.md
@@ -1,4 +1,4 @@
-# Overview
+# Overview
One of the common problems in microservices development is ability to trace request flow from client (application, browser) through all the services involved in processing.

Typical scenarios include:
@@ -7,12 +7,12 @@ Typical scenarios include:
2. Performance analysis and optimization: whole stack of request needs to be analyzed to find where performance issues come from
3. A/B testing: metrics for requests with experimental features should be distinguished and compared to 'production' data.

-These scenarios require every request to carry additional context and services to enrich their telemetry events with this context, so it would possible to correlate telemetry from all services involved in operation processing.
+These scenarios require every request to carry additional context and services to enrich their telemetry events with this context, so it would be possible to correlate telemetry from all services involved in operation processing.

Tracing an operation involves an overhead on application performance and should always be considered as optional, so application may not trace anything, trace only particular operations or some percent of all operations.
Tracing should be consistent: operation should be either fully traced, or not traced at all.

-This document provides guidance on the context needed for telemetry correlation and describes its format in HTTP communication. The context is not specific to HTTP protocol, it represents set of identifiers that are needed or helpful for end-to-end tracing. Application widely use distributed queues for asynchronous processing so operation may start (or continue) from a queue message; applications should propagate the context through the queues and restore (create) it when they start processing received task.
+This document provides guidance on the context needed for telemetry correlation and describes its format in HTTP communication. The context is not specific to HTTP protocol, it represents a set of identifiers that is needed or helpful for end-to-end tracing. Applications widely use distributed queues for asynchronous processing so operation may start (or continue) from a queue message; applications should propagate the context through the queues and restore (create) it when they start processing received task.

# HTTP Protocol proposal
| Header name | Format | Description |
@@ -26,20 +26,20 @@ This document provides guidance on the context needed for telemetry correlation
Request-Id is generated on the caller side and passed to callee.

Implementation of this protocol should expect to receive `Request-Id` in header of incoming request.
-Absence of Request-Id indicates that it is either first instrumented service in the system or this request was not traced by upstream service and therefore does not have any context associated with it.
+Absence of Request-Id indicates that it is either the first instrumented service in the system or this request was not traced by an upstream service and therefore does not have any context associated with it.
To start tracing the request, implementation MUST generate new `Request-Id` (see [Root Request Id Generation](#root-request-id-generation)) for the incoming request.

When Request-Id is provided by upstream service, there is no guarantee that it is unique within the entire system.
Implementation SHOULD make it unique by adding small suffix to incoming Request-Id to represent internal activity and use it for outgoing requests, see more details in [Hierarchical Request-Id document](hierarchical_request_id.md).

-`Request-Id` is required field, which means that every instrumented request MUST have it. If implementation does not find `Request-Id` in the incoming request headers, it should consider it as non-traced and MAY not look for `Correlation-Context`.
+`Request-Id` is a required field, i.e., every instrumented request MUST have it. If implementation does not find `Request-Id` in the incoming request headers, it should consider it as non-traced and MAY not look for `Correlation-Context`.

It is essential that 'incoming' and 'outgoing' Request-Ids are included in the telemetry events, so implementation of this protocol MUST provide read access to Request-Id for logging systems.
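
As a rough illustration of the rules above, the following Python sketch shows how an implementation might pick the Request-Id for an incoming request. The function names, the use of `uuid4` for the root ID, and the "dot plus 8 hex digits" suffix are assumptions made for this example only; the actual root and suffix formats are defined in the linked Root Request Id Generation and Hierarchical Request-Id documents. The `headers` mapping is assumed to be already normalized to the exact header name.

```python
import uuid
from typing import Mapping, Optional, Tuple

REQUEST_ID_HEADER = "Request-Id"


def new_root_request_id() -> str:
    # Assumption: a random UUID in hex form stands in for the real root ID format.
    return uuid.uuid4().hex


def resolve_request_id(headers: Mapping[str, str]) -> Tuple[Optional[str], str]:
    """Return (incoming_id, current_id) for an incoming request."""
    incoming = headers.get(REQUEST_ID_HEADER)
    if not incoming:
        # No upstream context: start tracing with a newly generated root ID.
        return None, new_root_request_id()
    # The upstream ID may not be unique system-wide, so append a small suffix
    # (hypothetical format) and use the result for outgoing requests.
    suffix = uuid.uuid4().hex[:8]
    return incoming, f"{incoming}.{suffix}"
```

Both values are returned so that a logging system can read the 'incoming' and 'outgoing' IDs; handling of IDs that would grow past the length limit is left out of this sketch.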

### Request-Id Format
`Request-Id` is a string up to 1024 bytes length. It contains only [Base64](https://en.wikipedia.org/wiki/Base64) and "-" (hyphen), "|" (vertical bar), "." (dot), and "_"( underscore) characters.

-Vertical bar, dot and underscore are reserved characters that used to mark and delimit hierarchical Request-Id, and must not be present in the nodes. Hyphen may be used in the nodes.
+Vertical bar, dot and underscore are reserved characters that are used to mark and delimit hierarchical Request-Id(s), and must not be present in the nodes. Hyphen may be used in the nodes.

Implementations SHOULD support hierarchical structure for the Request-Id, described in [Hierarchical Request-Id document](hierarchical_request_id.md).
See [Flat Request-Id](flat_request_id.md) for non-hierarchical Request-Id requirements.
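
The format rules in this section could be checked along the following lines. This is a minimal Python sketch, not part of the proposal: the function names are hypothetical, and treating the Base64 padding character `=` as allowed is an assumption, since the text only says "Base64" characters.

```python
import re

MAX_REQUEST_ID_BYTES = 1024
# Base64 alphabet (with "=" padding, an assumption) plus "-", "|", "." and "_".
_ALLOWED = re.compile(r"[A-Za-z0-9+/=|._-]+")
# Characters reserved for marking and delimiting hierarchical Request-Ids.
RESERVED = "|._"


def is_valid_request_id(value: str) -> bool:
    """Check the length limit and the allowed character set."""
    if not value or len(value.encode("utf-8")) > MAX_REQUEST_ID_BYTES:
        return False
    return _ALLOWED.fullmatch(value) is not None


def is_valid_node(node: str) -> bool:
    """A node may use hyphens but none of the reserved characters."""
    return is_valid_request_id(node) and not any(c in RESERVED for c in node)
```

For example, `is_valid_node("abc-123")` is accepted while `is_valid_node("abc.1")` is rejected, because the dot is a delimiter.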
@@ -49,9 +49,9 @@ Root service can add state (key value pairs) that will automatically propagate t

We anticipate that there will be common well-known Correlation-Context keys. If you wish to use this for you own custom (not well-known) context key, prefix it with @.

-It is important to keep the size of any property because they get serialized in HTTP headers and headers have significant size restrictions. The Correlation-Context MUST NOT be used as generic data passing mechanism (between services or within service).
+It is important to keep the size of any property small because these get serialized into HTTP headers which have significant size restrictions. The Correlation-Context MUST NOT be used as generic data passing mechanism (between services or within service).

-`Correlation-Context` is optional, which means that it may or may not be provided by upstream service.
+`Correlation-Context` is optional. It may or may not be provided by upstream service.

If `Correlation-Context` is provided by upstream service, implementation MUST propagate it further to downstream services.

@@ -62,7 +62,7 @@ Implementation MUST provide read access to `Correlation-Context` for logging sys

Author

When you say "Implementation MUST provide read access to Correlation-Context for logging systems and MUST support adding properties to Correlation-Context", that feels a little confusing, because throughout the document you say Correlation-Context is read-only, which implies properties cannot be added to the collection. I think what you really mean is that existing properties in the collection should be treated as read-only, but new properties are allowed.

Owner

Thanks!

I think our guidance here is a bit different:

  1. First service should define correlation context.
  2. Next service(s) MUST propagate it further without modifying properties, and SHOULD NOT add other properties.

So the Correlation-Context is indeed read-only for every service except the first one, but it is not a strict requirement.

An implementation does not really know which service is the first one, so it must allow adding properties anyway.

I agree that I should be clearer about it in the spec. BTW, there is an open PR to move this protocol to the .NET Core repo. I'll make your changes there and also improve the description.
Once the PR is merged, I'll change this doc to refer to the corefx repo.

`Correlation-Context: key1=value1, key2=value2`

-Neither keys nor values MUST NOT contain "="(equals) or "," (comma) characters.
+NEITHER keys NOR values MAY contain "="(equals) or "," (comma) characters.

Overall Correlation-Context length MUST NOT exceed 1024 bytes, key and value length should stay well under the combined limit of 1024 bytes.
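
To make the header format and its limits concrete, here is a small Python sketch of parsing and serializing `Correlation-Context`. The function names are hypothetical, and silently skipping malformed pairs on input is a design assumption of this example, not something the proposal prescribes.

```python
from typing import Dict, Mapping

MAX_CONTEXT_BYTES = 1024


def parse_correlation_context(header_value: str) -> Dict[str, str]:
    """Parse 'key1=value1, key2=value2' into a dict, skipping malformed pairs."""
    context: Dict[str, str] = {}
    for item in header_value.split(","):
        key, sep, value = item.strip().partition("=")
        if not sep or not key or "=" in value:
            continue  # assumption: ignore malformed pairs instead of failing
        context[key] = value
    return context


def format_correlation_context(context: Mapping[str, str]) -> str:
    """Serialize a dict into the header value, enforcing the stated limits."""
    for key, value in context.items():
        if any(c in text for text in (key, value) for c in "=,"):
            raise ValueError(f"'=' and ',' are not allowed in {key!r}={value!r}")
    header_value = ", ".join(f"{k}={v}" for k, v in context.items())
    if len(header_value.encode("utf-8")) > MAX_CONTEXT_BYTES:
        raise ValueError("Correlation-Context exceeds 1024 bytes")
    return header_value
```

Rejecting oversized or malformed contexts at serialization time keeps a service from emitting a header that downstream services cannot propagate unchanged.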
