This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Move HTTP Correlation protocol
Liudmila Molkova committed Apr 25, 2017
1 parent 93fa7c4 commit 80e8a8d
Showing 4 changed files with 297 additions and 5 deletions.
10 changes: 5 additions & 5 deletions src/System.Diagnostics.DiagnosticSource/src/ActivityUserGuide.md
@@ -194,7 +194,7 @@ Applications start Activity to represent logical piece of work to be done; one A
The whole operation may be represented as a tree of Activities. All operations done by the distributed system may be represented as a forest of Activities trees.
Id uniquely identifies Activity in the forest. It has an hierarchical structure to efficiently describe the operation as Activity tree.

-Activity.Id serves as hierarchical Request-Id in terms of [HTTP standard proposal for correlation](https://github.com/lmolkova/correlation/blob/master/http_protocol_proposal_v1.md)
+Activity.Id serves as hierarchical Request-Id in terms of [HTTP Correlation Protocol](HttpCorrelationProtocol.md)

### Id Format

@@ -206,16 +206,16 @@ e.g.

It starts with '|' followed by [root-id](#root-id) followed by '.' and small identifiers of local Activities, separated by '.' or '_'.

-[Root-id](#root-id) identifies the whole operation and 'Id' identifies particular Actvity involved in operation processing.
+[Root-id](#root-id) identifies the whole operation and 'Id' identifies particular Activity involved in operation processing.

-'|' indicates Id has hierarchcal structure, which is useful information for logging system.
+'|' indicates Id has hierarchical structure, which is useful information for logging system.

* Id is 1024 bytes or shorter
* Id consist of [Base64](https://en.wikipedia.org/wiki/Base64), '-' (hyphen), '.' (dot), '_' (underscore) and '#' (pound) characters.
Where base64 and '-' are used in nodes and other characters delimit nodes. Id always ends with one of the delimiters.

### Root Id
-When you start the first Activity for the operation, you may optionaly provide root-id through `Activity.SetParentId(string)` API.
+When you start the first Activity for the operation, you may optionally provide root-id through `Activity.SetParentId(string)` API.

If you don't provide it, Activity will generate root-id: e.g. `Server-5d183ab6-a000b421`

@@ -276,7 +276,7 @@ Id is passed to external dependencies and considered as [ParentId](#parentid) fo
`string ParentId { get; private set; }` - Activity may have either an in-process [Parent](#parent) or an external Parent if it was deserialized from request. ParentId together with Id represent the parent-child relationship in logs and allows you to correlate outgoing and incoming requests.

### RootId
-`string RootId { get; private set; }` - Returns [root id](#root-id): Id (or ParentId) substring from '|' to first '.' occurence.
+`string RootId { get; private set; }` - Returns [root id](#root-id): Id (or ParentId) substring from '|' to first '.' occurrence.

### Current
`static Activity Current { get; }` - Returns current Activity which flows across async calls.
99 changes: 99 additions & 0 deletions src/System.Diagnostics.DiagnosticSource/src/FlatRequestId.md
@@ -0,0 +1,99 @@
# Flat Request-Ids
This document provides guidance for implementations of [HTTP Correlation Protocol](HttpCorrelationProtocol.md) that do not support [Hierarchical Request-Id](HierarchicalRequestId.md), or that must interoperate with services that do not support it.

We strongly recommend that every implementation support [Hierarchical Request-Id](HierarchicalRequestId.md) wherever possible. If an implementation does not support it, it still MUST ensure the following essential requirements are met:
* `Request-Id` uniquely identifies every HTTP request involved in operation processing and MUST be generated for every incoming and outgoing request.
* `Correlation-Context` has an `Id` property that serves as the single unique identifier of the whole operation; the implementation MUST generate one if it is missing.

The [Root Request Id](HierarchicalRequestId.md#root-request-id-generation) requirements and generation considerations must also be followed for flat Request-Ids.

## Correlation Id
Many applications and tracing systems use a single correlation id to identify the whole operation across all services and client applications.

In a heterogeneous environment (where some services generate hierarchical Request-Ids and others generate flat Ids), a single identifier common to all requests keeps telemetry queries simple and efficient.

Implementations MUST use the `Id` property in `Correlation-Context` if they need to propagate a correlation id across the cluster.
An implementation MUST ensure `Id` is present in `Correlation-Context`, or [generate](#correlation-id-generation) a new one and add it to the `Correlation-Context`.

### Correlation Id generation
If an implementation needs to add the `Id` property to `Correlation-Context`, it:
* SHOULD use the root node of the Request-Id received from the upstream service if that Request-Id has hierarchical structure.
* MUST follow the [Root Request Id Generation](HierarchicalRequestId.md#root-request-id-generation) rules otherwise.
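
The two rules above can be sketched in a few lines. This is only an illustration in Python, not part of the protocol or of any library: the helper names `root_node` and `ensure_correlation_id` are hypothetical, the `Correlation-Context` is modelled as a plain dictionary, and a GUID is assumed as the fallback identifier.

```python
import uuid


def root_node(request_id):
    """Return the root node of a hierarchical Request-Id, or None if it is flat."""
    if not request_id or not request_id.startswith("|"):
        return None
    # The root node runs from just after '|' up to the first '.'.
    end = request_id.find(".")
    return request_id[1:end] if end > 1 else None


def ensure_correlation_id(context, parent_request_id=None):
    """Return a copy of the Correlation-Context that is guaranteed to have `Id`."""
    result = dict(context)
    if "Id" not in result:
        # Prefer the upstream root node; otherwise generate a new identifier
        # (a GUID here, which is an assumption of this sketch).
        result["Id"] = root_node(parent_request_id) or str(uuid.uuid4())
    return result
```

For example, `ensure_correlation_id({}, "|Guid.1_")` yields `{"Id": "Guid"}`, while an existing `Id` is always left untouched.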

## Non-hierarchical Request-Id example
1. A: service-a receives request
* scans through its headers and does not find `Request-Id`
* generates a new one: `abc`
* adds an extra property to `Correlation-Context`: `Id=123`
* logs event that operation was started along with `Request-Id: abc`, `Correlation-Context: Id=123`
2. A: service-a makes request to service-b:
* generates new `Request-Id: def`
* logs that outgoing request is about to be sent with all the available context: `Request-Id: def`, `Correlation-Context: Id=123`
* sends request to service-b
3. B: service-b receives request
* scans through its headers and finds `Request-Id: def`, `Correlation-Context: Id=123`
* generates a new `Request-Id: ghi`
* logs event that operation was started along with all available context: `Request-Id: ghi`, `Correlation-Context: Id=123`
* processes request and responds to service-a
4. A: service-a receives response from service-b
* logs response with context: `Request-Id: def`, `Correlation-Context: Id=123`
* Processes request and responds to caller

As a result log records may look like:

| Message | Component Name | Context |
| ---------| --------------- | ------- |
| user starts request to service-a | user | |
| incoming request | service-a | `Request-Id=abc; Parent-Request-Id=; Id=123` |
| request to service-b | service-a | `Request-Id=def; Parent-Request-Id=abc; Id=123` |
| incoming request | service-b | `Request-Id=ghi; Parent-Request-Id=def; Id=123` |
| response | service-b |`Request-Id=ghi; Parent-Request-Id=def; Id=123` |
| response from service-b | service-a | `Request-Id=def; Parent-Request-Id=abc; Id=123` |
| response | service-a |`Request-Id=abc; Parent-Request-Id=; Id=123` |
| response from service-a | user | |

#### Remarks
* Logs for the whole operation may be queried by matching `Id=123`; logs for a particular request may be queried by exact Request-Id match.
* Since hierarchical Request-Id is not used, `Id` must be logged with every trace, and Parent-Request-Id must be logged to restore parent-child relationships between incoming and outgoing requests.
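
As a rough illustration of these remarks, the sketch below stores the log records from the table as dictionaries and runs both kinds of query. The field names, the helper functions and the extra `service-c` record (an unrelated operation added for contrast) are assumptions made for this example only.

```python
# Log records modelled on the table above; field names are illustrative.
records = [
    {"component": "service-a", "request_id": "abc", "parent_request_id": None, "id": "123"},
    {"component": "service-a", "request_id": "def", "parent_request_id": "abc", "id": "123"},
    {"component": "service-b", "request_id": "ghi", "parent_request_id": "def", "id": "123"},
    # Hypothetical record from an unrelated operation.
    {"component": "service-c", "request_id": "xyz", "parent_request_id": None, "id": "456"},
]


def operation_logs(records, correlation_id):
    """Select every record that belongs to one operation by its correlation Id."""
    return [r for r in records if r["id"] == correlation_id]


def request_chain(records, request_id):
    """Follow Parent-Request-Id links from a request up to the first request."""
    parents = {r["request_id"]: r["parent_request_id"] for r in records}
    chain = []
    # Stop at the root (no parent) or if a malformed log forms a cycle.
    while request_id is not None and request_id not in chain:
        chain.append(request_id)
        request_id = parents.get(request_id)
    return chain
```

With this data, `request_chain(records, "ghi")` returns `["ghi", "def", "abc"]`, which is only possible because every record carries its Parent-Request-Id.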

## Mixed hierarchical and non-hierarchical scenario
In a heterogeneous environment, some services may support hierarchical Request-Id generation and others may not.

The requirements listed in [Request-Id](HttpCorrelationProtocol.md#request-id) help ensure that all telemetry for the operation is still accessible:
- if an implementation supports hierarchical Request-Id, it MUST propagate `Correlation-Context` and **MAY** add `Id` if missing
- if an implementation does NOT support hierarchical Request-Id, it MUST propagate `Correlation-Context` and **MUST** add `Id` if missing

Let's imagine service-a supports hierarchical Request-Id and service-b does not:

1. A: service-a receives request
* scans through its headers and does not find `Request-Id`.
* generates a new one: `|Guid.`
* logs event that operation was started along with `Request-Id: |Guid.`
2. A: service-a makes request to service-b:
* generates new `Request-Id: |Guid.1_`
* logs that outgoing request is about to be sent
* sends request to service-b
3. B: service-b receives request
* scans through its headers and finds `Request-Id: |Guid.1_`
* generates a new Request-Id: `def`
* does not see `Correlation-Context`, so it parses the parent Request-Id, extracts the root node `Guid`, and adds the `Id` property to `Correlation-Context`: `Id=Guid`
* logs event that operation was started
* processes request and responds to service-a
4. A: service-a receives response from service-b
* logs response with context: `Request-Id: |Guid.1_`
* Processes request and responds to caller

As a result log records may look like:

| Message | Component Name | Context |
| ---------| --------------- | ------- |
| incoming request | service-a | `Request-Id=|Guid.` |
| request to service-b | service-a | `Request-Id=|Guid.1_` |
| incoming request | service-b | `Request-Id=def; Parent-Request-Id=|Guid.1_; Id=Guid` |
| response | service-b |`Request-Id=def; Parent-Request-Id=|Guid.1_; Id=Guid` |
| response from service-b | service-a | `Request-Id=|Guid.1_` |
| response | service-a |`Request-Id=|Guid.` |

#### Remarks
* Even if service-b does not **generate** hierarchical Request-Ids, it can still benefit from the hierarchical structure by setting `Correlation-Context: Id` to the root node of the incoming Request-Id.
* All log records can then be retrieved with a query like `Id == Guid || RequestId.startsWith('|Guid.')`. The trailing dot keeps the prefix match exact.
* If the first service to process the request does not support hierarchical ids, it sets `Correlation-Context: Id` immediately; the `Id` is then propagated further and can still be used to query all logs.
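
The combined query from these remarks might look as follows when applied to in-memory records. This is a sketch under assumptions: the record layout and the function name `belongs_to_operation` are hypothetical, and the prefix is matched including the trailing dot so that a different root such as `Guid2` is not picked up by accident.

```python
def belongs_to_operation(record, root_id):
    """True if a record matches by correlation Id or by hierarchical Request-Id prefix."""
    prefix = "|" + root_id + "."
    return record.get("id") == root_id or record["request_id"].startswith(prefix)


# Records modelled on the table above, plus two hypothetical unrelated ones.
records = [
    {"component": "service-a", "request_id": "|Guid."},
    {"component": "service-a", "request_id": "|Guid.1_"},
    {"component": "service-b", "request_id": "def", "id": "Guid"},
    {"component": "service-a", "request_id": "|Guid2."},
    {"component": "service-b", "request_id": "xyz", "id": "Other"},
]

matched = [r["request_id"] for r in records if belongs_to_operation(r, "Guid")]
```

Here `matched` contains the two hierarchical ids of service-a and the flat id `def` of service-b, and nothing from the unrelated operations.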
110 changes: 110 additions & 0 deletions src/System.Diagnostics.DiagnosticSource/src/HierarchicalRequestId.md
@@ -0,0 +1,110 @@
# Hierarchical Request-Id
This document describes the hierarchical Request-Id schema used by the [HTTP Correlation Protocol](HttpCorrelationProtocol.md) for telemetry correlation.

## Overview
The main requirement for Request-Id is uniqueness: any two requests processed by the cluster must not collide.
GUIDs or large random numbers help to achieve it, but they require other identifiers to query all requests related to the operation.

A hierarchical Request-Id looks like `|<root-id>.<local-id1>.<local-id2>.` (e.g. `|9e74f0e5-efc4-41b5-86d1-3524a43bd891.bcec871c_1.`) and holds all the information needed to trace the whole operation and a particular request.
Root-id serves as the common identifier for all requests involved in operation processing, and local-ids represent internal activities (and requests) done within the scope of this operation.

[CorrelationVector](https://osgwiki.com/wiki/CorrelationVector) is a valid hierarchical Request-Id, except that it does not start with "|". An implementation SHOULD allow other schemes for incoming request identifiers.

### Formatting Hierarchical Request-Id
If `Request-Id` was not provided by the upstream service and the implementation decides to trace the request, it MUST generate a new `Request-Id` (see [Root Request Id Generation](#root-request-id-generation)) to represent the incoming request.

In a heterogeneous environment, implementations of this protocol with hierarchical `Request-Id` may interact with other services that do not implement this protocol but still have a notion of request Id. An implementation or logging system should be able to unambiguously identify whether a given `Request-Id` has hierarchical schema.

Therefore every implementation that supports hierarchical structure MUST prepend "|" (vertical bar) to every generated `Request-Id`.

It also MUST append "." (dot) to the end of the generated Request-Id to unambiguously mark its end (e.g. a search for `|123` may return `|1234`, but a search for `|123.` would be exact).
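
The effect of the two markers can be shown with a short Python sketch. The function names are hypothetical and the root ids `123` and `1234` are taken from the example in the text above.

```python
def format_root_request_id(root_id):
    """Wrap a root id in the '|' prefix and '.' terminator described above."""
    return "|" + root_id + "."


def is_hierarchical(request_id):
    """A leading '|' marks a Request-Id as hierarchical."""
    return request_id.startswith("|")


ids = [format_root_request_id("123"), format_root_request_id("1234")]

# Searching without the trailing dot also matches the longer root id.
matches_without_dot = [i for i in ids if i.startswith("|123")]
# Searching with the trailing dot matches exactly one operation.
matches_with_dot = [i for i in ids if i.startswith("|123.")]
```

`matches_without_dot` holds both ids, whereas `matches_with_dot` holds only `|123.`.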

#### Root Request Id Generation
Root Request-Id is the topmost Request-Id generated by the first instrumented service. In a hierarchical Request-Id, it is the root node and is common to all requests involved in operation processing.
It MUST be unique to every high-level operation in the system, so for every traced operation the implementation MUST generate a sufficiently large identifier: e.g. a GUID or a 64-bit or 128-bit random number.
Note that random numbers can be encoded as a string to decrease Request-Id length.

Root Request-Id MUST contain only [Base64](https://en.wikipedia.org/wiki/Base64) and "-" (hyphen) characters.

The same considerations apply to client applications that make HTTP requests and generate a root Request-Id.

Note that in addition to the unique part, it may be useful to include some meaningful information such as host name, device or process id, etc. An implementation is free to do so, as long as it keeps the root id relatively short.
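
One possible way to satisfy these rules is sketched below in Python. It assumes a 64-bit random number encoded as Base64 with the padding stripped, and an optional readable prefix joined with a hyphen; the function name, the prefix parameter and the exact character check are choices of this example, not requirements of the protocol.

```python
import base64
import re
import secrets

# Characters allowed in a root id by this sketch: the Base64 alphabet plus '-'.
ROOT_ID_CHARS = re.compile(r"[A-Za-z0-9+/-]+")


def generate_root_id(prefix=""):
    """Generate a root id from 64 random bits, optionally prefixed (e.g. with a host name)."""
    # 8 random bytes encode to 11 Base64 characters once the '=' padding is removed,
    # compared with 16 characters for the same value in hex.
    random_part = base64.b64encode(secrets.token_bytes(8)).decode("ascii").rstrip("=")
    root_id = prefix + "-" + random_part if prefix else random_part
    if not ROOT_ID_CHARS.fullmatch(root_id):
        # A prefix containing a delimiter such as '.' would corrupt the hierarchy.
        raise ValueError("root id contains characters outside Base64 and '-'")
    return root_id
```

A call such as `generate_root_id("Server")` produces an id that starts with `Server-`, while a prefix like `my.host` is rejected because the dot is reserved as a node delimiter.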

#### Incoming Request
When Request-Id is provided by the upstream service, there is no guarantee that it is unique within the entire system.

An implementation SHOULD make it unique by adding a small suffix to the incoming Request-Id to represent the internal activity, and use the result for outgoing requests.
If the implementation does not trust the incoming Request-Id at all, the suffix may be as long as a [Root Request Id](#root-request-id-generation).
We recommend appending a random string of 8 characters (e.g. a 32-bit hex-encoded random integer).

The suffix MUST contain only [Base64](https://en.wikipedia.org/wiki/Base64) and "-" (hyphen) characters.

The implementation MUST append "_" (underscore) to mark the end of the generated incoming Request-Id.
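
A minimal Python sketch of this step, assuming the recommended 8-character suffix is produced from a 32-bit random value in hex; the function name is hypothetical.

```python
import secrets


def extend_incoming_request_id(parent_request_id):
    """Append a random 8-character suffix and '_' to the Request-Id received from upstream."""
    suffix = secrets.token_hex(4)  # 4 random bytes, i.e. 8 hex characters
    return parent_request_id + suffix + "_"
```

Given an incoming `|Guid.1.`, the result has the shape `|Guid.1.da4e9679_`, where the eight hex characters differ on every call.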

#### Outgoing Request
When making a request to a downstream service, the implementation MUST append a small id to the incoming Request-Id and pass the new Request-Id to the downstream service.

- The suffix MUST be unique for every outgoing HTTP request sent while processing the incoming request; a monotonically incremented number of the outgoing request within the scope of this incoming operation is a good candidate.
- The suffix MUST contain only [Base64](https://en.wikipedia.org/wiki/Base64) and "-" (hyphen) characters.

The implementation MUST append "." (dot) to mark the end of the generated outgoing Request-Id.

It may be useful to split incoming request processing into multiple logical sub-operations and assign different identifiers to them, in the same way as for outgoing requests, except that the sub-operation is processed within the same service.
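
The counter-based suffix suggested above could be sketched as follows. The class name `IncomingRequestScope` is hypothetical, and the sketch makes no claim about thread safety; a real implementation would need an atomic counter if outgoing requests are started concurrently.

```python
import itertools


class IncomingRequestScope:
    """Tracks one incoming request and hands out Request-Ids for its outgoing calls."""

    def __init__(self, request_id):
        self.request_id = request_id
        # Monotonically increasing number of the outgoing request, starting at 1.
        self._counter = itertools.count(1)

    def next_outgoing_id(self):
        """Append the next counter value and '.' to the incoming Request-Id."""
        return f"{self.request_id}{next(self._counter)}."
```

For a scope created with `|Guid.`, successive calls return `|Guid.1.` and then `|Guid.2.`; for `|Guid.1.da4e9679_` the first call returns `|Guid.1.da4e9679_1.`.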

#### Request-Id Overflow
Extending `Request-Id` may cause it to exceed the length limit.
To handle overflow, the implementation:
* MUST generate a suffix that keeps the possibility of collision with any previous or future Request-Id within the same operation negligible.
* MUST append the "#" symbol to the suffix to indicate that overflow happened.
* MUST trim the end of the existing Request-Id to make room for the generated LocalId. The implementation MUST trim whole nodes (separated with ".", "_") without the preceding delimiter, i.e. it is invalid to trim only part of a node.
* MUST generate a suffix that contains only [Base64](https://en.wikipedia.org/wiki/Base64) and '-' (hyphen) characters.

As a result Request-Id will look like:

`Beginning-Of-Incoming-Request-Id.LocalId#`

Thus, to the extent possible, Request-Id keeps the valid part of the hierarchical Id.

The overflow suffix should be large enough to ensure the new Request-Id does not collide with any previous or future Request-Id within the same operation. A random 32-bit integer (or an 8-character string) is a good candidate for it.
Note that applications can asynchronously start multiple outgoing requests almost at the same time, which makes a timestamp, even with ticks precision, a bad candidate for the overflow suffix.
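
The overflow rules can be sketched in Python as below. Several things here are assumptions of the example: the 1024 limit is borrowed from the Activity Id format, length is counted in characters (equal to bytes for these ASCII ids), the overflow suffix is 8 hex characters, and the function name is hypothetical.

```python
import secrets

MAX_LENGTH = 1024  # assumed limit for this sketch


def append_with_overflow(request_id, suffix, delimiter=".", max_length=MAX_LENGTH):
    """Append `suffix` + `delimiter`, or fall back to a trimmed id ending in '#'."""
    candidate = request_id + suffix + delimiter
    if len(candidate) <= max_length:
        return candidate

    overflow_suffix = secrets.token_hex(4) + "#"  # 8 random hex characters and '#'
    trimmed = request_id
    while len(trimmed) + len(overflow_suffix) > max_length:
        body = trimmed[:-1]  # ignore the delimiter that ends the last node
        cut = max(body.rfind("."), body.rfind("_"))
        if cut < 0:
            raise ValueError("cannot trim the root node to fit the length limit")
        # Drop the last whole node but keep the delimiter that precedes it.
        trimmed = trimmed[:cut + 1]
    return trimmed + overflow_suffix
```

With an artificially small limit, `append_with_overflow("|Guid.1.da4e9679_", "1", max_length=18)` drops the whole `da4e9679_` node and returns `|Guid.1.` followed by eight random hex characters and `#`.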

## Example
Let's consider two services: service-a and service-b. User calls service-a, which calls service-b to fulfill the user request.

`User -> service-a -> service-b`

1. A: service-a receives request
* does not find `Request-Id` and generates a new root Request-Id `|Guid.`
* logs that the incoming request was started along with `Request-Id: |Guid.`
2. A: service-a makes request to service-b:
* generates new `Request-Id` by appending request number to the parent request id: `|Guid.1.`
* logs that outgoing request is about to be sent with all the available context: `Request-Id: |Guid.1.`
* sends request to service-b
3. B: service-b receives request
* scans through its headers and finds `Request-Id: |Guid.1.`
* it generates a new Request-Id: `|Guid.1.da4e9679_` to uniquely describe operation within service-b
* logs event that operation was started along with all available context: `Request-Id: |Guid.1.da4e9679_`
* processes request and responds to service-a
4. A: service-a receives response from service-b
* logs response with context: `Request-Id: |Guid.1.`
* Processes request and responds to caller

As a result log records may look like:

| Message | Component name | Context |
| ---------| --------------- | ------- |
| user starts request to service-a | user | |
| incoming request | service-a | `Request-Id=|Guid.` |
| request to service-b | service-a | `Request-Id=|Guid.1.` |
| incoming request | service-b | `Request-Id=|Guid.1.da4e9679_` |
| response | service-b | `Request-Id=|Guid.1.da4e9679_` |
| response from service-b | service-a | `Request-Id=|Guid.1.` |
| response | service-a | `Request-Id=|Guid.` |
| response from service-a | user | |

### Remarks
* All logs for the operation may be queried by the Request-Id prefix `|Guid.`; logs for a particular request may be queried by exact Request-Id match.
* When service-a generates a new Request-Id, it does not append a suffix, since it generates a root Request-Id and ensures its uniqueness.
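
To illustrate the first remark, the Python sketch below queries the Request-Ids from the table by prefix and also derives each id's parent directly from its structure. The helper `parent_of` and the extra `|Other.1.` record are assumptions added for this example.

```python
def parent_of(request_id):
    """Return the Request-Id one level up, or None for a root Request-Id."""
    body = request_id[:-1]  # ignore the delimiter that ends the last node
    cut = max(body.rfind("."), body.rfind("_"))
    return request_id[:cut + 1] if cut >= 0 else None


# Request-Ids from the example above, plus one from a hypothetical unrelated operation.
logged_ids = ["|Guid.", "|Guid.1.", "|Guid.1.da4e9679_", "|Other.1."]

# Every record of the operation shares the root prefix, trailing dot included.
operation_ids = [i for i in logged_ids if i.startswith("|Guid.")]
```

Here `parent_of("|Guid.1.da4e9679_")` gives `|Guid.1.` and `parent_of("|Guid.1.")` gives `|Guid.`, so the tree can be rebuilt from the ids alone.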

# See also
- [Flat Request-Id](FlatRequestId.md)