Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify uniqueness of tracestate keys as the sender's responsibility #477

Merged
merged 4 commits into from
Jan 4, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 6 additions & 6 deletions spec/20-http_request_header_format.md
Original file line number Diff line number Diff line change
Expand Up @@ -293,19 +293,19 @@ chr = %x20 / nblk-chr

### Combined Header Value

The `tracestate` value is the concatenation of trace graph key/value pairs
The `tracestate` value is the concatenation of trace graph key/value pairs.

Example: `vendorname1=opaqueValue1,vendorname2=opaqueValue2`

Only one entry per key is allowed because the entry represents that last position in the trace. Hence vendors must overwrite their entry upon reentry to their tracing system.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"because the entry represents that last position in the trace." had been called out here by @yurishkuro as not making a lot of sense, and I agree. Hence, I removed that part.

Also, I think update could be a better phrase instead of overwrite because we also use that in the section "Mutating the traceparent Field".

Finally "upon reentry to their tracing system" is really unwieldy, IMO.


For example, if a vendor name is Congo and a trace started in their system and then went through a system named Rojo and later returned to Congo, the `tracestate` value would not be:
Only one entry per key is allowed. For example, if a vendor name is Congo and a trace started in their system and then went through a system named Rojo and later returned to Congo, the `tracestate` value would not be:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"is allowed" is not a normative language. I would change it to "There SHOULD be only one entry per key".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The normative language around this is in the section "Mutating the tracestate Field":

Adding a key/value pair MUST NOT result in the same key being present multiple times.

I would restrict the normative description of that aspect to the "Mutating" section because it describes what an implementation is allowed to do with the header and what not. I do not see any added benefit by duplicating this aspect in normative language here. "Only one entry per key is allowed." is given in the context of the example (see the line above) and in my understanding, examples are never normative.


`congo=congosFirstPosition,rojo=rojosFirstPosition,congo=congosSecondPosition`

Instead, the entry would be rewritten to only include the most recent position:
`congo=congosSecondPosition,rojo=rojosFirstPosition`

See [Mutating the tracestate Field](#mutating-the-tracestate-field) for details.

#### tracestate Limits:

Vendors SHOULD propagate at least 512 characters of a combined header. This length includes commas required to separate list items and optional white space (`OWS`) characters.
Expand Down Expand Up @@ -353,6 +353,6 @@ Vendors receiving a `tracestate` request header MUST send it to outgoing request

Following are allowed mutations:

- **Add a new key/value pair**. The new key/value pair SHOULD be added to the beginning of the list.
- **Add a new key/value pair**. The new key/value pair SHOULD be added to the beginning of the list. Adding a key/value pair MUST NOT result in the same key being present multiple times.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This had been called out in "Combined Header Value" already, but it actually belong here, IMO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how this formulated. It eliminates concern that if platform doesn't deal with tracestate keys, just propagates it as is - it is not responsible for the duplicates. We may want to spell it out explicitly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See this part of my PR comment:

There seems to be no consensus yet whether an implementation should
clean up a tracestate value it receives by discarding duplicate keys
that it does not own. Thus, that aspect is not considered in this
change.

And my message here #446 (comment)

TL;DR: I agree with you that this is how it should work (vendors should not need to care about foreign keys at all) but I am not sure if this is consensus, so for now I was reluctant to add this aspect to this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that a tracing system should be responsible for deduping their own keys and shouldn't need to analyze or dedupe other keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added that aspect as well now: ccbabd6

- **Update an existing value**. The value for any given key can be updated. Modified keys SHOULD be moved to the beginning (left) of the list.
- **Delete a key/value pair**. Any key/value pair MAY be deleted. Vendors SHOULD NOT delete keys that were not generated by them. The deletion of an unknown key/value pair will break correlation in other systems. This mutation enables two scenarios. The first is that proxies can block certain `tracestate` keys for privacy and security concerns. The second scenario is a truncation of long `tracestate`s.
- **Delete a key/value pair**. Any key/value pair MAY be deleted. Vendors SHOULD NOT delete keys that were not generated by them. The deletion of an unknown key/value pair will break correlation in other systems. This mutation enables three scenarios. The first is that proxies can block certain `tracestate` keys for privacy and security concerns. The second scenario is a truncation of long `tracestate`s. Finally, vendors MAY also discard duplicate keys that were not generated by them.
11 changes: 6 additions & 5 deletions test/test.py
Original file line number Diff line number Diff line change
Expand Up @@ -549,37 +549,38 @@ def test_tracestate_multiple_headers_different_keys(self):
def test_tracestate_duplicated_keys(self):
'''
harness sends a request with an invalid tracestate header with duplicated keys
expects the tracestate to be discarded
expects the tracestate to be inherited, and the duplicated keys to be either kept as-is or one of them
to be discarded
'''
traceparent, tracestate = self.make_single_request_and_get_tracecontext([
['traceparent', '00-12345678901234567890123456789012-1234567890123456-00'],
['tracestate', 'foo=1,foo=1'],
])
self.assertEqual(traceparent.trace_id.hex(), '12345678901234567890123456789012')
self.assertRaises(KeyError, lambda: tracestate['foo'])
self.assertTrue('foo=1' in str(tracestate))

traceparent, tracestate = self.make_single_request_and_get_tracecontext([
['traceparent', '00-12345678901234567890123456789012-1234567890123456-00'],
['tracestate', 'foo=1,foo=2'],
])
self.assertEqual(traceparent.trace_id.hex(), '12345678901234567890123456789012')
self.assertRaises(KeyError, lambda: tracestate['foo'])
self.assertTrue('foo=1' in str(tracestate) or 'foo=2' in str(tracestate))

traceparent, tracestate = self.make_single_request_and_get_tracecontext([
['traceparent', '00-12345678901234567890123456789012-1234567890123456-00'],
['tracestate', 'foo=1'],
['tracestate', 'foo=1'],
])
self.assertEqual(traceparent.trace_id.hex(), '12345678901234567890123456789012')
self.assertRaises(KeyError, lambda: tracestate['foo'])
self.assertTrue('foo=1' in str(tracestate))

traceparent, tracestate = self.make_single_request_and_get_tracecontext([
['traceparent', '00-12345678901234567890123456789012-1234567890123456-00'],
['tracestate', 'foo=1'],
['tracestate', 'foo=2'],
])
self.assertEqual(traceparent.trace_id.hex(), '12345678901234567890123456789012')
self.assertRaises(KeyError, lambda: tracestate['foo'])
self.assertTrue('foo=1' in str(tracestate) or 'foo=2' in str(tracestate))

def test_tracestate_all_allowed_characters(self):
'''
Expand Down
8 changes: 5 additions & 3 deletions test/tracecontext/tracestate.py
Original file line number Diff line number Diff line change
Expand Up @@ -56,9 +56,11 @@ def from_string(self, string):
if not match:
raise ValueError('illegal key-value format {!r}'.format(member))
key, eq, value = match.groups()
if key in self._traits:
raise ValueError('conflict key {!r}'.format(key))
self._traits[key] = value
if key not in self._traits:
self._traits[key] = value
# If key is already in self._traits, the incoming tracestate header contained a duplicated key.
# According to the spec, two behaviors are valid: Either pass on the duplicated key as-is or drop
# it. We opt for dropping it.
return self

def to_string(self):
Expand Down