Reconsider null behavior for span attributes. #797

Closed
thisthat opened this issue Aug 13, 2020 · 32 comments · Fixed by #992
Labels: area:api (Cross language API specification issue), priority:p1 (Highest priority level), release:required-for-ga (Must be resolved before GA release, or nice to have before GA), spec:trace (Related to the specification/trace directory)

Comments

@thisthat
Member

The current spec of span attribute defines null value as a way to delete an existing attribute: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#set-attributes

Another approach is to ignore null values, as if the call to Set Attributes was never made, thereby preserving the original value of the attribute.

This came up in this discussion: #777 (comment)

This issue is meant to discuss which of the two approaches we should follow.
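
As a rough illustration of the two behaviors in question, here is a minimal Python sketch. The `Span` class, its `null_policy` parameter, and the `set_attribute` method are hypothetical stand-ins for this example only, not the API of any OpenTelemetry SDK.

```python
class Span:
    """Toy span (hypothetical) used only to contrast the two null behaviors."""

    def __init__(self, null_policy):
        # null_policy is an assumption of this sketch: "delete" or "ignore".
        self.null_policy = null_policy
        self.attributes = {}

    def set_attribute(self, key, value):
        if value is None:
            if self.null_policy == "delete":
                # Current spec wording: null removes a previously set attribute.
                self.attributes.pop(key, None)
            # "ignore": behave as if the call was never made.
            return
        self.attributes[key] = value


deleting = Span("delete")
deleting.set_attribute("example.key", "GET")
deleting.set_attribute("example.key", None)   # attribute is removed

ignoring = Span("ignore")
ignoring.set_attribute("example.key", "GET")
ignoring.set_attribute("example.key", None)   # previous value is kept
```

Under the "delete" policy the span ends up with no attributes, while under "ignore" it still carries the original value.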

@thisthat thisthat added the spec:trace Related to the specification/trace directory label Aug 13, 2020
@arminru arminru added area:api Cross language API specification issue release:required-for-ga Must be resolved before GA release, or nice to have before GA labels Aug 13, 2020
@Oberon00 Oberon00 changed the title Clarify null behavior for span attributes. Reconsider null behavior for span attributes. Aug 13, 2020
@andrewhsu andrewhsu added the priority:p2 Medium priority level label Aug 18, 2020
@jmacd
Contributor

jmacd commented Aug 19, 2020

See also #503 where I made the same request.
I believe we should treat "null" as a first-class value. At this point we've added support for list-valued and map-valued attributes. The only part of JSON syntax we do not support, at this point, is "null". The fact that "null" is specified to mean "delete this attribute" is especially problematic because some languages do not have a "null" string. For example, in Golang, there is no way to delete an attribute because there is no such thing as a null string.

Should we add a first-class API to support removing attributes? I would say no. This has already been debated (in #503) -- OpenTracing also did not support deleting attributes.

@bogdandrutu
Member

@jmacd I think there are two different concerns here:

  1. Support for a proper "null" value
  2. What to do in languages where an API for one of the supported value types (like String in Java) is called with null (the same applies if the array version in Go is called with nil).

@jmacd
Contributor

jmacd commented Aug 19, 2020

In keeping with JSON as a target, I see nothing wrong with placing "null" inside an array. If you have an Array in Java and some elements are null, I'd expect the output value to be a list with some strings and some nulls.

@bogdandrutu
Member

@jmacd I think this is again about the "Array object" being null not an object inside the array. We already cover null inside arrays, see https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/common/common.md#attributes

null values within arrays MUST be preserved as-is (i.e., passed on to span processors / exporters as null).

@Oberon00
Member

Oberon00 commented Aug 26, 2020

@jmacd

I believe we should treat "null" as a first-class value

I don't think we should. Not all languages have a concept of null and if we want that, we would need to change the attribute definition to allow an additional primitive type of null (that would be distinct from a null-String for example).

@andrewhsu
Member

From the issue triage mtg today, talked with @carlosalberto and moving this to P1 to highlight the need for a decision to either have a yes or no on this issue before the trace portion of the spec is frozen.

@andrewhsu andrewhsu added priority:p1 Highest priority level and removed priority:p2 Medium priority level labels Sep 9, 2020
@andrewhsu
Member

@arminru would you be able to take this issue and drive it to resolution?

@tedsuo
Contributor

tedsuo commented Sep 10, 2020

I agree with @jmacd that if a user makes a valid call to SetAttribute, we should record the value. Silently ignoring a call to SetAttribute due to an unexpected null would make some scenarios really difficult to debug. The most straightforward and consistent thing is to record all valid values in some way. Since we do not want to guard against nulls, we should record them in some manner (see below). Languages which do not have null simply do not record any nulls. These languages already have to deal with parsing data formats which include null, so this is nothing new.

I also recommend we remove the deleting of values. OpenTelemetry does not have any kind of command pattern in its data, except for this one case. I believe it will create trouble for us down the road. Instead, the Collector should be used for this kind of data massaging. Given that (as mentioned) not all languages have null, the delete pattern cannot be implemented everywhere anyways.

Honestly, it is a bit bizarre that we would accept nulls in an array, but also use nulls to delete values. I think this is just a case where SetAttribute ended up evolving in a different direction since the delete feature was added, and we now need to account for it.

May I suggest that the simplest and most regular solution is to accept nulls, and apply the current definition of null handling in arrays to null handling in general.

null values MUST be preserved as-is (i.e., passed on to span processors / exporters as null). If exporters do not support exporting null values, they MAY replace those values by 0, false, or empty strings.

@arminru
Member

arminru commented Sep 10, 2020

@andrewhsu Sure 🙂

@arminru arminru self-assigned this Sep 10, 2020
@arminru
Member

arminru commented Sep 10, 2020

@jmacd

I believe we should treat "null" as a first-class value.

If we would allow null as a meaningful value, this would have to be properly handled by the protocols, exporters and consumers. The current version of OTLP would not allow expressing a null value but we would rather have to convert it to the default value of either string, bool, int or double. In languages where null can be passed without specifying which "type" of null it is, we would already have to guess here. If we would allow null and express it using the type's default value, we would lose information as those two values would no longer be distinguishable. I am certainly against that, since I don't think we can just export null as "" or 0 or 0.0 or false since that could make a remarkable difference as null would rather convey "value not available" than "the value is X", which it actually isn't. I also wouldn't specify null values to always be an empty string, for example, since that would also make it more complex for consumers that expect a certain type for a given attribute name, as specified by the semantic conventions.

I think the attribute not being present expresses the meaning of null better than the default values for each type.

If we would add support for a proper null value, we would have to extend OTLP to express this properly, and I don't think we will be able to or should do this before GA. I am open, however, to discuss this for after GA but would only go for a proper end-to-end solution here rather than making some last minute compromises.

@arminru
Member

arminru commented Sep 10, 2020

@jmacd

The fact that "null" is specified to mean "delete this attribute" is especially problematic because some languages do not have a "null" string. For example, in Golang, there is no way to delete an attribute because there is no such thing as a null string.

and @tedsuo

I also recommend we remove the deleting of values.

I don't think it was intended to introduce a new "delete attribute" feature but rather to specify how to deal with a null being passed, which is currently not deemed a valid value in the spec. Attempting to set null could either be ignored entirely or drop the previously set value. I think there are arguments for both, but out of these two options it is less surprising for users that after setting null there is nothing left (many APIs return null to express "nothing there"/"not found", in languages that have null) rather than the previous value still being present.

@arminru
Member

arminru commented Sep 10, 2020

@tedsuo

OpenTelemetry does not have any kind of command pattern in its data, except for this one case. I believe it will create trouble for us down the road.

What do you mean by that? I don't see a big difference in allowing overwriting existing values with something else (e.g., an empty string if "deleting" is not possible) or removing the previously set attribute entirely.

@arminru
Member

arminru commented Sep 10, 2020

@tedsuo

Honestly, it is a bit bizarre that we would accept nulls in an array, but also use nulls to delete values.

If I remember correctly, we decided to preserve null values in arrays to allow for "companion arrays" where the values in one array and the values at the same index in one (or more) other arrays form tuples (i.e., (a[i],b[i],c[i])). If null values were not preserved, this would not be possible since the arrays would go out of sync instead of keeping the same length and positions. I'm not sure if something like that is currently used or ever will be, however. The fact that arrays are homogeneous at least makes the type determination easier (unless it's all nulls). The expressed meaning as "the thing you were looking for is not there" is the same in both cases - an attribute not being present at all or no value at the index in one of the companion arrays. The fact that on transport the nulls have to be replaced as you mentioned above is already part of a compromise to allow for this use case and comes with information loss since null and the default values are not distinguishable.
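
The companion-array argument can be sketched in a few lines of Python. The array names and values below are made up for illustration; the point is only that dropping nulls shifts positions, while preserving them keeps the tuples aligned.

```python
# Hypothetical attribute values: two "companion" arrays read as tuples by index.
names = ["alice", None, "carol"]
ages = [30, 41, None]

# With nulls preserved, position i in one array still matches position i
# in the other.
preserved = list(zip(names, ages))

# If nulls were dropped instead, the arrays would shift and go out of sync.
dropped_names = [n for n in names if n is not None]
dropped_ages = [a for a in ages if a is not None]
misaligned = list(zip(dropped_names, dropped_ages))
```

In the second case "carol" ends up paired with 41, a value that originally sat at a different index.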

@tedsuo
Contributor

tedsuo commented Sep 11, 2020

Attribute values of null are considered to be not set and get discarded as if that Attribute has never been created. As an exception to this, if overwriting of values is supported, this results in removing the attribute.

@arminru what I mean by command is the following: if null is not a valid value, but it is accepted as input and results in a change, then it is a command. Specifically, it removes a previous key which has been set. This is unexpected behavior. What @jmacd and I are requesting is that instead of deleting the key, the key is set to null. You can think of it as a tombstone marker rather than a removal of the key. Given that our protocol already supports null as a value, this would be consistent and expected behavior. I understand that the current proto representation of null is a bit of a hack, but I don't think that is an issue to be considered when designing the API.

To put it another way: If we want to have the ability to delete attribute keys, we should add a DeleteAttribute method. This is much cleaner than "if you happen to have null in your language you might get a delete command." It is also much safer, as it is explicit, and an accidental call to SetAttribute with null would not result in a situation which is difficult to debug. But I don't think we actually want DeleteAttribute. We just want to deal with nulls consistently. If we do need delete, that should be a separate issue from supporting null values for attributes in our protocol and our SetAttribute API.

I understand how there are conventions in some languages around null being used as a stand-in for delete. But I hope my explanation clarifies how there is a difference between recording null and deleting a key.
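
To make the distinction between recording null and deleting a key concrete, here is a minimal sketch. The `Span` class and both method names are hypothetical; no OpenTelemetry API defines a `delete_attribute` method as far as this thread states.

```python
class Span:
    """Toy span (hypothetical), not an OpenTelemetry SDK class."""

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, key, value):
        # Every value, including None, is recorded. A None acts as a
        # tombstone: the key stays visible, its value says "nothing here".
        self.attributes[key] = value

    def delete_attribute(self, key):
        # Hypothetical explicit removal, i.e. the DeleteAttribute idea.
        self.attributes.pop(key, None)


span = Span()
span.set_attribute("example.key", "value")
span.set_attribute("example.key", None)   # tombstone, key still present
tombstoned = dict(span.attributes)

span.delete_attribute("example.key")      # explicit removal
after_delete = dict(span.attributes)
```

With this split, an accidental null passed to `set_attribute` leaves a visible trace in the data instead of silently removing a key.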

@Oberon00
Member

Oberon00 commented Sep 11, 2020

@tedsuo

if null is not a valid value, but it is accepted as input

I think we should make clear that if possible null should not be accepted as input. However, for example in Java, your options for that are limited. You could document it in Javadoc and add @Notnull annotations but in most cases passing null will still compile without warnings.

Given that our protocol already supports null as a value, this would be consistent and expected behavior

Our protocol DOES NOT support null currently. The protobuf definitions would indeed support it, but it's not part of the allowed set of value types for attributes now, and consumers that do more than just forwarding are probably not prepared to handle it.

Honestly, it is a bit bizarre that we would accept nulls in an array, but also use nulls to delete values.

I have also recently adopted the opinion that deleting on null was not the best choice according to the "least-surprising" principle.

Attribute values of null are considered to be not set and get discarded as if that Attribute has never been created. As an exception to this, if overwriting of values is supported, this results in removing the attribute.

The "as an exception" wording (which is warranted here) is an indicator of this too. However, I think the safest option for now is that such calls are just always ignored (maybe with an off-by-default option for logging them).

A technical problem with accepting nulls is dynamically typed languages, where you have no null-string/integer but just "null" with its own "NullType" (or None and NoneType in Python), so the ""/0/false replacement is not possible since you can't choose a type. A semantic problem is the lost distinction between "not there" and "there and false". This also exists to some extent in arrays (though null in an array at least conveys "we expected something to be there / an empty slot was there"), but the companion-array case is more important there IMHO.
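
The typing problem can be shown directly in Python. The `DEFAULTS` table and the `replacement_for` helper are hypothetical names for this sketch; they model an exporter trying to pick a type-appropriate default from the value alone.

```python
# Per-type defaults an exporter might fall back to (an assumption of this sketch).
DEFAULTS = {str: "", int: 0, float: 0.0, bool: False}


def replacement_for(value):
    """Return the default for the value's own type, or fail if there is none."""
    try:
        return DEFAULTS[type(value)]
    except KeyError:
        raise TypeError(f"no default for {type(value).__name__}") from None


# A typed value tells us which default applies.
string_default = replacement_for("abc")
int_default = replacement_for(7)

# None carries no attribute type at all: its type is just NoneType.
try:
    replacement_for(None)
    none_error = None
except TypeError as exc:
    none_error = str(exc)
```

In Java a null `String` still has a declared type to fall back on; here the only type information available is `NoneType`, so no default can be chosen.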

An alternative I could live with would be:

null values MUST be preserved as-is (i.e., passed on to span processors / exporters as null). If exporters do not support exporting null values, they SHOULD discard the attribute as if it was never there in the first place. They SHOULD NOT replace it with any value that would be valid for some normal non-null attribute type (for example, using empty string/0/false could cause wrong interpretations on the consumer side more likely than not sending the attribute at all).

But note that IMHO the OTLP protocol cannot be extended until after we reach the next "breaking change allowed" cycle (the trace part of that protocol has been declared stable). No strong opinion here though, as Dynatrace does not yet use OTLP.
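
A small sketch of the two exporter strategies discussed in this comment, assuming attributes are held in a plain dict. Both function names and the attribute keys are hypothetical and exist only for this example.

```python
def export_dropping_nulls(attributes):
    """Suggested behavior: omit null-valued attributes when the wire
    format cannot express null."""
    return {k: v for k, v in attributes.items() if v is not None}


def export_with_empty_string(attributes):
    """Discouraged behavior: replace null with an empty string."""
    return {k: ("" if v is None else v) for k, v in attributes.items()}


recorded = {"example.unknown": None, "example.empty": "", "example.count": 3}

dropped = export_dropping_nulls(recorded)
replaced = export_with_empty_string(recorded)
```

After replacement, `example.unknown` and `example.empty` carry the same value, so a consumer can no longer tell "unknown" from "known to be empty". Dropping loses the key but does not assert a wrong value.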

@tedsuo
Contributor

tedsuo commented Sep 14, 2020

@Oberon00 I think you have nailed it. All other details aside, the core issue is "the principle of least surprise."

IMHO, the most uniform solution would be to say that nulls are encoded as empty string. This could also scoop up other odds and ends, such as "undefined" in javascript. It is not perfect of course, but it would be implementable today, and universal across languages. I would prefer this as a temporary solution over dropping nulls, only because silence is hard to debug. I do understand the resistance, as it is not entirely semantically correct. My point is that dropping nulls is not semantically correct either, and empty string is easier to deal with. There is no great solution here. 🤷‍♀️

Either way, it sounds like there is agreement that, in the long run, we should add null as a value type.

@Oberon00
Member

Oberon00 commented Sep 15, 2020

IMHO, the most uniform solution would be to say that nulls are encoded as empty string.

I disagree. In Java null does not have its own type. Instead you have a null String, a null Boolean, etc. So if you used Boolean b = null; span.setAttribute("myattr", b);, it would be very surprising if an empty string was the result in the exported data.

@Oberon00
Member

Either way, it sounds like there is agreement that, in the long run, we should add null as a value type.

Actually, I don't like that. The problem is just that in some languages you have to somehow deal with null. Following the principle of least surprise, null as first-class value would indeed seem to be best for these languages. But then we may need awkward APIs in other languages for generating these nulls. And having one more value type may lead to surprising semantic conventions, etc.

@Oberon00
Member

Maybe the question we should ask is: Is null a semantically useful value type? We already can use empty strings, empty arrays, boolean false, unset attributes. Do we need null in addition? It might be used to signal "I looked for this attribute, but did not find it". Is that useful? Are there any other use cases?

@tigrannajaryan
Member

The current version of OTLP would not allow expressing a null value but we would rather have to convert it to the default value of either string, bool, int or double.

@arminru the protocol does support null values, see https://github.com/open-telemetry/opentelemetry-proto/blob/313a868be259dce6c6516dd417d3ad5fd3321acf/opentelemetry/proto/common/v1/common.proto#L29

// The value is one of the listed fields. It is valid for all values to be unspecified
// in which case this AnyValue is considered to be "null".

AnyValue is defined so that any JSON can be represented in it and null needed to be representable.

I agree with @jmacd that being able to represent anything that JSON can represent is a valid goal.

OTLP allows this today, we just chose to limit it in the API.
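
As a rough sketch of how any parsed JSON value could map onto an AnyValue-like shape, the Python below uses plain dicts keyed by the oneof field names from common.proto (`string_value`, `bool_value`, `int_value`, `double_value`, `array_value`, `kvlist_value`). This is not generated protobuf code, the nested `ArrayValue`/`KeyValueList` wrappers are flattened for brevity, and `to_any_value` is a hypothetical helper. An empty dict stands for an AnyValue with no field set, i.e. null.

```python
import json


def to_any_value(value):
    """Map a parsed JSON value onto a dict shaped like AnyValue (sketch only)."""
    if value is None:
        return {}  # no field set: considered "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"bool_value": value}
    if isinstance(value, int):
        return {"int_value": value}
    if isinstance(value, float):
        return {"double_value": value}
    if isinstance(value, str):
        return {"string_value": value}
    if isinstance(value, list):
        return {"array_value": [to_any_value(v) for v in value]}
    if isinstance(value, dict):
        return {
            "kvlist_value": [
                {"key": k, "value": to_any_value(v)} for k, v in value.items()
            ]
        }
    raise TypeError(f"not a JSON value: {type(value).__name__}")


encoded = to_any_value(json.loads('{"user": null, "tags": ["a", null]}'))
```

Both the top-level `user` null and the null inside `tags` survive as empty values rather than being dropped or replaced.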

@Oberon00
Member

I agree with @jmacd that being able to represent anything that JSON can represent is a valid goal.

I don't understand the reason behind that. Can you elaborate?

@tigrannajaryan
Member

I agree with @jmacd that being able to represent anything that JSON can represent is a valid goal.

I don't understand the reason behind that. Can you elaborate?

If I have source data that is JSON, I would want to be able to record it in an attribute in a lossless manner. This is already supported in the Log data model (both Body and Attributes explicitly allow any value - which is defined to be JSON-like) where it is needed due to legacy. Even if there is not an immediate need to record JSON data in Span attributes, just for consistency and uniformity of traces and logs I would support it. Again, in OTLP it is already completely uniform (both Span and Log attribute values are represented by AnyValue, which supports null).

@arminru
Member

arminru commented Sep 16, 2020

@tedsuo I don't really see the difference between overwriting and explicitly deleting an attribute. If we want to avoid deleting attributes (by either setting null in languages that have it or overwriting it with an empty string, for example, and thus effectively deleting the previous value), we would have to prohibit overwriting of attributes entirely. This would be fine with me and would particularly make sense for attributes that were used for a sampling decision (i.e., already present at span start), but that's a different discussion.

Again, deleting the attribute if null is set is not an intended feature but rather a well-defined error handling behavior if this (currently deemed invalid) input is passed. I would also be fine with defining SetAttribute calls with invalid input as a no-op and therefore leaving the previously set value untouched since I think both are valid options.

@arminru
Member

arminru commented Sep 16, 2020

@tigrannajaryan Thanks for pointing me to that line in OTLP. I missed the comment there that specifies how null can be expressed.

I'm not convinced that representing any JSON value to be set as an attribute is a requirement that we need to fulfill. One can set an entire JSON object (including the literal string "null") already and if the JSON is decomposed and each value is set as a separate attribute, a null value can be expressed by simply not setting that attribute.

Is there any other use case that an explicit null value would solve? I can't come up with anything where we would need to distinguish between an attribute not being set or being set to null/undefined. If there is any, please let me know so I can understand the problem better.

@tigrannajaryan
Member

I'm not convinced that representing any JSON value to be set as an attribute is a requirement that we need to fulfill.

@arminru I agree, it is not a requirement, but a nice to have feature. The problem with the current definition of SetAttribute(key, null) is that it will be a breaking change if later we decide that we do want to be able to represent JSON values. In that case the semantics of SetAttribute(key, null) will have to change incompatibly.

I am not convinced of the opposite: that it is a good idea to use null as a magic value for erasing attributes. This is what spec says today:

As an exception to this, if overwriting of values is supported, this results in removing the attribute.

I don't like this part of the spec. I'd rather have a DeleteAttribute API instead. Compare

span.SetAttribute("key.to.remove",null)

with

span.DeleteAttribute("key.to.remove")

I prefer the second version since it is immediately clear what it does. With the first version, unless I read the documentation, I will likely misunderstand what it does.

I don't see the point of the current definition of null values. I think it is unnatural, uses magic values, ignores the benefit of being able to represent JSON data and is not ergonomic as an API.

Is there any other use case that an explicit null value would solve?

I think there are. For example:
OpenTelemetry conventions sometimes use the presence of an attribute as an indicator (remember we removed "component" and decided that the presence of an attribute indicates it instead). It means if I want the indicator to be present I have to record a specific attribute. However, what if I don't know the value of the attribute? Recording an empty string can be misleading since an empty string may be a valid specific value for that attribute. In such a case the semantic convention could say that if the value is null then it should not be considered equal to an empty string but is an unknown value not equal to anything (this is very similar to NULL semantics in SQL). The backends could then both enjoy the benefit of knowing that the indicator is present and avoid equating its value to an empty string. Today it is not possible.
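
The three states described here (absent, present with unknown value, present with a concrete value) could be told apart by a backend roughly as follows. The `describe` helper and the attribute key are hypothetical and only illustrate the idea.

```python
def describe(attributes, key):
    """Classify an attribute as absent, present-but-unknown, or present."""
    if key not in attributes:
        return "absent"
    value = attributes[key]
    if value is None:
        # Indicator is present, but the value is unknown (SQL NULL-like).
        return "present, value unknown"
    return f"present, value {value!r}"


absent = describe({}, "example.indicator")
unknown = describe({"example.indicator": None}, "example.indicator")
empty = describe({"example.indicator": ""}, "example.indicator")
```

Without a null value type, the second state has to be collapsed into either the first or the third.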

@arminru
Member

arminru commented Sep 16, 2020

@tigrannajaryan

I see the deletion of attributes as a different concern.
If only this was the problem, as stated above (#797 (comment)), we could just change this error handling behavior for null being passed to SetAttribute to a no-op and therefore preserving the previous value or prohibit overwriting of attributes entirely (which could make sense IMHO, particularly for attributes that were used for a sampling decision (i.e., already present at span start), but that's a different discussion).

Changing the behavior from either deleting or no-op to preserving null as a meaningful value, however, would be a breaking change in both cases, so we should settle on one of the three options before GA.

Regarding span attributes representing JSON values, see my other comment above (#797 (comment)).

Thanks for explaining that potential use case. I'm still not really convinced of the benefit of saying "I know that there is an attribute named X but don't know or can't determine the value I should set" by explicitly setting it to null but in general it makes sense to me. If others see the need for that independent of the deletion issue that could be solved differently as mentioned above as well, I would agree on making null a valid, meaningful value. Consumers can still decide if they want to treat the null value as such or consider the attribute as not set in that case.

If we do that, we should do so before GA, make consumers aware of it and also consider exporters and protocols that cannot express null values by using a definition like @Oberon00 suggested above:

Null values for attributes MUST be preserved as-is (i.e., passed on to span processors / exporters as null).
If exporters do not support exporting null values, they SHOULD discard the attribute as if it was
never there in the first place.
They SHOULD NOT replace it with any value that would be valid for some non-null attribute type (using an
empty string, `0` or a `false`, for example, could cause wrong interpretations on the consumer side more
likely than not sending the attribute at all).

(probably with MUST (NOT) instead of SHOULD (NOT))

@tigrannajaryan
Member

My opinion is that we should do this:

  • Make null a valid value for attributes. We don't necessarily need to have a special API for recording null values, just don't explicitly prohibit null in the spec. Remove from the spec the sentence that says setting the attribute value to null deletes the attribute. The consequence will be that for some languages SetAttribute will allow recording null value naturally (e.g. in Go SetAttribute("somekey", nil) should just work). For languages where null does not exist or needs a special API there will be no way to record it, which is fine.
  • Add language that @Oberon00 suggested that recommends how exporters that can't represent null should behave. (SHOULD sounds good to me. I think MUST is too prescriptive. Exporter authors may have better knowledge on what's best way to represent it).

If there is a desire for DeleteAttribute it can be added any time in the future. Also, if there is a desire to allow null in languages where null value does not exist or needs a separate call we can add that as another API in the future.

@arminru
Member

arminru commented Sep 17, 2020

@tigrannajaryan

For languages where null does not exist or needs a special API there will be no way to record it, which is fine.

That would, however, not allow recording any JSON value on attributes as desired above, right?
Either way, for GA I would also not increase scope by having to discuss an API to allow artificially recording null values since this could be solved by a compatible change after GA.

SHOULD sounds good to me. I think MUST is too prescriptive. Exporter authors may have better knowledge on what's best way to represent it

I also think that exporter authors will know best on how to represent it. Making it a SHOULD and stating the problems that would arise from using indistinguishable default values should be sufficient, yes.

If there is a desire for DeleteAttribute it can be added any time in the future. Also, if there is a desire to allow null in languages where null value does not exist or needs a separate call we can add that as another API in the future.

Do you think prohibiting overwriting of values (either for all attributes or only the ones used at start for sampling decisions) is worth discussing before GA? This would be an incompatible change but prohibiting it now and allowing it later on would be compatible (if we declare it as undefined behavior rather than a reliable no-op). This would also remove the "implicit" DeleteAttribute feature we have now by overwriting with empty strings and such and soon also null, which effectively deletes the attribute's value (yet leaves the key behind, which is only a subtle difference imho).

@tigrannajaryan
Member

That would, however, not allow recording any JSON value on attributes as desired above, right?

Correct.

Do you think prohibiting overwriting of values (either for all attributes or only the ones used at start for sampling decisions) is worth discussing before GA?

I am not sure I fully understand the connection here. What in the current definition of the spec regarding overwriting attributes you see as problematic that needs a discussion? Spec says:

Setting an attribute with the same key as an existing attribute SHOULD overwrite the existing attribute's value.

This seems like a fine requirement to me and appears to be a natural behavior for a Set function.

@arminru
Member

arminru commented Sep 17, 2020

@tigrannajaryan It was stated above that deleting of attributes is deemed problematic or undesirable but I don't see a big difference between allowing to overwrite attributes (especially with empty strings or null) and deleting them entirely, so I was wondering if there are concerns regarding overwriting as well.
I think you personally, however, were not opposed to deleting per se but only the fact that SetAttribute(key, null) leading to a deletion is not as descriptive as a dedicated DeleteAttribute API, right?

@tigrannajaryan
Member

I think you personally, however, were not opposed to deleting per se but only the fact that SetAttribute(key, null) leading to a deletion is not as descriptive as a dedicated DeleteAttribute API, right?

Yes, that's correct.

@arminru
Member

arminru commented Sep 17, 2020

Sounds like we have an agreement 🙂
I'll prepare a PR allowing null values for attributes and having them recorded and exported tomorrow.

8 participants