-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reconsider null behavior for span attributes. #797
Comments
See also #503 where I made the same request. Should we add a first-class API to support removing attributes? I would say no. This has already been debated (in #503) -- OpenTracing alsodid not support deleting attributes. |
@jmacd I think there are two different concerns here:
|
In keeping with JSON as a target, I see nothing wrong with placing "null" inside an array. If you have an Array in Java and some elements are null, I'd expect the output value to be a list with some strings and some nulls. |
@jmacd I think this is again about the "Array object" being null not an object inside the array. We already cover null inside arrays, see https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/common/common.md#attributes
|
I don't think we should. Not all languages have a concept of null and if we want that, we would need to change the attribute definition to allow an additional primitive type of |
From the issue triage mtg today, talked with @carlosalberto and moving this to P1 to highlight the need for a decision to either have a yes or no on this issue before the trace portion of the spec is frozen. |
@arminru would you be able to take this issue and drive it to resolution? |
I agree with @jmacd that if a user makes a valid call to SetAttribute, we should record the value. Silently ignoring a call to SetAttribute due to an unexpected null would make some scenarios really difficult to debug. The most straight forward and consistent thing is to record all valid values in some way. Since we do not want to guard against nulls, we should record them in some manner (see below). Languages which do not have null simply do not record any nulls. These languages already have to deal with parsing data formats which include null, so this is nothing new. I also recommend we remove the deleting of values. OpenTelemetry does not have any kind of command pattern in its data, except for this one case. I believe it will create trouble for us down the road. Instead, the Collector should be used for this kind of data massaging. Given that (as mentioned) not all languages have null, the delete pattern cannot be implemented everywhere anyways. Honestly, it is a bit bizarre that we would accept nulls in an array, but also use nulls to delete values. I think this is just case where SetAttribute ended up evolving in a different direction since the delete feature was added, and we now need to account for it. May I suggest that the simplest and most regular solution is to accept nulls, and apply the current definition of null handling in arrays to null handling in general.
|
@andrewhsu Sure 🙂 |
If we would allow null as a meaningful value, this would have to be properly handled by the protocols, exporters and consumers. The current version of OTLP would not allow expressing a null value but we would rather have to convert it to the default value of either string, bool, int or double. In languages where null can be passed without specifying which "type" of null it is, we would already have to guess here. If we would allow null and express it using the type's default value, we would lose information as those two values would no longer be distinguishable. I am certainly against that, since I don't think we can just export null as I think the attribute not being present expresses the meaning of null better than the default values for each type. If we would add support for a proper null value, we would have to extend OTLP to express this properly, and I don't think we will be able to or should do this before GA. I am open, however, to discuss this for after GA but would only go for a proper end-to-end solution here rather than making some last minute compromises. |
and @tedsuo
I don't think it was intended to introduce a new "delete attribute" feature but rather to specify how to deal with a |
What do you mean by that? I don't see a big difference in allowing overwriting existing values with something else (e.g., an empty string if "deleting" is not possible) or removing the previously set attribute entirely. |
If I remember correctly, we decided to preserve null values in arrays to allow for "companion arrays" where the values in one array and the values at the same index in one (or more) other arrays form tuples (i.e., |
@arminru what I mean by command is the following: if To put it another way: If we want to have the ability to delete attribute keys, we should add a I understand how there are conventions in some languages around null being used as a stand-in for delete. But I hope my explanation clarifies how there is a difference between recording null and deleting a key. |
I think we should make clear that if possible
Our protocol DOES NOT support null currently. The protobuf definitions would indeed support it, but it's not part of the allowed set of value types for attributes now, and consumers that do more than just forwarding are probably not be prepared to handle it.
I have also recently adopted the opinion that deleting on
The "as an exception" wording (which is warranted here) is an indicator of this too. However, I think the safest option for now is that such calls are just always ignored (maybe with an off-by-default option for logging them). A technical problem for accepting nulls are dynamically typed languages where you have no null-string/integer but just "null" with it's own "NullType" (or None and NoneType in Python) where the ""/0/false replacement is not possible since you can't choose a tpye. A semantic problem is the lost distinction between "not there" and "there and false". Which also exists to some extent in arrays (though null in an array at least conveys "we expected something to be there / an empty slot was there") but the companion-array case is more important there IMHO. An alternative I could live with would be:
But note that IMHO the OTELP protocol cannot be extended until after we reach the next "breaking change allowed" cycle (the trace part of that protocol has been declared stable). No strong opinion here though, as Dynatrace does not yet use OTELP. |
@Oberon00 I think you have nailed it. All other details aside, the core issue is "the principal of least surprise." IMHO, the most uniform solution would be to say that nulls are encoded as empty string. This could also scoop up other odds and ends, such as "undefined" in javascript. It is not perfect of course, but it would be implementable today, and universal across languages. I would prefer this as a temporary solution over dropping nulls, only because silence is hard to debug. I do understand the resistance, as it is not entirely semantically correct. My point is that dropping nulls is not semantically correct either, and empty string is easier to deal with. There is no great solution here. 🤷♀️ Either way, it sounds like there is agreement that, in the long run, we should add null as a value type. |
I disagree. In Java |
Actually, I don't like that. The problem is just that in some languages you have to somehow deal with null. Following the principle of least surprise, null as first-class value would indeed seem to be best for these languages. But then we may need awkward APIs in other languages for generating these nulls. And having one more value type may lead to surprising semantic conventions, etc. |
Maybe the question we should ask is: Is null a semantically useful value type? We already can use empty strings, empty arrays, boolean false, unset attributes. Do we need null in addition? It might be used to signal "I looked for this attribute, but did not find it". Is that useful? Are there any other use cases? |
@arminru the protocol does support null values, see https://github.com/open-telemetry/opentelemetry-proto/blob/313a868be259dce6c6516dd417d3ad5fd3321acf/opentelemetry/proto/common/v1/common.proto#L29
AnyValue is defined so that any JSON can be represented in it and I agree with @jmacd that being able to represent anything that JSON can represent is a valid goal. OTLP allows this today, we just chose to limit it in the API. |
I don't understand the reason behind that. Can you elaborate? |
If I have a source data that is a JSON I would want to be able to record it in an attribute in a loseless manner. This is already supported in Log data model (both Body and Attributes explicitly allow |
@tedsuo I don't really see the difference between overwriting and explicitly deleting an attribute. If we want to avoid deleting attributes (by either setting null in languages that have it or overwriting it with an empty string, for example, and thus effectively deleting the previous value), we would have to prohibit overwriting of attributes entirely. This would be fine with me and would particularly make sense for attributes that were used for a sampling decision (i.e., already present at span start), but that's a different discussion. Again, deleting the attribute if null is set is not an intended feature but rather a well-defined error handling behavior if this (currently deemed invalid) input is passed. I would also be fine with defining |
@tigrannajaryan Thanks for pointing me to that line in OTLP. I missed the comment there that specifies how null can be expressed. I'm not convinced that representing any JSON value to be set as an attribute is a requirement that we need to fulfill. One can set an entire JSON object (including the literal string "null") already and if the JSON is decomposed and each value is set as a separate attribute, a null value can be expressed by simply not setting that attribute. Is there any other use case that an explicit null value would solve? I can't come up with anything were we would need to distinguish between an attribute not being set or being set to null/undefined. If there is any, please let me know so I can understand the problem better. |
@arminru I agree, it is not a requirement, but a nice to have feature. The problem with the current definition of SetAttribute(key, null) is that it will be a breaking change if later we decide that we do want to be able to represent JSON values. In that case the semantics of SetAttribute(key, null) will have to change incompatibly. I am not convinced of the opposite: that it is a good idea to use null as a magic value for erasing attributes. This is what spec says today:
I don't like this part of the spec. I'd rather have DeleteAttribute API instead. span.SetAttribute("key.to.remove",null) with span.DeleteAttribute("key.to.remove") I prefer the second version more since it is immediately clear what it does. The first version, unless I read the documentation I will likely misunderstand what it does. I don't see the point of the current definition of null values . I think it is unnatural, uses magic values, ignores the benefit of being able to represent JSON data and is not ergonomic as an API.
I think there are. For example: |
I see the deletion of attributes as a different concern. Changing the behavior from either deleting or no-op to preserving null as a meaningful value, however, would be a breaking change in both cases so we should settle for either of the three options before GA. Regarding span attributes representing JSON values, see my other comment above (#797 (comment)). Thanks for explaining that potential use case. I'm still not really convinced of the benefit of saying "I know that there is an attribute named X but don't know or can't determine the value I should set" by explicitly setting it to null but in general it makes sense to me. If others see the need for that independent of the deletion issue that could be solved differently as mentioned above as well, I would agree on making null a valid, meaningful value. Consumers can still decide if they want to treat the null value as such or consider the attribute as not set in that case. If we do that, we should do so before GA, make consumers aware of it and also consider exporters and protocols that cannot express null values by using a definition like @Oberon00 suggested above:
(probably with MUST (NOT) instead of SHOULD (NOT)) |
My opinion is that we should do this:
If there is a desire for |
That would, however, not allow recording any JSON value on attributes as desired above, right?
I also think that exporter authors will know best on how to represent it. Making it a SHOULD and stating the problems that would arise from using indistinguishable default values should be sufficient, yes.
Do you think prohibiting overwriting of values (either for all attributes or only the ones used at start for sampling decisions) is worth discussing before GA? This would be an incompatible change but prohibiting it now and allowing it later on would be compatible (if we declare it as undefined behavior rather than a reliable no-op). This would also remove the "implicit" |
Correct.
I am not sure I fully understand the connection here. What in the current definition of the spec regarding overwriting attributes you see as problematic that needs a discussion? Spec says:
This seems like a fine requirement to me and appears to be a natural behavior for a |
@tigrannajaryan It was stated above that deleting of attributes is deemed problematic or undesirable but I don't see a big difference between allowing to overwrite attributes (especially with empty strings or null) and deleting them entirely, so I was wondering if there are concerns regarding overwriting as well. |
Yes, that's correct. |
Sounds like we have an agreement 🙂 |
The current spec of span attribute defines
null
value as a way to delete an existing attribute: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#set-attributesAnother approach is to ignore
null
values as the call toSet Attributes
was never made and thus, preserving the original value of the attribute.This came up in this discussion: #777 (comment)
This issue is meant to discuss which of the two approaches we should follow.
The text was updated successfully, but these errors were encountered: