Should event fields be part of the global attribute registry? #505
Comments
Just to document my take on this and this will include some generic definitions on how "events" should be defined. By Default for the
Optionally
Optional (additional) Attributes (not the
|
I’ve stated my opinion on this before and engaged in lengthy discussions in my Event payload proposal document. I’ll recap my stance and offer some old and new arguments: Event fields should be part of the global attribute registry and should be subject to the same naming rules as other attributes. In a nutshell, the discussion comes down to a question of whether the constrained environments (limited network bandwidth, CPU, package size) of certain client scenarios should influence how OpenTelemetry models semantic conventions for Events (and presumably other signals as well). The answer should be no. We should clearly decouple modeling the shape of telemetry (i.e. semantic conventions) from how telemetry is sent over a network. Accepting that Event fields are different from attributes on other Signals would result in:
Another aspect of the discussion is whether including Event fields in the global attribute registry is unnecessarily cumbersome for people looking to contribute Event semantic conventions. Do we want a bunch of domain-specific one-off attributes? Do we want contributors of Event semantic conventions to have to understand the universe of attributes such that they re-use an existing attribute when the concept has the same semantics? Do we want domain experts to have to define their Events in collaboration with semantic convention maintainers? Yes, yes, and yes. The process of defining semantic conventions is intended to be low overhead and accommodate domain specific concepts. Defining Event semantic conventions should be subject to the same process and scrutiny as any other convention. There’s no difference between a group wanting to define domain specific Event conventions and wanting to define domain specific Span conventions. Both require defining how domain specific concepts fit into semantic conventions. Up to this point, I’ve been expressing my own opinion. However, we talked about this issue in the 11/15/23 TC meeting and there was agreement among the present TC members that Event fields should follow the same process as other attributes defined in semantic-conventions. We discussed that OTLP as currently defined may not be suitable for certain extremely resource constrained environments, and that the Client WG in collaboration with the Event WG should work to prototype changes to the protocol to accommodate these environments. Ideas included various versions of the symbol dictionary, or using a columnar format like the OpenTelemetry Arrow Project, or even a stateful domain specific protocol. Note that this opinion from the TC reflects consistency on a previous vote about whether metric attributes should be namespaced. The last comment on that issue is particularly relevant here:
An interesting aspect of this problem is that it is not limited in scope to the Client WG, or to Event WG. It involves questions of data modeling that squarely intersect with the Semantic Convention WG, it has implications on how the different telemetry signals interact with each other which is a clear specification level issue which is maintained by the TC, and it has implications on all maintainers since Events will be used outside of the client instrumentation space. So far, we’ve heard a lot from the perspective of the Client and Event WG, and we’ve discussed this in the TC. I’m interested in whether there are any additional perspectives from other semantic-conventions contributors (other than the maintainers who are also on the TC) and from language maintainers? Are all the arguments on the table or are there additional things we haven’t yet considered? |
I want to document my VERY STRONG disagreement with this approach on several levels that I'm not quite sure where to start... Strongly agree:
Strongly disagree:
Possible "tweaks" which would make the "logical" prefixing of an
A "general" OTLP transport improvement(s)
I'm going to (try to) stop now, and I'm sure we will (and need to) discuss further in the Event WG (and other SIGs). For reference, I have created a draft white paper (not yet public -- being reviewed and may not be able to be fully public) on why Microsoft is not yet using OpenTelemetry for the Web (some of this is summarized above, and I've talked in lots of SIGs about the reasons). This draft internal paper is 17 pages long with all of the issues and what is required. And for clients, breaking things down and trying to address (and / or punt) different concerns down the road doesn't work because they are all inter-related to different degrees because
To Date OpenTelemetry (appears) (IMHO) to have been driven by the "final" storage format used by many organizations and then creating wrappings around these "storage" formats for creating on servers to enable the data to be quickly dumped, and for high volume servers this is an excellent approach -- it just doesn't work for ALL client scenarios / languages. |
No. The attribute naming rules use strong normative language for OpenTelemetry authors, but we can't enforce what application authors do for custom events. At best, we can define recommendations. For events defined outside OpenTelemetry and bridged over, we should minimally transform the original payload with mapping rules. The naming rules are for OpenTelemetry semantic conventions.
We shouldn't jeopardize the coherence and clarity of the telemetry semantic conventions for the requirements of extremely constrained environments. We must decouple data modeling from transmission, and work to accommodate constrained environments with transmission level solutions.
This is an example of a "transmission optimization" described in this comment, which is compatible with the notion of treating event fields as attributes.
You're describing a breaking change to the OTLP JSON protobuf encoding. A variation which doesn't include the type is possible with a new protocol. This is still possible, since as this comment points out, "any transmission optimizations can still be built".
It sounds like your issues go well beyond the representation of Event fields, bubbling into the SDK specification. The SDK specifications are stable, of course, and it's hard to imagine evolving them in a way that would accommodate the extreme requirements you describe. If the issues exist for all client use cases, it may suggest that a domain specific SDK might need to be developed for the specialized conditions which is not subject to the normal SDK specification language. If the issue only occurs in client applications of a certain scale (or under certain constraints), it may be worth considering developing a custom SDK for your specific use case. The OpenTelemetry decoupling of API and SDK was designed for this purpose. |
My view is the following - With that said, I am not 100% convinced that event payload fields need to be in the same global registry as regular attributes. Currently, attribute semantic conventions have a global scope, while event fields are scoped to the particular event. The combination of the event name and a field name should be enough to understand what the field represents. And in order to enable reusability, perhaps it should be allowed to use an attribute from the global registry as an event field. I want to give one example. Assume that there is an event that represents navigation in a browser environment. We want to capture the previous URL and the current URL. We cannot reuse I would like to hear from @open-telemetry/specs-semconv-maintainers and @open-telemetry/specs-semconv-approvers about this topic. If there is a consensus that event fields should be in the global registry from this group and the TC, then I will be in favor as well in order to make progress. |
If the answer to this is "no", where else would those fields be defined? Wouldn't they be defined at all? Or would there be another event-specific registry independent of the semantic conventions attribute registry? |
I agree with everything that @jack-berg wrote in #505 (comment). Some additional perspective:
@martinkuba In ECS, we have the concept of reusing field definitions in other namespaces to address exactly this use case. For example, we define We also discussed introducing a similar concept in OTel semantic conventions in the SemConv WG and have agreement that it's something we want to do. (The next step is to work on a proposal for the syntax and tooling to realize it.) Maybe that would help with the above concern. |
@AlexanderWert can you please point to an example in ECS describing the fields of payload for a given event type? |
@scheler Sure, it's mainly defined by the Elastic integrations which, with their documentation pages, are to some extent comparable to the semantic conventions in OTel. Let's consider a very concrete example: NGINX Ingress Controller. That page describes in great detail how an access log from the ingress controller is parsed into ECS fields. In OTel terms, that page describes the semantic conventions for NGINX Ingress Controller logs. (Please bear in mind that with ECS there's no differentiation between top-level fields, attributes and payload as there is with OTel / OTLP. But that doesn't make a difference for the below reasoning regarding payload.) In the "Exported fields table" you will find both:
Some inspirations we can get from this example are:
As part of the ECS contribution we also hope to contribute semantic conventions and parsers / integrations for many of these integrations to the OTel ecosystem (which will turn the ECS addition into actual, consumable value for the community). The event / logs signal will be crucial for that. |
@pyohannes They would still be defined here, but along with the definition of the event. For every event name, there would be a list of fields that should be included in the payload for that event. I suppose this is similar to how attributes are associated with metric names right now. |
@AlexanderWert Thank you for the context around ECS. This makes sense, especially if we are looking to adopt existing proven conventions from ECS. I think based on that it comes down to two questions: where are event-specific fields defined and should they have unique namespaced names? I see that ECS can combine common fields (akin to OTel global-registry attributes) and fields specific to the integration. The NGINX-specific fields are defined along with the integration, but not in the ECS schema? In the OTel semantic conventions, if we have a document that describes an event with its payload fields, and one of those fields is specific to this event (similar to
I think the key takeaway for me is that we want all field names to be self-descriptive without the need to know the event name. Is this correct? The way I have been thinking about this is that the payload of the event is always interpreted in the context of the specific event. We would not be looking at individual payload fields independently. In theory, the payload could contain other type of data than a list of name/value pairs (the LogRecord body is AnyValue). So, interpreting the payload data will always be done in the context of the given event. |
I agree, based on what Alex is saying about the The payload is as @martinkuba defines above is for the |
Performance
So, first, I think in order to consider performance arguments we need to see the data, measurements that compare different approaches. I don't see that data, which makes it impossible to assign relevant weight to the performance argument. I am going to ignore performance aspects for now until we see the supporting data. We can reconsider the design when there is additional evidence from benchmarking.
Design Principles
For now I think we should base our design on the following principles:
Proposal
Based on the principles I believe we should do the following. For events that originate externally:
For greenfield Otel events:
For all events:
|
In the interest of not blowing up the conversation too much, I have added some thoughts on this subject here: https://gist.github.com/breedx-splk/14bcc08a8cbcdef5b635cb5d22989ccc Best move is to read it top to bottom (it's not too too long, I promise) but here are the highlights for the distractible and overworked:
That doc doesn't at all address the idea of putting event bodies in different places depending on whether or not they are otel native. That idea is baffling to me. I think having a single consistent place to look for the event body makes way more sense than having it somehow be conditional. |
@breedx-splk I do not understand the definition of metadata you have and the example you bring.
Why is "balloon diameter" an event data and "balloon color" a metadata? What makes "balloon color" a "data about other data" (what other data?), but "balloon diameter" NOT a "data about other data"? [UPDATE]: Re-reading the proposal again, it seems the separation is between the required data and supplemental (optional) data. Is that a fair litmus test to use to understand the difference between the "event data" and "metadata" in your proposal? If that is not the correct litmus test can you please define one? |
I think the CloudEvents spec describes this well: attributes are used for context and custom data, while event data (payload/body) contains "domain-specific" information. By domain-specific I understand the data that is inherent to the event - the event can still be interpreted without its attributes but not without its data. The button click event in @breedx-splk's examples illustrates this well. Information about the button (e.g. its name) specifically describes the event, while screen dimensions or session ID describe the context in which the event occurred. With that said, I would like to add an argument in favor of separating the event data from other attributes. Systems that interpret events might (very likely) want to handle the event data separately from the other attributes, whether for storage, displaying in a UI, or doing some additional processing. If the event data is combined with other attributes, how would they be distinguished? The consumer would have to be aware of all field names of all events it wants to handle, or all names that are only used for context. I suppose there could be a convention like everything prefixed with the name of the event, but that does not sound like a clean solution. |
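To make the concern above concrete, here is a minimal sketch of the two shapes under discussion: event data kept separate from context attributes, and event data flattened into the attributes with an event-name prefix. The types, the `button.click` event name, and the field names are all hypothetical, chosen only for illustration. They are not OpenTelemetry API types or agreed conventions.

```typescript
// Hypothetical shapes for illustration only; these are not OpenTelemetry API types.
type Fields = { [key: string]: string | number };

interface SeparatedEvent {
  name: string;
  attributes: Fields; // context: session, screen, environment
  body: Fields; // event data: fields inherent to the event
}

interface FlattenedEvent {
  name: string;
  attributes: Fields; // context and event data mixed together
}

// Merge the body into the attributes, prefixing each body field with the event name.
function flatten(event: SeparatedEvent): FlattenedEvent {
  const attributes: Fields = {};
  for (const key of Object.keys(event.attributes)) {
    const value = event.attributes[key];
    if (value !== undefined) attributes[key] = value;
  }
  for (const key of Object.keys(event.body)) {
    const value = event.body[key];
    if (value !== undefined) attributes[event.name + "." + key] = value;
  }
  return { name: event.name, attributes };
}

// Recover the event data from a flattened event by relying on the prefix convention.
function extractEventData(event: FlattenedEvent): Fields {
  const prefix = event.name + ".";
  const data: Fields = {};
  for (const key of Object.keys(event.attributes)) {
    const value = event.attributes[key];
    if (value !== undefined && key.slice(0, prefix.length) === prefix) {
      data[key.slice(prefix.length)] = value;
    }
  }
  return data;
}

const click: SeparatedEvent = {
  name: "button.click",
  attributes: { "session.id": "abc123", "screen.width": 1920 },
  body: { name: "checkout", x: 10, y: 20 },
};

const flat = flatten(click);
const recovered = extractEventData(flat);
```

In the separated shape a consumer reads `body` directly. In the flattened shape it can only recover the event data if every producer follows the prefix convention, which is the coupling the comment above is wary of.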
The CloudEvents' I do not see "custom data" mentioned anywhere in CloudEvents spec. Please clarify what you refer to. @martinkuba It would be very useful to see a clear definition that explains the difference between attributes and event data. You say:
Is this the definition of what constitutes event data vs attributes? If it is, then I am afraid it is going to hinge on what "interpreting" means. In the examples of @breedx-splk's proposal "session.id" is shown as supplementary (so not "inherent" to the event). I can argue both ways: a) "session.id" is not necessary for interpreting the event, b) "session.id" is absolutely necessary for interpreting the event. Which way I argue depends on what exactly I am doing when I am "interpreting". If I am trying to build the click heatmaps in my UI, I don't need the "session.id", I just need coordinates of the click events. If I am trying to replay a particular user's actions over time I absolutely need the "session.id", I cannot interpret without it. So, to use the definition that you brought ("necessary for interpreting") I think we need to say more concretely what "interpreting" means. |
I was referring to the Extension Context Attributes. By context, I also meant any data that provides additional context about the event, such as the environment. |
@breedx-splk I read your writeup - thanks for clearly laying out your thoughts. Some comments: You try to lay out the distinction between the event and meta data with a couple of examples: a balloon, a person, and an example button click event. In all these cases, the distinction between payload and metadata is not obvious:
In all these cases, the difference seems to be somewhat arbitrary. As @tigrannajaryan points out, I think for the distinction to be made between the payload and other attributes, we need a convincing litmus test. So far, the ones I've seen are hand-wavy. You mention the following:
How do you reconcile that with the need to be able to evolve event schemas, and add additional fields over time? A consumer may expect version 1.0.0 of the event schema which includes fields The option 5 you present has no listed cons. I can think of a few: Terse fields have the negatives listed here. One thing we discussed in the SIG is the ability to opt into including the original unadulterated click data as provided by the browser in the OpenTelemetry event. The body is the natural place for this as it matches the description from the log data model:
Any solution in which the body is used for the first-party event fields requires another opt-in field in the schema to be able to accommodate this behavior. This is part of what convinced me of @tigrannajaryan's proposal. I now imagine event semantic conventions as each defining a name, set of attributes, and description of where the body comes from (i.e. typically a link to the API used to populate it). |
@tigrannajaryan has said
and @jack-berg said something very similar
So the designer of an event schema is tasked with deciding which fields (from the set of all available things in the known universe) belong to this event. For some events, the name may be enough, and the required set of fields is just the empty set. For others, the designer must choose which fields are necessary (required) for their specific problem domain, which fields are nice to have (optional), and which fields are not included (everything else). The balloon example is contrived, so I mentioned If there needs to be a litmus test for whether something is a core piece of an entity or if it's metadata, I suppose it involves answering these questions:
I'm not convinced that there exists a singular test that can be performed to make design decisions for data modeling or event design purposes. Design is a craft that requires art, experience, and consideration. The event designer is deciding what constitutes the event based on the needs of the emitter ("I want to communicate this important thing outward") and the requirements of known receivers ("I want to handle events that have these qualities"). Events should be designed to address something in a problem space. I think nobody would be surprised to find a coordinate in a mouse click event. They might be surprised though if it was a button click event. Those are different things that serve different purposes. If I receive a person entity and look inside and find a field called If it does, in fact, turn out that the balloon's color is helpful when processing these pop events, then by all means we should improve our schema design. Similarly, if we think we'll never mess with mylar, let's remove the surface material while we're at it. |
@jack-berg said
Yeah totally, I agree. I concede that point as a tad flimsy...but there are pathological cases where entities have all this extra stuff that most consumers don't want or need. Heavyweight burdensome cruft. It's nice not to have to deal with that.
Agreed that those items you called out are applicable to Option 5. I don't agree that all of those things are outright cons, but I understand where you are coming from. I'll try and keep my take short.
I've added a link to your comment as "cons" in my linked gist. Thanks for the input! |
Formatted response on the performance aspects of this (everyone should have comment access) |
And at Microsoft we do exactly that, we support the concept of "Custom Events" (ie. application derived events), we don't control what "fields" they include, we provide the ability to send a "custom event" and all of the |
I'll bring in a few additional perspectives to see if it may help. "Performance concerns" mean different things to different people. I don't need to see anyone's data to be convinced that a smaller encoding is better for performance. Speaking for the OpenTelemetry Protocol with Apache Arrow project, I think we should get away from using the cost of a "fully-expanded" representation to justify abbreviation. In general, we should not be encoding the name of the attribute with every data item on the wire, whether the name is fully qualified or a short, event-specific field name. Everyone arguing in this thread could have what they want, if OpenTelemetry would get better about encoding schema information. The schema in today's https://github.com/open-telemetry/otel-arrow does not specialize for the logs This is what you get using Arrow-based transport. By the time you are exporting that event, it will be decomposed into a number of rows; the schema of each row will be known and transmitted once per stream, and from a performance perspective it won't matter whether:
The OTel-Arrow group would be glad to discuss this; it's something @lquerel experimented with before the project donation. So, should event fields be part of the global attribute registry? No. I think events should have schemas. We should transmit schemas once per stream, we should transmit variables once per event, and OTel-Arrow will take care of reconstituting the fully-expanded representation at the end of the pipeline, where it's somebody else's performance concerns. It is a separate concern how schemas should be encoded and registered, and so on. This is currently under discussion in relation to a proposal, also by @lquerel, in Introducing Application Telemetry Schema in OpenTelemetry - Vision and Roadmap and the #otel-weaver channel. We think it would be best if the schema was used to compile strongly-typed interfaces, which would compile directly into an efficient implementation, which would alleviate the performance concerns found here. Reading the gist @breedx-splk, I find myself with a few more questions. Timestamp: is relevant, just not under debate, is that correct? Otherwise, is there a way to order events created by a particular resource?
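The "schema once per stream, variables once per event" idea from earlier in this comment can be sketched roughly as follows. This is a simplified row-oriented stand-in, not the OTel-Arrow wire format. The type names, the `browser.page_view` event name, and the field names are assumptions made up for the example.

```typescript
// Illustrative only: a row-oriented stand-in for the idea, not the OTel-Arrow wire format.
type Scalar = string | number | boolean | null;
type FlatEvent = { [field: string]: Scalar };

interface EventStream {
  eventName: string;
  fields: string[]; // schema: sent once per stream
  rows: Scalar[][]; // values only: sent once per event
}

// Split events into a one-time schema and per-event rows of values.
function toStream(eventName: string, fields: string[], events: FlatEvent[]): EventStream {
  const rows = events.map((item) =>
    fields.map((field) => {
      const value = item[field];
      return value === undefined ? null : value;
    })
  );
  return { eventName, fields, rows };
}

// Reconstitute the fully-expanded events at the receiving end.
function fromStream(stream: EventStream): FlatEvent[] {
  return stream.rows.map((row) => {
    const item: FlatEvent = {};
    stream.fields.forEach((field, i) => {
      const value = row[i];
      item[field] = value === undefined ? null : value;
    });
    return item;
  });
}

// The same two events, once with namespaced field names and once with terse ones.
const namespaced = toStream(
  "browser.page_view",
  ["browser.page.url", "browser.page.load_time_ms"],
  [
    { "browser.page.url": "https://example.com/a", "browser.page.load_time_ms": 120 },
    { "browser.page.url": "https://example.com/b", "browser.page.load_time_ms": 95 },
  ]
);
const terse = toStream("browser.page_view", ["url", "load_time_ms"], [
  { url: "https://example.com/a", load_time_ms: 120 },
  { url: "https://example.com/b", load_time_ms: 95 },
]);

// The per-event rows are identical; only the one-time schema differs in size.
const sameRowBytes = JSON.stringify(namespaced.rows) === JSON.stringify(terse.rows);
```

Under this kind of encoding the length of a field name is paid once per stream rather than once per event, which is why the naming question and the wire-size question can be treated separately.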
Session ID: I haven't forgotten open-telemetry/opentelemetry-specification#2500 and I still want to see us store session-id as a non-identifying resource attribute. Again, this is something we could accomplish with more schema information. The present ("weak schema") definition for resource prevents us from adding ephemeral and other descriptive information with cross-cutting applications, and I want to see us fix that. Option 6: speaking from a past life, I could at least imagine more options than those enumerated in the gist linked above. For option six, I propose to add a couple of things to the OTel log record:
|
100% agree. And for clients there are many additional concerns which are not currently addressed, and unfortunately something like Apache-Arrow (at this point in time) is a non-starter at many different levels which include
There are also some good words and some concerns in your response as well.
If this means that for every "request" getting sent from a client, then this is problematic and depends on "how" the schema is encoded on the client. Ideally, it's not included and instead it's either "inferred" (by a name) or just referenced url.
Encoding is also problematic, as the "time" required to serialize into said encoding is one of those critical performance concerns for client environments. Ideally, this needs to use a native environment library (for browsers the most common is JSON.stringify()) as there is no faster way to convert an object hierarchy into a serialized form. Could Arrow be a solution (or part of one) -- maybe, but not today (from what the JS and Client SIGs have currently seen -- mostly because of that required "bundle" size) so the following comment and the result would be key to that
|
Thanks @MSNev, this is very useful. I have put together a quick comparison test here: https://github.com/tigrannajaryan/exp-jsevent It generates synthetic events based on the github.com stats you provided. It would be great to modify this to use a more realistic event composition to make sure it is a fair comparison. So far I have these results: In original, non-encoded form the flattened events are 85896-32550 = 53346 bytes larger (85896/32550=2.64 times larger) than in nested form (both JSON encoded). In dictionary-encoded form the flattened events are 23389-21426 = 1963 bytes larger (23389/21426=1.09 times larger) than in nested form (both JSON encoded). Dictionary encoding time: about 1.7ms. I am probably the worst JS coder in the world. It would be great if someone who knows what they are doing would take this and turn it into a proper benchmark (and verify that I don't have measurement bugs and the comparison is fair). The question I have: is 53KB (or so) extra RAM, 1.7ms extra encoding time and 9% larger JSON payload (before any gzip compression) a dealbreaker? |
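For readers unfamiliar with the idea, a key-dictionary encoding of the kind compared above can be sketched as follows. This is a minimal illustration written for this thread, not the implementation in the linked repository. The `DictionaryBatch` shape, the function names, and the sample events are assumptions.

```typescript
// Illustrative key-dictionary encoding; not the scheme used in the linked benchmark.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

interface DictionaryBatch {
  keys: string[]; // every distinct key, sent once per batch
  events: Json[]; // events with each key replaced by its index in `keys`
}

// Recursively replace object keys with their dictionary index, growing the dictionary as needed.
function encodeValue(value: Json, keys: string[], index: { [key: string]: number }): Json {
  if (Array.isArray(value)) {
    return value.map((item) => encodeValue(item, keys, index));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      const child = value[key];
      if (child === undefined) continue;
      let id = Object.prototype.hasOwnProperty.call(index, key) ? index[key] : undefined;
      if (id === undefined) {
        id = keys.length;
        index[key] = id;
        keys.push(key);
      }
      out[String(id)] = encodeValue(child, keys, index);
    }
    return out;
  }
  return value;
}

// Recursively map dictionary indexes back to the original keys.
function decodeValue(value: Json, keys: string[]): Json {
  if (Array.isArray(value)) {
    return value.map((item) => decodeValue(item, keys));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const id of Object.keys(value)) {
      const key = keys[Number(id)];
      const child = value[id];
      if (key !== undefined && child !== undefined) out[key] = decodeValue(child, keys);
    }
    return out;
  }
  return value;
}

function encodeBatch(events: Json[]): DictionaryBatch {
  const keys: string[] = [];
  const index: { [key: string]: number } = {};
  return { keys, events: events.map((item) => encodeValue(item, keys, index)) };
}

function decodeBatch(batch: DictionaryBatch): Json[] {
  return batch.events.map((item) => decodeValue(item, batch.keys));
}

// Ten synthetic events that share the same (long) key names.
const batch: Json[] = [];
for (let i = 0; i < 10; i++) {
  batch.push({ name: "click", attributes: { "browser.page.url": "https://example.com/", x: i } });
}
const encoded = encodeBatch(batch);
const plainSize = JSON.stringify(batch).length;
const encodedSize = JSON.stringify(encoded).length;
```

Because each key string is written once per batch, long namespaced names cost little more than short ones once a batch holds more than a handful of events. That is consistent with the gap narrowing from 2.64x to 1.09x in the numbers above.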
@tigrannajaryan where did you run this test? Is it on your laptop? Please see the article on how slow some CPUs are in the mobiles of users around the world. https://calendar.perfplanet.com/2023/wikipedias-worldwide-web-cpu-benchmark/ |
@scheler it would be great if you can help produce benchmarks on relevant devices (and with more realistic payloads, mine is probably not a good approximation of real world). It will make the performance argument stronger. |
@MSNev Currently, the protocol based on Apache Arrow is not optimal for browser environments due to the current size of the Apache Arrow JS library and the specific lifecycle of web pages, which makes it hard to offset the initial cost of the Arrow schema. In the future, these two issues could be addressed by creating a variant of this protocol for constrained environments like browsers. However, this is a separate project that will take time to develop, so I won't focus on it right now. As @jmacd highlighted, the proposed Application Telemetry Schema could address various issues discussed in this thread.
I believe this approach meets performance constraints while preserving semantic conventions and the need to define fields/attributes that are easily interpretable and rich in metadata. |
Correct. I just wanted it off the table while we discuss the other bits. |
This was our (OT Sig's) observation as well from looking at the 1000ft level with the details we had, so it's good to see that we didn't miss anything.
Yes, I agree, the definition of "events" (especially application custom events), has a very high degree of correlation with what we are trying to do in the Event WG
Do we "need" a schema, we had been going down the initial path of not "requiring" one. |
Yes, events without a strict schema are useful as they are. But to further optimize this scenario, and to facilitate interoperability between systems, I think that yes, a schema is useful and not necessarily constraining. This is essentially what has made protobuf or thrift successful at the backend level, and it is also the direction that REST interfaces have taken with OpenAPI (and TypeSpec). |
@MSNev about the use of "performance arguments", I want to try and maintain the following balance:
To say that Arrow is a non-starter, to me, breaks this second principle. No one said you should use an Arrow library to accomplish this. The Arrow data specification combined with OTel-Arrow definitions is what you will use; I expect you to compile a custom event interface that is aware of your schema; I expect you to use WASM, and just as much code as necessary. Also, I won't be surprised if Arrow interfaces become standardized for browser environments. If 1000 bytes is important to you, I expect you have the resources to find a solution to this problem. |
One more thing regarding the Schema files and Body field. The Schema files currently allow describing the changes to the Attributes field, but not to the Body field. If we put any data in the Body field the schema of which we control, we either 1) will have no ability to describe changes that happen to such data or 2) will need to add the ability to describe changes in the Body field to the Schema File. The latter is an additional complication and likely will be more complex than how Schema files describe Attributes (because in the Body field we will allow nesting of fields, which creates more variations of data changes). |
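A rough sketch of why the second option is more involved: renaming a flat attribute needs only an old-name to new-name map, while renaming a nested Body field needs a path. `renameAttributes` below mirrors the flat attribute-map idea. `renameBodyField` is a hypothetical path-based counterpart that does not exist in Schema files today. The types and sample data are assumptions for illustration.

```typescript
// Illustrative only. `renameAttributes` mirrors the flat attribute-map idea of a
// Schema file rename; `renameBodyField` is a hypothetical path-based equivalent.
type AnyValue = string | number | boolean | AnyValue[] | { [key: string]: AnyValue };
type AttributeMap = { [oldName: string]: string };

// Flat case: a single lookup per attribute name.
function renameAttributes(attributes: { [key: string]: AnyValue }, map: AttributeMap) {
  const out: { [key: string]: AnyValue } = {};
  for (const key of Object.keys(attributes)) {
    const value = attributes[key];
    if (value === undefined) continue;
    const renamed = Object.prototype.hasOwnProperty.call(map, key) ? map[key] : undefined;
    out[renamed === undefined ? key : renamed] = value;
  }
  return out;
}

// Nested case: the change has to say where in the Body the field lives.
function renameBodyField(body: AnyValue, path: string[], newName: string): AnyValue {
  const head = path[0];
  if (head === undefined || Array.isArray(body) || typeof body !== "object") {
    return body; // nothing to rename at this level
  }
  const out: { [key: string]: AnyValue } = {};
  for (const key of Object.keys(body)) {
    const value = body[key];
    if (value === undefined) continue;
    if (key !== head) {
      out[key] = value;
    } else if (path.length === 1) {
      out[newName] = value;
    } else {
      out[key] = renameBodyField(value, path.slice(1), newName);
    }
  }
  return out;
}

const renamedFlat = renameAttributes(
  { "http.url": "https://example.com/" },
  { "http.url": "url.full" }
);
const renamedBody = renameBodyField(
  { page: { href: "https://example.com/" } },
  ["page", "href"],
  "url"
);
```

Even this sketch leaves open how arrays inside the Body, moved fields, or type changes would be described, which is the kind of extra variation the comment above refers to.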
Understand, I think what we are thinking about in terms of schema (specifically along the lines of an application schema), is more akin to the way that Cloud Events describes it for the definition of the payload. |
@scheler I have rerun the test on the slowest phone that is available to me (Oneplus, 2017 model year) and I get around 3ms of encoding time for that synthetic 121 event batch that I modelled after @MSNev's example from github.com. Updated benchmarks, with comparison to JSON.stringify, standard OTLP JSON and a new encoding that uses dictionaries: https://github.com/tigrannajaryan/exp-jsevent/blob/main/README.md Anyone who has access to slower devices feel free to rerun the benchmark. So far the numbers I am seeing look acceptable to me. |
@tigrannajaryan thanks for creating the benchmark tests. I checked that the Oneplus 2017 is about twice as fast as the Moto G5 that's used as a reference in the Wikipedia CPU benchmark tests, and noting that in South Africa 25% of users have devices slower than the Moto G5, we can deduce that this encoding could take ~8-10ms on slower devices, compared to 1ms on faster devices. Martin also pointed out that this is "additional time", on top of the time the original apps already spend in batching, serializing and sending the content out. People have performance budgets for the whole app and so numbers matter when choosing additional libraries to integrate into their app. It's interesting you included JSON as-is in your benchmarking tests - it's only slightly larger than "OTLP with key dictionary" when used in the form "Nested, in Body, no namespaces" and also about half the size of OTLP JSON. The payloads will be a lot smaller if we use a key dictionary on top of JSON as-is. I guess it makes a case to support JSON as-is in OTLP irrespective of any further encoding. This will be helpful for cases where protobuf is not used, e.g., pages that automatically redirect to others give you very little time to do anything. I noticed this was discussed in the Event SIG last week - will this be a v2 of OTLP JSON or will this have to be a transport encoding on top of OTLP JSON? |
@scheler yes, I intentionally included it as it is likely the lower bound of what is possible from an encoding duration perspective since JSON.stringify() is just a single call to a native function. The difficulty with as-is encoding is that it essentially fully couples the in-memory representation of events in the SDK with the wire protocol and, once published, makes it difficult/impossible to refactor the SDK's in-memory data models since that will break the wire format. It may be acceptable for hyper-optimized Otel implementations but likely is a no-go for a generally useful Otel SDK and protocol that needs to evolve over time.
I suspect "as-is with dictionary" may have roughly the same encoding execution time as the "OTLP with key dictionary". I may be wrong, it is worth experimenting with. I posted the benchmark source codes so that anyone is free to play with it and try adding new formats for comparison.
I am not sure I understand this part. Can you elaborate? |
For short-lived pages - that immediately redirect to another page or pages that have a high chance of users navigating away immediately - we can deliver a lighter version of javascript bundle that includes just one instrumentation and a minimal SDK with only Events support, no batching, no protobuf, to fire off a pageview event that can be exported immediately. Since it's just one event, OTLP JSON should be okay if JSON as-is is not possible for the reasons you outlined. |
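As a rough sketch of that lighter path, the snippet below builds one OTLP JSON logs request for a single event and hands it to a sender right away, with no batching and no protobuf. The event name `page_view`, the body field names, and the helper names are hypothetical; the `send` callback is injected so the sketch also runs outside a browser.

```javascript
// Build a single OTLP JSON logs request carrying one event.
// "page_view" and the body field names are hypothetical.
function buildPageViewRequest(url, referrer, nowMs = Date.now()) {
  return {
    resourceLogs: [{
      scopeLogs: [{
        logRecords: [{
          // 64-bit integers are written as decimal strings in OTLP JSON.
          timeUnixNano: (BigInt(nowMs) * 1000000n).toString(),
          attributes: [{ key: "event.name", value: { stringValue: "page_view" } }],
          body: {
            kvlistValue: {
              values: [
                { key: "url", value: { stringValue: url } },
                { key: "referrer", value: { stringValue: referrer } },
              ],
            },
          },
        }],
      }],
    }],
  };
}

// Export immediately: serialize once and pass the result to `send`.
// `send(endpoint, json)` is an assumed callback, not an OpenTelemetry API.
function exportImmediately(request, endpoint, send) {
  return send(endpoint, JSON.stringify(request));
}

// Example with a stand-in sender that only records what would be sent.
const sent = [];
exportImmediately(
  buildPageViewRequest("https://example.com/landing", "https://example.com/"),
  "/v1/logs",
  (endpoint, json) => sent.push({ endpoint, json })
);
```

In a page, `send` could wrap `navigator.sendBeacon(endpoint, new Blob([json], { type: "application/json" }))`, which is designed for requests that must survive navigation; whether the collector endpoint accepts such requests (CORS, content type) is something to verify per deployment.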
A few more thoughts on event data as attributes or body:
There will be three kinds of events: OTel-defined events, 3rd-party events, and custom events (added by the application via an API).
Consistency: regarding the litmus test of what data goes in event data vs attributes, IMO the only confusion about whether to put some data in the body or in attributes would arise with custom events. In that case, the end user has control and can decide where to put their data based on how it is used.
If we combine event data with other attributes and want that not to be confusing, should we then not have a concept of event data in the API at all? But wouldn't we still need it in the API for 3rd-party events?
I think the main question for me is how events will be processed and stored. Will there be use cases where the backend needs to distinguish between event data and attributes? If yes, and we do not separate them in the transport, then we make this use case impossible, at least for custom events (assuming that a schema could be used to extract event data attributes for OTel events, and that for 3rd-party events the data would be in the body). |
Both in today's 11/12/24 event SIG and the previous 12/8/23 meeting, we discussed the idea of starting with a set of principles and letting the design fall out of that. The idea is that designing without a guiding set of principles can end up feeling subjective, be hard to defend, and lack coherence. I propose the following design principles, ordered in terms of priority (high priority items first):
From these, the following design falls out. I'm being brief because the characteristics of these things have been discussed elsewhere. The numbers in parentheses indicate the number of the design principle that informs that decision.
|
Here is a counter-proposal, separated into two parts that I think can be decided separately.
Separation of event data and attributes
Design principles:
Proposal:
Naming of event data fields for OTel-defined events
Design principles:
Proposal:
|
I've not had time to formally enumerate my design principles, but they will be based on my original comment #505 (comment), which we can talk about in the SIG. |
I agree with a lot of the above! I'll offer up my approach: Design principles:
Rather than make a proposal yet (wherein my $0.02 adds a circle to the growing Venn diagram), I want to offer some pushback on / alternatives to the ideas above.
|
We will open new issues related to the outstanding issues here as the original question is now answered. |
What's the resolution? |
The answer to the question posed is No: the payload fields should not be part of the global semantic registry. We are moving forward to define how the payload fields will be defined. |
Part of what is discussed here is whether the event payload should be represented in the log body or as a number of individual attributes. This has already been addressed in #566 by specifying that the event payload should be in the log body. Modeling the event payload is currently being worked on (#755 as @MSNev pointed out). I am closing this issue since it appears the discussion here has been exhausted, and more specific discussions are now happening in other issues. But if anyone wants to continue this conversation here, please feel free to re-open this issue. |
This issue is intended as a discussion about the approach of defining event fields.
Background
Events are a new concept in OpenTelemetry that is in the process of being defined. An event will consist of a name and a body (payload). The name will be represented as the event.name attribute. The body payload will typically consist of a collection of fields (name/value pairs). An event may also have additional attributes that are separate from the body.
The discussion about field names in this issue refers to fields that are part of the event body, not the additional attributes. Additional attributes will follow the same rules as attributes on spans.
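To make the terminology concrete, here is a minimal sketch of that shape as a plain object. `createEvent` is a hypothetical helper, not an OpenTelemetry API, and the event name, body field, and additional attribute are example values only.

```javascript
// Minimal in-memory model of an event as described above.
// createEvent is a hypothetical helper, not an OpenTelemetry API.
function createEvent(name, body, attributes = {}) {
  return {
    // The event name travels as the event.name attribute.
    attributes: { ...attributes, "event.name": name },
    // Body fields are kept apart from the additional attributes.
    body: { ...body },
  };
}

const event = createEvent(
  "page_view",                      // hypothetical event name
  { url: "https://example.com/" },  // body field: the subject of this issue
  { "session.id": "abc123" }        // additional attribute, named like a span attribute
);

console.log(event.attributes["event.name"], Object.keys(event.body));
```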
Discussion
Should fields in the event body be defined in the global attributes registry?
Here is an example event representing a browser page load (from this PR). The field names are not namespaced and their definition would be only in the scope of this specific event.
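The example from the PR is not reproduced here. As a stand-in, the sketch below shows what a non-namespaced body could look like; the event name and every field name in it are hypothetical and are not taken from the PR.

```javascript
// Hypothetical page-load event. The field names are short and not
// namespaced; their meaning is defined only within this one event.
const pageViewEvent = {
  name: "browser.page_view", // hypothetical event name
  body: {
    url: "https://example.com/products?id=42",
    referrer: "https://example.com/",
    title: "Products",
  },
};

// None of the body field names carries a namespace prefix.
const hasNamespacedField = Object.keys(pageViewEvent.body).some((key) => key.includes("."));
console.log(hasNamespacedField);
```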
Pros:
Cons:
Fields defined in the attribute registry
If the fields for the same event were defined in the registry, they could hypothetically look like this:
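The original example is not reproduced here. The sketch below uses only the two field names mentioned in this issue, `url.full` and `browser.document.referrer`; the event name and the values are hypothetical.

```javascript
// Hypothetical page-load event whose body fields come from the
// global attribute registry, so every name is namespaced.
const registryPageViewEvent = {
  name: "browser.page_view", // hypothetical event name
  body: {
    "url.full": "https://example.com/products?id=42",
    "browser.document.referrer": "https://example.com/",
  },
};

// Every body field name carries a namespace prefix.
const allNamespaced = Object.keys(registryPageViewEvent.body).every((key) => key.includes("."));
console.log(allNamespaced);
```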
Note that url.full is an existing attribute that is re-used in this event. However, browser.document.referrer also contains a full URL, but must have a different name here to distinguish it from the page URL and also to convey the meaning of the field.
Pros:
Cons:
Hybrid
There is a third option where an event payload could contain both fields defined only for the specific event and attributes from the global registry.
Event fields defined externally
There might be cases when it's better to not define the content of a payload in OpenTelemetry semantic conventions, but instead link to an external definition. An example of this might be the data from the W3C ResourceTiming API. In this case, if the field definitions were duplicated in the attributes registry, they may get out of sync with the external definition.
If fields defined externally are used in the event body, they would not be namespaced.
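As an illustration of the externally defined case, the sketch below copies a few properties of a resource timing entry into an event body and keeps the W3C property names unchanged. The selection of fields and the helper name are assumptions; the property names themselves (`name`, `initiatorType`, `startTime`, `duration`, `transferSize`) come from the Performance Timeline and Resource Timing APIs.

```javascript
// Fields copied verbatim from a PerformanceResourceTiming entry.
// Which fields to include is an arbitrary choice for this sketch.
const RESOURCE_TIMING_FIELDS = ["name", "initiatorType", "startTime", "duration", "transferSize"];

// Build an event body that keeps the external (W3C) field names as-is,
// with no OpenTelemetry namespace prefix. Hypothetical helper.
function resourceTimingBody(entry) {
  const body = {};
  for (const field of RESOURCE_TIMING_FIELDS) {
    if (entry[field] !== undefined) {
      body[field] = entry[field];
    }
  }
  return body;
}

// Stand-in entry so the sketch runs outside a browser.
const sampleEntry = {
  name: "https://example.com/app.js",
  initiatorType: "script",
  startTime: 12.5,
  duration: 80.25,
  transferSize: 20480,
  entryType: "resource", // present on real entries, not copied here
};

console.log(resourceTimingBody(sampleEntry));
```

In a browser, the entries would come from `performance.getEntriesByType("resource")`, and the same function could be mapped over them.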