Is Name field necessary in Log Data Model #2074
Comments
If logs do end up being the data type to represent events as well, then I would prefer having a name field as a differentiator. New Relic supports the MELT data types: Metrics, Events, Logs, and Traces. The characteristics of traces and metrics are distinct enough that they obviously require data types in OpenTelemetry. Logs and Events are much closer. In New Relic's log data model, logs don't require any fields, but generally have a timestamp, a message, and attributes. In New Relic's event data model, events have a required I've been working off the assumption that OpenTelemetry won't have an event data type, and that logs would cover both use cases. And so in order to allow our users to be able to use OpenTelemetry to ingest New Relic events, we could use the presence of the It's possible to differentiate between New Relic logs and New Relic events with an attribute, but I like the idea of it being a top level field in order to make it more obvious: an OpenTelemetry log with a name field is treated as a New Relic event; other logs are New Relic logs. |
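The rule proposed in the comment above can be sketched as follows. This is only an illustration, not New Relic's or OpenTelemetry's actual code: the `LogRecord` dataclass and the `classify_for_new_relic` function are hypothetical names, and the record is reduced to the few fields discussed in this thread.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class LogRecord:
    """Hypothetical, simplified stand-in for an OpenTelemetry log record."""
    body: Any = None
    name: Optional[str] = None  # the top-level Name field under discussion
    attributes: Dict[str, Any] = field(default_factory=dict)


def classify_for_new_relic(record: LogRecord) -> str:
    """Return "event" when the Name field is set, otherwise "log"."""
    # In this sketch, a non-empty name is the only differentiator.
    return "event" if record.name else "log"
```

With this rule a backend needs no attribute lookup, but any other producer that also fills in the name (Syslog is mentioned later in the thread) would be classified as an event too.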
The name would also be quite useful for modelling for RUM events, which look a lot like logs, but would benefit from having a |
I think Good call out on the RUM events. I haven't been following those discussions closely, but is the group leaning towards using logs to represent stuff? |
Yes, we are. Point-in-time events (which look pretty much identical to NR's events) are very commonly needed in RUM instrumentation, and logs seem to be a good match. The only thing that is missing is a notion of causality. In tracing, you can have span parenting, but in logs, you only have the loose association via attributes, and logs can't parent other logs (as they don't have ids). |
@jack-berg it would be very useful if you could add the mapping from New Relic to Otel Log Data to example mappings. It would help with data model discussions. We need more examples to be able to better generalize the data model to cover real life use cases.
Yes, definitely. This was discussed and accepted as Otel's stance on logs and events. |
I think we need to decide on 2 things:
Note that not having a |
Will do 👍 .
I agree that sounds a bit strange. But is there good reason for the Syslog
If name or type is going to be a top level field and be useful for New Relic's purposes, it should be defined as a low cardinality categorization of the log. I can imagine this type of concept would be useful for RUM events as well. |
On the |
I would think that would go into the Instrumentation Library 'name', rather than the individual log entry 'name'. |
I wonder if that is not too fine-grained though. Also, in certain cases, this includes more information, such as line number:
I think the candidates for EDIT: I forgot about the idea of having |
Events must have a mandatory name/type to make sense, so if we remove Name field we will be changing it from being mandatory to optional to now hidden. While we can make the Name field mandatory in the API (say:
I think, yes. @jack-berg could you confirm that the Name field can represent event type that maps to New Relic's eventType field? |
In the case of RUM event names, yes, since I believe RUM event names to be a low cardinality categorization. |
Yes, that's a separate issue. Let's not conflate with this one. |
This was discussed in the Log SIG today. It was agreed that the field should remain. A clarification that cardinality should be limited will be proposed to the data model by @jack-berg. This issue is expected to be closed once that language is added. |
Clarifies that log name field is a low cardinality event classifier. Resolves #2074.
@tigrannajaryan after further consideration with some folk at New Relic, I think we should re-open this issue. TL;DR is that I think we should actually remove I thought it would be useful to have name as a low cardinality event type. New Relic supports the notion of custom events, and the inclusion of
Looking at the example mappings in the appendix, Syslog sets name to be The other problem is that using the value of a field as a marker to backends seems to be inconsistent with conventions in this project. More often, OpenTelemetry uses the presence of an attribute key as a marker. Let's take the RUM events use case as an example. Would it be better to say that I think the attributes solution is stronger because it encodes information in a way that's easier to operate on: A backend can use the presence of an attribute called To support our New Relic custom event use case, we might introduce our own New Relic semantic convention that says if you include So in conclusion, I believe that both the RUM event use case and the New Relic custom event use case are better served with attributes. Because those were the use cases that motivated the retention of /cc @alanwest @martinkuba |
@jack-berg Custom events should be officially supported by Otel via "generic events" and so there should be an attribute name for this purpose that does not include vendor name. |
@jkwatson you were arguing in favour of keeping the Name field. Can you reply to #2074 (comment) |
@jack-berg's perspective makes sense to me. Moreover, I'd argue we haven't successfully established a clear argument that this should be a top-level field. The data model provides the following guidance on determining whether or not a top-level field is justified:
A case could be made that the first condition is satisfied, but I think the second condition is a stretch. To Tigran's original point, there are several different ways in which the field would be used, and in many cases it will not be used at all. Semantic conventions would appear to be a better fit here, allowing each use case its own expectations for the information that might have been placed in this field. Finally, if we're unsure whether to keep the field, I think we should remove it now. An additive change later is far preferable to supporting a field we no longer want to have. |
Can you elaborate a little more? My gut was to think this at first too, but I'm becoming less convinced. What's the difference if I were to inspect the value of the Also, I share the concern of others that the |
@alanwest the backend processing is different for logs and events and we need a simple way to distinguish between the two. If we are using an attribute for naming events, we could designate a specific attribute key, say "event_name", to identify the log_record as being an event. However, @jack-berg also talked about using the full event name as the attribute key - that makes it hard to distinguish events from logs. Note that having the name field will not help the above if Syslog uses this field. Since we are talking about removing the field entirely, why not repurpose it solely for use with events? Otherwise, we will have to designate another attribute 'record_type' to distinguish between logs and events. |
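The first option in the comment above, a single designated attribute key, can be sketched like this. The key `"event_name"` is the one floated in the comment and is not a defined semantic convention; `is_event` and `route` are hypothetical helper names.

```python
from typing import Any, Mapping

# Key suggested in the comment above; an assumption, not an agreed convention.
EVENT_NAME_KEY = "event_name"


def is_event(attributes: Mapping[str, Any]) -> bool:
    """Treat a record as an event when the designated key is present."""
    return EVENT_NAME_KEY in attributes


def route(attributes: Mapping[str, Any]) -> str:
    """Pick a (hypothetical) backend pipeline for a record's attributes."""
    return "events" if is_event(attributes) else "logs"
```

Because the check is on the presence of one fixed key, it stays cheap regardless of how many distinct event names exist. That is the property lost when the full event name is used as the attribute key.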
@scheler @jkwatson According to Otel spec there is no difference between logs and events. They are totally synonymous words for the exact same concept. It is fine if you want to have this distinction in your backend, but there should not be an expectation that they are going to be conceptually different from Otel perspective, with a specific top-level field allowing to distinguish logs from events. If this is what you expect to get from the If you need some |
@jkwatson Can you show some examples of "type of event"? Is this going to be a value from a finite enumeration defined in Otel? |
It will not be a finite enumeration, no. There will be some standard event names, but this is also something that users can create themselves. If the logs data model doesn't have a way to identify events vs normal logs, perhaps the logs are not the right way to model events. |
It's not clear to me how the |
@jkwatson Logs and Events are the same thing in OpenTelemetry spec. Please read this https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/overview.md#events-and-logs It seems like you have some other concept of "event" in your mind. It would be useful to explain what you want to model in more detail and clarify how it is different from the current definition of the Event/Log Record in Otel. |
There is a semantic difference between application logs and user initiated events in the RUM use case. Back ends/analysis tools need to treat them separately. They will need to be routed differently to different places. I don't know that the name field is useful for this (it probably isn't). But there needs to be some efficient way for that routing to occur and if there isn't, the RUM event will indeed need to be a new signal, which I believe no one wants at this point. |
@jkwatson can we record the RUM event name in a LogRecord attribute |
Also, if we want to discern between different types of records (such as events vs logs), perhaps those could be grouped at the InstrumentationLibraryLogs level? E.g. could one anticipate a specific InstrumentationLibrary.Name in case of RUM events? |
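A rough sketch of what routing at the InstrumentationLibraryLogs level might look like, assuming records arrive grouped by instrumentation library name. The library name `"hypothetical.rum.browser"` and the function `split_by_library` are made up for illustration; nothing in the thread defines them.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Assumption: the set of library names known to emit RUM events.
RUM_LIBRARY_NAMES = {"hypothetical.rum.browser"}


def split_by_library(
    groups: Iterable[Tuple[str, List[Any]]],
) -> Dict[str, List[Any]]:
    """Route whole groups of records based on their library name."""
    routed: Dict[str, List[Any]] = {"events": [], "logs": []}
    for library_name, records in groups:
        # One decision per group, rather than one per record.
        target = "events" if library_name in RUM_LIBRARY_NAMES else "logs"
        routed[target].extend(records)
    return routed
```

The decision is made once per group, so individual records need no marker. The trade-off is that the backend has to know in advance which library names emit events.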
@tigrannajaryan the above section from the otel spec is questionable. Are we sure the same resource cannot emit both logs and events? This sounds restrictive. We are talking only of events in the context of RUM currently but why exclude the possibility of logs in future?
This would work. If we are alluding to using an attribute to distinguish between events and logs, then the next thing to discuss would be whether to use a fixed attribute key for all events (rum.event_name) or use a broad event category for the attribute key (rum.browser.event). |
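To make the two options concrete, here is a small sketch of how a backend might extract the event name under each. The keys `rum.event_name` and `rum.browser.event` come from the comment above. The assumption that a category key starts with `rum.` and ends with `.event` is mine, as are both function names.

```python
from typing import Any, Mapping, Optional


def event_name_fixed_key(attributes: Mapping[str, Any]) -> Optional[Any]:
    """Option 1: one fixed key for all events; a plain lookup suffices."""
    return attributes.get("rum.event_name")


def event_name_category_key(attributes: Mapping[str, Any]) -> Optional[Any]:
    """Option 2: the key encodes a category, e.g. "rum.browser.event".

    Assumes (hypothetically) that category keys start with "rum." and
    end with ".event", so every key has to be scanned.
    """
    for key, value in attributes.items():
        if key.startswith("rum.") and key.endswith(".event"):
            return value
    return None
```

The fixed key allows a constant-time lookup. The category key carries more information in the key itself, but requires a pattern match over all attributes.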
To summarize extensive discussion at the spec SIG meeting today:
|
@jkwatson Is there any update on this from the client telemetry group? |
We're still discussing it. Things got a bit slowed down by the holidays, but we are working on it, leaning toward using zero-duration spans currently. |
@jkwatson - we discussed this at the Log SIG today and there doesn't seem to be a strong reason to keep the field:
|
I sincerely apologize for being late to comment on this issue, and I rarely comment on forums. But I feel compelled to complain about this, futile as it might be. There are really good reasons NOT to make this change:
Please reconsider this change. |
@jjharr it is way too late to reconsider removal of the Name field. It is done for all intents and purposes, removed from the specification, deprecated in the protocol definition, deleted from the implementation in the Collector. We can consider adding the Name field back. I am open to discussing it. It will be a backward compatible, additive change that we can consider. Since you feel strongly about it I suggest that you open a new issue in this repository and include the justification that you posted in your comment (and any other additional justifications that you can think of) and we can take it from there. It would be great if you could also join Otel Spec SIG meetings and Otel Log SIG meetings to advocate for this. |
I'd like to echo @jjharr's comment. This issue just came up for me as well since we have implemented a new API that customers are using. Our API is based off of the OpenTelemetry log data model and we've placed a significant focus on the Name field. There are two arguments made in this discussion that I would like to challenge:
I understand that OpenTelemetry is in its early stages but we really need to consider backwards compatibility. Once you have others working off of this specification a change like this IS high risk. This is a breaking change for us.
I completely agree that it is important for existing log formats to map well into the log data model. This excludes the category of new applications that can be written with better logging practices in mind. I strongly believe the Name field should be recommended for use by new logging APIs and new applications because of the low cardinality aspect. Just because other log formats don't have a "Name" equivalent doesn't mean it should be dropped from the log data model. It can just be Optional and left out of example mappings. |
@tigrannajaryan Thanks for the guidance. I'll open a new issue, and I will also try to start attending those meetings. |
@jjharr when you open the issue please post the link in this thread too. @epsteina16 please also add your supporting comments to the issue once it is opened. |
Related discussion in issue 2398 |
The log data model currently has a Name field.
It is not entirely clear why this field is needed.
Some of the example mappings provided use it this way:
In most of these cases it appears that it could be placed in Attributes. The unclear one is CloudTrail Log Event and probably Google Cloud Logging.
Can we come up with clear justifications of why the Name field must exist and why for example Body is not enough?