-
Notifications
You must be signed in to change notification settings - Fork 843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Log SDK to implement OTEP-0150 #3759
Conversation
I'm currently working to align this with OTEl 0150. No need to review until that's done. |
Done. I'm updating the PR title and description accordingly. |
sdk/logs/src/main/java/io/opentelemetry/sdk/logs/LogEmitter.java
Outdated
Show resolved
Hide resolved
sdk/logs/src/main/java/io/opentelemetry/sdk/logs/data/LogData.java
Outdated
Show resolved
Hide resolved
* href="https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model.md">OpenTelemetry | ||
* Log Data Model</a>. | ||
*/ | ||
public interface LogRecord { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See definition in the otep.
The data model states that a log record has a Resource
field, but I think this is a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to differentiate the logical data model which the data model is about and representations of that logical data model in actual data structures, which this Java code is about (or for example what OTLP protocol does).
From data model perspective each log record is associated with a Resource. It does not mean that in actual representations of the data model each log record contains a copy of the Resource.
We should probably make this clear in the data model document to avoid confusion.
Which also leads to Instrumentation Library being missing from the data model completely. I think it needs to be added as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Do you see any problems with the way its organized in this PR? Here's an example of the usage:
// 1. Setup a log emitter provider, adding a resource and zero to many log processors
SdkLogEmitterProvider logEmitterProvider = SdkLogEmitterProvider.builder()
.setResource(Resource.getDefault())
.addLogProcessor(SimpleLogProcessor.create(InMemoryLogExporter.create()))
.build();
// 2. Obtain a log emitter from the provider, with instrumentation library details
LogEmitter logEmitter = logEmitterProvider.get("my-instrumentation-library", "1.0.0");
// 3. Build a log record and emit it via the emitter
logEmitter.emit(LogRecord.builder()
.setEpochMillis(System.currentTimeMillis())
.setBody("my log message")
.setAttributes(Attributes.builder().put("key", "value").build())
.build());
// 4. The log processor will be called with a LogData, which is the LogRecord passed to the emitter
// plus the resource and instrumentation library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this matches the intent of OTEP0150. I don't see see any issues. Thanks for working on this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a comment above too, hoping we can settle on just one type using the same model as tracer
logEmitter.logBuilder()
.setEpochMillis()
.setBody()
.emit()
And remove what's currently called LogRecord
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting idea. I would want the log SIG to have bought off on this model before we adopted it. Is there a use case for someone bringing their own implementation of LogRecord
and us still supporting it being sent through the pipeline? I do like limiting the API surface, but I want to make sure we're not cutting some use-case off at the knees with this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell from reading the otep, LogRecord is not an interface in the spec but just a bag of data, so I suspect custom LogRecord implementations aren't expected. But would be good to confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python logging SDK prototype went ahead with an actual LogRecord POD structure. AFAIK, the builder pattern is not very common in Python world. So, perhaps the OTEP should be rephrased to allow for some variations here depending on what's idiomatic in a particular language.
I do not see any particular arguments in favour for directly exposing the LogRecord in the API as a re-implementable interface in particular, if it can be substituted by a more idiomatic pattern that also reduces the API surface.
@lonewolf3739 what are your thoughts on this? Did LogRecord feel natural in Python implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tigrannajaryan There was nothing odd about the LogRecord and it is mostly similar to LogRecord
from python standard library. In Python SDK we just do one-to-one mapping from the stdlib log record to OTEP model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Submitted an amendment to OTEP: open-telemetry/oteps#183
|
||
@AutoValue | ||
@Immutable | ||
public abstract class ReadableLogData implements LogData { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not clear to whether we should separate interfaces from implementations for these data classes.
SpanData
is an interface. MetricData
is an abstract class with an autovalue implementation. LogData
could be either.
We should talk about this and be consistent!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there's a PR to make MetricData
into an interface. I believe that the instrumentation agent prefers working with interfaces, if possible.
I'm more concerned with having two things called "ReadableXXX" in here, though. I will never be able to remember which represents which thing. Rather than focussing on the exact names of things from the OTEP, I think we should try to line things up as much as we can internally in our SDK.
I don't know that we even need this "readable" concept with logs, since a given log record is immutable by nature. In the Span world, we had mutability in the core model of a Span, so we needed to be able to separate the "readable" portions from the "writable" portions (eg. in the SpanProcessor, in onEnd()
you no longer can do any mutations, so you don't get the writable part of the interface provided to you). In effect, everything about the log record is "readable" since it's immutable, and nothing is "writable", so it seems like a naming that will only be confusing in the long run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, yes, we don't have "ReadableMetricData" for the same reason. I think we shouldn't introduce the concept of "readability" when everything is immutable. Thoughts, @jack-berg ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Sorry - didn't see the first comment you made until now.
More than anything, I'm interested in consistency. The recent change to make MetricData
an interface added MetricDataImpl
as an implementation. Its package private, so the name is an implementation detail, but there's also the competing pattern of using the Sdk
prefix for this type of thing instead of the Impl
suffix. MetricDataImpl
could just as well be SdkMetricData
. There's also the pattern of an Immutable
prefix (see ImmutableLinkData
, ImmutableEventData
, etc).
My preference is the Sdk
prefix. Readable
and Immutable
prefixes are redundant like you said, and the Impl
suffix is redundant because of course it's an implementation.
I just pushed up a change to rename the ReadableLog*
classes to SdkLog*
, and also make them package private so that their name is an implementation detail. The public LogData
and LogRecord
interfaces now have static methods to create the SdkLog*
instances.
Codecov Report
@@ Coverage Diff @@
## main #3759 +/- ##
============================================
+ Coverage 89.30% 89.32% +0.02%
- Complexity 3975 4014 +39
============================================
Files 474 481 +7
Lines 12324 12424 +100
Branches 1201 1205 +4
============================================
+ Hits 11006 11098 +92
- Misses 908 918 +10
+ Partials 410 408 -2
Continue to review full report at Codecov.
|
This PR is ready for review. The log SDK now reflects OTELP-0150, and has patterns that closely mirrors the trace SDK. If we could get these changes into version 1.8.0, we could start to have some appenders contributed to |
import javax.annotation.Nullable; | ||
|
||
/** Provides instances of {@link LogEmitter}. */ | ||
public final class SdkLogEmitterProvider implements Closeable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correctly, this would be the class that we expose on our OpenTelemetrySdk instance, once logs have stabilized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. And if there was a log API, we might expose LogEmitterProvider
where SdkLogEmitterProvider
is just an implementation detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And on second thought, why wait to add the LogEmitterProvider
interface. Better to have symmetry now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we don't want to introduce an API for logging, so I don't know that this would have a lot of value right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And thinking about it further, the LogEmitterProvider
interface wouldn't live in the SDK module. So even if we do want to introduce an API which includes LogEmitterProvider
, it would have to move. Best to wait API requirements, if any, to develop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are no plans for an OTel logging API for java. There are already too many logging APIs for Java, so there's no reason to muddle things further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've talked about having two artifacts in some of the Monday Java meetings I think, they both could live in "SDK-land" but in the instrumentation discussions we found it will be important to have an artifact that is used by appenders independent from the SDK, to follow the "instrumentation doesn't depend on the SDK" pattern. Notably, it means that the logging SDK could still be initialized in the agent class loader and not the app one.
I think we will need this - the main difference with the other APIs is that we wouldn't need a no-op implementation of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I buy that. Do we want to hash this out in a followup PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created this issue to track it 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks solid to me. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left some nits and non-blocking comments as I understand the API is still in flux and can be improved possibly at the spec level. Though if possible would recommend making the API change related to timestamp here now, it's related to Java (or more precisely our codebase's) idiom and shouldn't exactly match the methods named in the spec.
sdk/logs/src/main/java/io/opentelemetry/sdk/logs/LogEmitterProvider.java
Outdated
Show resolved
Hide resolved
sdk/logs/src/main/java/io/opentelemetry/sdk/logs/LogEmitterProvider.java
Outdated
Show resolved
Hide resolved
* | ||
* @param logRecord the log record | ||
*/ | ||
void emit(LogRecord logRecord); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this could instead be logBuilder()
emitter.logBuilder().setMessage("broke").emit()
The consistency with the Tracer API would be good, we wouldn't need two public interfaces, one with the resource and one without, and the usability doesn't seem any worse since otherwise the user calls LogRecord.builder()
anyways otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may totally non-concern since I don't know how the internals work but just wanted to double check that the chained calls are no slower than the single emit call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this pattern and had something similar in a local prototype. If it meets the spirit of the OTEP, I prefer it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's definitely no performance difference with the current approach where a user instead chains through LogRecord.builder()
. Perhaps there could be a tiny difference with a huge factory method like LogRecord.create(...)
, but since Java doesn't support keyword arguments we would never use that for usability concerns.
sdk/logs/src/main/java/io/opentelemetry/sdk/logs/LogEmitterSharedState.java
Outdated
Show resolved
Hide resolved
sdk/logs/src/main/java/io/opentelemetry/sdk/logs/LogProcessor.java
Outdated
Show resolved
Hide resolved
sdk/logs/src/main/java/io/opentelemetry/sdk/logs/data/LogRecordBuilder.java
Outdated
Show resolved
Hide resolved
sdk/logs/src/main/java/io/opentelemetry/sdk/logs/data/LogRecordBuilder.java
Outdated
Show resolved
Hide resolved
return this; | ||
} | ||
/** Set the flags. */ | ||
LogRecordBuilder setFlags(int flags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably need to improve this to make it more semantic / usable than just an opaque int flags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Mind if I do that in a followup PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely should be a follow-up. Do we even know what sort of flags need to go in here? Could we remove it from our SDK until it's clear what the semantics of these flags are?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created Issue #3804 to track a variety of questions with log fields.
return this; | ||
} | ||
/** Set the name. */ | ||
LogRecordBuilder setName(String name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's a log name? I don't think I've ever seen a name associated with a log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is part of log data model: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model.md#field-name
I have been recently reviewing the data model critically and the Name field is one of the fields that I have doubts about. Opened an issue to discuss here: open-telemetry/opentelemetry-specification#2074
If it gets removed from the data model we will also need to remove from implementations. Shouldn't be difficult to delete here, most work is going to be in the Collector where it is used by a few components.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a comment on that spec issue. TL;DR I like the log name field 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commented on your comment... I'd prefer it were called type
as I think that's how NR and RUM would use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created Issue #3804 to track a variety of questions with log fields.
return this; | ||
} | ||
/** Set the body. */ | ||
LogRecordBuilder setBody(Body body); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we already have the issue out for it, does it make sense to remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know @zenmoto has very strong feelings about this. I would love to just make our body be a String and nothing else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created Issue #3804 to track a variety of questions with log fields.
@jack-berg and reviewers: thank you for working on this. If you find any inconsistencies in OTEP0150 or think anything in the OTEP needs to be changed please do provide that feedback. We want to learn from this implementation as much as possible. |
import javax.annotation.Nullable; | ||
|
||
/** Provides instances of {@link LogEmitter}. */ | ||
public final class SdkLogEmitterProvider implements Closeable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've talked about having two artifacts in some of the Monday Java meetings I think, they both could live in "SDK-land" but in the instrumentation discussions we found it will be important to have an artifact that is used by appenders independent from the SDK, to follow the "instrumentation doesn't depend on the SDK" pattern. Notably, it means that the logging SDK could still be initialized in the agent class loader and not the app one.
I think we will need this - the main difference with the other APIs is that we wouldn't need a no-op implementation of it.
needs a CHANGELOG rebase, then ready to merge, I think |
@jkwatson I just pushed a commit to switch to the |
no, that's fine. no worries. Just thought we should get this thing merged and tweak from there. |
…y-java into log-sdk-refactor
The discussion while implementing the Java SDK revealed that the LogRecord data type may not be necessary. See here open-telemetry/opentelemetry-java#3759 (comment) However, in some other languages LogRecord fits nicely (see the comment about Python implementation in the same comment thread). This changes allows implementators of the OTEP to choose the most idiomatic approach for their language.
The discussion while implementing the Java SDK revealed that the LogRecord data type may not be necessary. See here open-telemetry/opentelemetry-java#3759 (comment) However, in some other languages LogRecord fits nicely (see the comment about Python implementation in the same comment thread). This changes allows implementators of the OTEP to choose the most idiomatic approach for their language.
) The discussion while implementing the Java SDK revealed that the LogRecord data type may not be necessary. See here open-telemetry/opentelemetry-java#3759 (comment) However, in some other languages LogRecord fits nicely (see the comment about Python implementation in the same comment thread). This changes allows implementators of the OTEP to choose the most idiomatic approach for their language.
…pen-telemetry#183) The discussion while implementing the Java SDK revealed that the LogRecord data type may not be necessary. See here open-telemetry/opentelemetry-java#3759 (comment) However, in some other languages LogRecord fits nicely (see the comment about Python implementation in the same comment thread). This changes allows implementators of the OTEP to choose the most idiomatic approach for their language.
…pen-telemetry#183) The discussion while implementing the Java SDK revealed that the LogRecord data type may not be necessary. See here open-telemetry/opentelemetry-java#3759 (comment) However, in some other languages LogRecord fits nicely (see the comment about Python implementation in the same comment thread). This changes allows implementators of the OTEP to choose the most idiomatic approach for their language.
…pen-telemetry#183) The discussion while implementing the Java SDK revealed that the LogRecord data type may not be necessary. See here open-telemetry/opentelemetry-java#3759 (comment) However, in some other languages LogRecord fits nicely (see the comment about Python implementation in the same comment thread). This changes allows implementators of the OTEP to choose the most idiomatic approach for their language.
…pen-telemetry#183) The discussion while implementing the Java SDK revealed that the LogRecord data type may not be necessary. See here open-telemetry/opentelemetry-java#3759 (comment) However, in some other languages LogRecord fits nicely (see the comment about Python implementation in the same comment thread). This changes allows implementators of the OTEP to choose the most idiomatic approach for their language.
…pen-telemetry/oteps#183) The discussion while implementing the Java SDK revealed that the LogRecord data type may not be necessary. See here open-telemetry/opentelemetry-java#3759 (comment) However, in some other languages LogRecord fits nicely (see the comment about Python implementation in the same comment thread). This changes allows implementators of the OTEP to choose the most idiomatic approach for their language.
Refactor log sdk to implement OTEP-0150.
Also brings it much more in line with the patterns of traces and metrics.
Resolves Issue #3750.
I still need to write some tests for some of the components added.