Resolve lingering question of LogRecord mutability in LogRecordProcessor #2955
Comments
Option 4 is to do something similar to the Collector, where data is exclusively owned by one processor at a time: https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor#data-ownership In this case there is no need for locks, and processors can freely mutate the data while they own it.
Thanks for sharing that link. It looks like the collector relies on chaining to define the ownership boundaries. I.e. a processor is responsible for calling the next processor in the chain, and owns the data until it calls the next:
Changing the architecture of I wonder if we can define similar concepts of exclusive ownership, shared ownership, and ownership handoff while not requiring processors to call the next in the chain. Something like:
The consequence of this design would be that asynchronous processors like BatchLogRecordProcessor wouldn't be allowed to mutate the data without making a copy. This seems reasonable.
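To make that consequence concrete, here is a minimal sketch of exclusive ownership with a copy on handoff. Every type in it (`SketchLogRecord`, `SketchProcessor`, `EnrichingProcessor`, `BatchingProcessor`) is hypothetical and invented for illustration; none of them is an SDK or Collector API, and the sketch is not the exact proposal discussed in this thread. It assumes the SDK, not the processor, invokes each processor in turn, so a processor owns the record only for the duration of its `onEmit` call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical record type for illustration; not an OpenTelemetry SDK class.
final class SketchLogRecord {
    final Map<String, String> attributes = new HashMap<>();
    final String body;

    SketchLogRecord(String body) {
        this.body = body;
    }

    // Copies the attribute map so the copy shares no mutable state with the original.
    SketchLogRecord copy() {
        SketchLogRecord c = new SketchLogRecord(body);
        c.attributes.putAll(attributes);
        return c;
    }
}

// Hypothetical processor interface; the SDK calls onEmit on each processor in order.
interface SketchProcessor {
    void onEmit(SketchLogRecord record);
}

// Synchronous processor: it owns the record exclusively while onEmit runs,
// so it may mutate the record in place without a lock.
final class EnrichingProcessor implements SketchProcessor {
    @Override
    public void onEmit(SketchLogRecord record) {
        record.attributes.put("enriched", "true");
    }
}

// Asynchronous processor: it keeps data after onEmit returns, when ownership
// has already moved on, so it works on its own copy.
final class BatchingProcessor implements SketchProcessor {
    final List<SketchLogRecord> queue = new ArrayList<>();

    @Override
    public void onEmit(SketchLogRecord record) {
        SketchLogRecord owned = record.copy();
        owned.attributes.put("batched", "true"); // touches only the copy
        queue.add(owned);
    }
}

final class OwnershipSketch {
    public static void main(String[] args) {
        SketchLogRecord record = new SketchLogRecord("hello");
        List<SketchProcessor> processors =
            List.of(new EnrichingProcessor(), new BatchingProcessor());
        for (SketchProcessor p : processors) {
            p.onEmit(record);
        }
        System.out.println("original: " + record.attributes);
    }
}
```

The cost of this rule is one copy per record for each asynchronous processor that wants to mutate; processors that only read can keep the shared reference.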
I like just doing chaining, but even if not going that direction, requiring a copy for modification is essentially what OnEnd enforces, so it would be matching with the SpanProcessor design if that were done, right? The rest ( |
I like chaining as well, but think the design should be similar to tracing unless there's a strong reason to make it different.
Now that you mention it, there's no actual behavior change in an SDK in response to Maybe this is enough? @tigrannajaryan does the collector somehow enforce its policy that no modifications can be made after the "processor calls the next processor's |
@jack-berg It doesn't currently. We had one case recently where an exporter that was not supposed to mutate the data did mutate it and caused a crash. We discussed a possible "debug" option to enforce/catch such violations, but it will likely remain a debug-only capability since it has performance implications.
#2681 made it possible for `LogRecordProcessor`s to mutate LogRecords in their `onEmit` implementations. Mutating LogRecords is critical for use cases like enriching with baggage, redacting sensitive information, and more.

After merging, a conversation took place regarding the potential performance impact of allowing mutation due to locking requirements.
Option 1: Do Nothing
All `LogRecordProcessor`s are allowed to mutate LogRecords. SDKs have to deal with this by coming up with appropriate locking strategies.

It may still be possible to have lock-free high performance if none of the registered `LogRecordProcessors` perform mutations. Consider a strategy where mutations are tracked using a lock-free CAS (compare-and-set) counter like Java's `AtomicInteger`. If no mutation ever occurs, then `BatchLogProcessor` can transform to the immutable ReadableLogRecord version needed to call `LogRecordExporter#export` without taking a lock. If mutations do occur, a lock would have to be obtained.

Option 2: Force `LogRecordProcessor` implementations to indicate whether they mutate

For example, by adding a method to `LogRecordProcessor` called `boolean isMutating()`.

This makes it somewhat easier to avoid locks in configurations where no `LogRecordProcessors` perform mutations.

If an implementation tries to mutate the `LogRecord` when `isMutating() == false`, reject the mutation and log a warning.

As noted here, this is the strategy used by collector processors.
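A rough sketch of how Option 2 might look follows. Only the method name `isMutating()` comes from the proposal above; `FlaggedProcessor`, `GuardedLogRecord`, `FlaggedPipeline`, and the two sample processors are hypothetical names made up for illustration and do not exist in any SDK. The sketch assumes the SDK toggles a writable flag around each `onEmit` call and computes once, at registration time, whether any processor mutates.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical processor interface carrying the proposed isMutating() flag.
interface FlaggedProcessor {
    boolean isMutating();
    void onEmit(GuardedLogRecord record);
}

// Hypothetical record that rejects writes from processors that declared isMutating() == false.
final class GuardedLogRecord {
    private static final Logger LOGGER = Logger.getLogger(GuardedLogRecord.class.getName());
    private final Map<String, String> attributes = new HashMap<>();
    private boolean writable; // toggled by the pipeline around each onEmit call

    void setWritable(boolean writable) {
        this.writable = writable;
    }

    // Returns false and logs a warning instead of applying a rejected mutation.
    boolean setAttribute(String key, String value) {
        if (!writable) {
            LOGGER.warning("Rejected mutation from a processor with isMutating() == false");
            return false;
        }
        attributes.put(key, value);
        return true;
    }

    String getAttribute(String key) {
        return attributes.get(key);
    }
}

final class FlaggedPipeline {
    private final List<FlaggedProcessor> processors;
    // Computed once; when false, no processor can mutate, so a lock-free path is possible.
    final boolean anyMutating;

    FlaggedPipeline(List<FlaggedProcessor> processors) {
        this.processors = List.copyOf(processors);
        this.anyMutating = processors.stream().anyMatch(FlaggedProcessor::isMutating);
    }

    void emit(GuardedLogRecord record) {
        for (FlaggedProcessor p : processors) {
            record.setWritable(p.isMutating());
            p.onEmit(record);
        }
        record.setWritable(false);
    }
}

// Declares that it mutates, so its write is accepted.
final class RedactingProcessor implements FlaggedProcessor {
    @Override public boolean isMutating() { return true; }
    @Override public void onEmit(GuardedLogRecord record) {
        record.setAttribute("password", "***");
    }
}

// Declares that it does not mutate, so its write is rejected with a warning.
final class SneakyProcessor implements FlaggedProcessor {
    @Override public boolean isMutating() { return false; }
    @Override public void onEmit(GuardedLogRecord record) {
        record.setAttribute("sneaky", "yes");
    }
}

final class IsMutatingSketch {
    public static void main(String[] args) {
        FlaggedPipeline pipeline =
            new FlaggedPipeline(List.of(new RedactingProcessor(), new SneakyProcessor()));
        GuardedLogRecord record = new GuardedLogRecord();
        pipeline.emit(record);
        System.out.println("password=" + record.getAttribute("password")
            + ", sneaky=" + record.getAttribute("sneaky"));
    }
}
```

The `anyMutating` field is where an SDK could choose between a locked and a lock-free representation; how it would actually do so is left open here.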
Option 3: Split out start and end phases of emitting a LogRecord
SpanProcessor is split into two phases:
1. `onStart` runs when the span is starting, and has the ability to mutate data.
2. `onEnd` runs when the span has ended and DOES NOT have the ability to mutate the data.

The Log API could similarly be split into multiple phases of starting a log record and emitting it. `LogRecordProcessor` could then be split into multiple phases as well, with `onStart` allowed to mutate while the `LogRecord` sent to `onEnd` is immutable.

This produces awkward ergonomics for the API since people don't think of emitting a log in multiple phases. Additionally, we'd have to answer questions like what types of data can / cannot be set in the start phase, and what data must be set in the start phase.
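For comparison, a minimal sketch of what a two-phase processor could look like. The names `onStart` and `onEnd` mirror SpanProcessor as described above; `ReadableRecord`, `WritableRecord`, `TwoPhaseProcessor`, and `RedactThenExport` are hypothetical types invented here, not part of any SDK. Immutability in `onEnd` is modeled simply by handing the processor a read-only interface.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical read-only view handed to onEnd.
interface ReadableRecord {
    String getBody();
    String getAttribute(String key);
}

// Hypothetical mutable record handed to onStart.
final class WritableRecord implements ReadableRecord {
    private final Map<String, String> attributes = new HashMap<>();
    private String body;

    WritableRecord(String body) {
        this.body = body;
    }

    void setBody(String body) {
        this.body = body;
    }

    void setAttribute(String key, String value) {
        attributes.put(key, value);
    }

    @Override public String getBody() { return body; }
    @Override public String getAttribute(String key) { return attributes.get(key); }
}

// Hypothetical two-phase processor, modeled on the SpanProcessor split.
interface TwoPhaseProcessor {
    void onStart(WritableRecord record); // may mutate
    void onEnd(ReadableRecord record);   // sees only the read-only interface
}

// Redacts in the start phase, then captures the final body in the end phase.
final class RedactThenExport implements TwoPhaseProcessor {
    String exported;

    @Override
    public void onStart(WritableRecord record) {
        record.setBody(record.getBody().replace("secret", "***"));
    }

    @Override
    public void onEnd(ReadableRecord record) {
        exported = record.getBody();
    }
}

final class TwoPhaseSketch {
    // Stand-in for the SDK: runs the start phase, then the end phase.
    static void emit(WritableRecord record, TwoPhaseProcessor processor) {
        processor.onStart(record);
        processor.onEnd(record);
    }

    public static void main(String[] args) {
        RedactThenExport processor = new RedactThenExport();
        emit(new WritableRecord("token=secret"), processor);
        System.out.println(processor.exported);
    }
}
```

Note that the caller in this sketch still emits with a single call; the awkwardness described above arises only if the two phases are exposed in the Log API itself.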
Not sure of any advantages of this approach above Option 2.
Let's talk about this and come to some conclusion so we can move forward with confidence.