Resolve lingering question of LogRecord mutability in LogRecordProcessor #2955

jack-berg · 2022-11-16T17:49:09Z

#2681 made it possible for LogRecordProcessor to mutate LogRecords in their onEmit implementations. Mutating LogRecords is critical for use cases like enriching with baggage, redacting sensitive information, and more.

After merging, a conversation took place regarding potential performance impact of allowing mutation due to locking requirements.

Option 1: Do Nothing

All LogRecordProcessors are allowed to mutate LogRecords. SDKs have to deal with this by coming up with appropriate locking strategies.

It may be possible to still have lock-free high performance if none of the registered LogRecordProcessors perform mutations. Consider a strategy where mutations are tracked using a lock free CAS (compare and set) counter like java's AtomicInteger. If no mutation ever occurs, then BatchLogProcessor can transform to an immutable ReadableLogRecord version needed to call LogRecordExporter#export without taking a lock. If mutations do occur, a lock would have to be obtained.

Option 2: Force LogRecordProcessor implementations to indicate whether they mutate

For example, by adding method to LogRecordProcessor called boolean isMutating().

This makes it somewhat easier to avoid locks in configurations where no LogRecordProcessors perform mutations.

If an implementation trying to mutate the LogRecord when isMutating() == false, reject the mutation and log a warning.

As noted here, this is the strategy used by collector processors.

Option 3: Split out start and end phases of emitting a LogRecord

SpanProcessor is split into two phases: 1. onStart runs when the span is starting, and has the ability to mutate data. 2. onEnd runs when the span has ended and DOES NOT have the ability to mutate the data.

The Log API could similarly be split into multiple phases of starging and log record and emitting it. LogRecordProcessor could then be split into multiple phases as well, with onStart allowed to mutate while the LogRecord sent to onEnd is immutable.

This produces awkward ergonomics for the API since people don't think of emitting a log in multiple phases. Additionally, we'd have to answer questions like what types of data can / can not be set in the start phase, and what data must be set in the start phase.

Not sure of any advantages of this approach above Option 2.

Let's talk about this and come to some conclusion so we can move forward with confidence.

The text was updated successfully, but these errors were encountered:

tigrannajaryan · 2022-11-16T19:30:01Z

Option 4 is to do something similar to Collector, where data is exclusively owned by one processor at a time: https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor#data-ownership

In this case there is no need for locks and processors can freely mutate the data while they own it.

jack-berg · 2022-11-16T20:28:50Z

In this case there is no need for locks and processors can freely mutate the data while they own it.

Thanks for sharing that link. It looks like the collector relies on chaining to define the ownership boundaries. I.e. a processor is responsible for calling the next processor in the chain, and owns the data until it calls the next:

The duration of ownership of the data by processor is from the beginning of ConsumeTraces/ConsumeMetrics/ConsumeLogs call until the processor calls the next processor's ConsumeTraces/ConsumeMetrics/ConsumeLogs function, which passes the ownership to the next processor. After that the processor must no longer read or write the data since it may be concurrently modified by the new owner.

Changing the architecture of LogRecordProcessors to have each responsible for calling the next is different from the trace SDK design, which the log SDK was modeled after.

I wonder if we can define similar concepts of exclusive ownership, shared ownership, and ownership handoff while not requiring processors to call the next in the chain. Something like:

LogRecordProcessors declare intent to mutate by implementing boolean isMutating.
Log SDK operates in exclusive ownership mode if any LogRecordProcessor intends to mutate, else log SDK operates in shared ownership mode. There is no functional different between exclusive and shared ownership mode since the SDK doesn't have the fanout considerations of the collector.
In exclusive ownership mode, a processor can freely modify the data in its implementation of onEmit until onEmit returns.
In shared ownership mode, a processor is not allowed to modify the data unless it makes a copy.

The consequence of this design would be that asynchronous processors like BatchLogRecordProcessor wouldn't be allowed to mutate the data without making a copy.

This seems reasonable.

tsloughter · 2022-11-16T23:32:46Z

I like just doing chaining, but even if not going that direction, requiring a copy for modification is essentially what OnEnd enforces, so it would be matching with the SpanProcessor design if that were done, right?

The rest (isMutating and ownership mode) seems implementation specific and/or to be optimizations so may be over complicating the discussion.

jack-berg · 2022-11-18T21:11:27Z

I like just doing chaining

I like chaining as well, but think the design should be similar to tracing unless there's a strong reason to make it different.

The rest (isMutating and ownership mode) seems implementation specific and/or to be optimizations so may be over complicating the discussion.

Now that you mention it, there's no actual behavior change in an SDK in response to isMutation() in the scheme I outlined. If we were to define the ownership model as "a processor can freely modify the data in its implementation of onEmit until onEmit returns", then we sidestep the extra complexity and simply trust that processors will do the right think and not modify asynchronously without first making a copy.

Maybe this is enough? @tigrannajaryan does the collector somehow enforce its policy that no modifications can be made after the "processor calls the next processor's ConsumeTraces/ConsumeMetrics/ConsumeLogs function"?

tigrannajaryan · 2022-11-21T20:43:53Z

Maybe this is enough? @tigrannajaryan does the collector somehow enforce its policy that no modifications can be made after the "processor calls the next processor's ConsumeTraces/ConsumeMetrics/ConsumeLogs function"?

@jack-berg It doesn't currently. We had one case recently when an exporter that was no supposed to mutate the date did mutate it and caused a crash. We discussed a possible "debug" option to enforce/catch such violations, but it will likely remain a debug-only capability since it has performance implications.

jack-berg added the spec:logs Related to the specification/logs directory label Nov 16, 2022

github-actions bot assigned yurishkuro Nov 16, 2022

jack-berg mentioned this issue Nov 16, 2022

Move towards Log API/SDK stabilization #2911

Closed

25 tasks

jack-berg mentioned this issue Nov 21, 2022

Define data ownership for LogRecordProcessor #2969

Merged

reyang closed this as completed in #2969 Dec 2, 2022

jack-berg mentioned this issue May 31, 2024

Log record mutations are visible in next registered processors #4065

Closed

pellared mentioned this issue Jun 3, 2024

[PoC] Change Processor.OnEmit to accept a record pointer open-telemetry/opentelemetry-go#5469

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Resolve lingering question of LogRecord mutability in LogRecordProcessor #2955

Resolve lingering question of LogRecord mutability in LogRecordProcessor #2955

jack-berg commented Nov 16, 2022

tigrannajaryan commented Nov 16, 2022

jack-berg commented Nov 16, 2022

tsloughter commented Nov 16, 2022

jack-berg commented Nov 18, 2022

tigrannajaryan commented Nov 21, 2022

Resolve lingering question of LogRecord mutability in LogRecordProcessor #2955

Resolve lingering question of LogRecord mutability in LogRecordProcessor #2955

Comments

jack-berg commented Nov 16, 2022

tigrannajaryan commented Nov 16, 2022

jack-berg commented Nov 16, 2022

tsloughter commented Nov 16, 2022

jack-berg commented Nov 18, 2022

tigrannajaryan commented Nov 21, 2022