
Implement the BatchingProcessor #5093

Merged
merged 74 commits into open-telemetry:main from impl-batch on Apr 18, 2024

Conversation

@MrAlias (Contributor) commented Mar 21, 2024

Part of #5063

Adds a feature-complete implementation of the BatchingProcessor. There remain performance improvements to be addressed as follow-up tasks (tracked below).

Design

At a high level, the BatchingProcessor is designed to provide the highest possible throughput of log Records while remaining compatible with OpenTelemetry. The entry point for log Records is the OnEmit method. This method is designed to receive Records as fast as possible while still honoring shutdown commands. All received Records are added to a queue.

In order to block OnEmit as little as possible, a separate "poll" goroutine is spawned when a BatchingProcessor is created. This goroutine is responsible for batching the queue at regular polling intervals, or whenever it is directly signaled to do so.

To keep the poll goroutine from backing up, all batches it makes are exported with a bufferExporter. This exporter lets the poll goroutine enqueue an export payload that is handled in a separate goroutine dedicated to exporting. This asynchronous behavior allows the poll goroutine to maintain accurate interval polling. An illustrative sketch of this flow follows the diagram below.

  __BatchingProcessor__     __Poll Goroutine__     __Export Goroutine__
||                     || ||                  || ||                    ||
||          ********** || ||                  || ||     **********     ||
|| Records=>* OnEmit * || ||   | - ticker     || ||     * export *     ||
||          ********** || ||   | - trigger    || ||     **********     ||
||             ||      || ||   |              || ||         ||         ||
||             ||      || ||   |              || ||         ||         ||
||   __________\/___   || ||   |***********   || ||   ______/\_______  ||
||  (____queue______)>=||=||===|*  batch  *===||=||=>[_export_buffer_] ||
||                     || ||   |***********   || ||                    ||
||_____________________|| ||__________________|| ||____________________||
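The following is a minimal, illustrative sketch of how OnEmit and the poll goroutine could interact, assuming a hypothetical sketchProcessor type. It is not the SDK's actual implementation: the names (sketchProcessor, exportCh, trigger) are invented, and a buffered channel stands in for the ring-buffer queue described under "Release valve" below.

```go
package sketch

import (
	"context"
	"sync/atomic"
	"time"
)

// Record is a placeholder standing in for an sdk/log Record in this sketch.
type Record struct{}

// sketchProcessor is an illustrative stand-in for the BatchingProcessor. A
// buffered channel plays the role of the queue; the real design uses a
// ring-buffer backed queue (see "Release valve" below).
type sketchProcessor struct {
	queue    chan Record   // records enqueued by OnEmit
	exportCh chan []Record // batches handed to the export goroutine
	trigger  chan struct{} // signals the poll goroutine to batch immediately
	stopped  atomic.Bool   // set once Shutdown has been called
}

// OnEmit enqueues the Record and returns quickly; it never exports directly.
func (p *sketchProcessor) OnEmit(_ context.Context, r Record) error {
	if p.stopped.Load() {
		return nil
	}
	select {
	case p.queue <- r:
	default:
		// Queue full. The real ring buffer overwrites the oldest record
		// instead of dropping the newest; see "Release valve" below.
	}
	return nil
}

// poll batches the queue on every tick, or when triggered, and hands each
// batch to a dedicated export goroutine so polling stays on schedule.
func (p *sketchProcessor) poll(interval time.Duration, batchSize int) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
		case <-p.trigger:
		}
		batch := make([]Record, 0, batchSize)
	drain:
		for len(batch) < batchSize {
			select {
			case r := <-p.queue:
				batch = append(batch, r)
			default:
				break drain
			}
		}
		if len(batch) > 0 {
			p.exportCh <- batch // consumed by the export goroutine
		}
	}
}
```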

Release valve

When the BatchingProcessor receives more Records than it is able to batch and export, it needs a way to avoid blocking the user application that is still sending Records. The "release valve" that prevents this is the queue the records are stored in. This queue is backed by a ring buffer: when writes from OnEmit arrive faster than the queue can be batched, the oldest records are overwritten first.
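Below is an illustrative sketch of a fixed-capacity queue with this overwrite-oldest behavior, backed by a plain slice rather than container/ring. The ringQueue type and its methods are assumptions for illustration, not the SDK's actual queue.

```go
package sketch

import "sync"

// ringQueue is a fixed-capacity queue that overwrites its oldest entries when
// it is written to faster than it is drained.
type ringQueue[T any] struct {
	mu         sync.Mutex
	buf        []T
	read, size int // index of the oldest item and number of stored items
}

func newRingQueue[T any](capacity int) *ringQueue[T] {
	return &ringQueue[T]{buf: make([]T, capacity)}
}

// Enqueue stores v, overwriting the oldest item if the buffer is already full.
func (q *ringQueue[T]) Enqueue(v T) {
	q.mu.Lock()
	defer q.mu.Unlock()
	write := (q.read + q.size) % len(q.buf)
	q.buf[write] = v
	if q.size == len(q.buf) {
		q.read = (q.read + 1) % len(q.buf) // advance past the overwritten oldest
	} else {
		q.size++
	}
}

// Flush removes and returns everything currently stored, oldest first.
func (q *ringQueue[T]) Flush() []T {
	q.mu.Lock()
	defer q.mu.Unlock()
	out := make([]T, 0, q.size)
	for i := 0; i < q.size; i++ {
		out = append(out, q.buf[(q.read+i)%len(q.buf)])
	}
	q.read, q.size = 0, 0
	return out
}
```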

Shutdown and ForceFlush

The asynchronous design of the BatchingProcessor that enables high throughput also needs to accommodate the synchronous nature of calls to Shutdown and ForceFlush.

To handle flushing for both of these operations, the queue type can be explicitly flushed. The returned data can then be passed to the bufferExporter's Export method to be exported synchronously (along with flushing any pending exports).

The shutdown state of the processor is managed at a high level using an atomic.Bool that is checked by all exported methods before they perform any operations.
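As a rough illustration only, continuing the sketchProcessor sketch from the Design section, Shutdown and ForceFlush could take the following shape. The syncExport helper is hypothetical; it stands in for handing flushed records to the bufferExporter's synchronous Export path.

```go
package sketch

import "context"

// Shutdown flips the shutdown flag, then flushes and exports what remains.
// Only the first call does any work; later calls return immediately.
func (p *sketchProcessor) Shutdown(ctx context.Context) error {
	if p.stopped.Swap(true) {
		return nil
	}
	return syncExport(ctx, p.flushQueue())
}

// ForceFlush synchronously exports everything currently queued, unless the
// processor has already been shut down.
func (p *sketchProcessor) ForceFlush(ctx context.Context) error {
	if p.stopped.Load() {
		return nil
	}
	return syncExport(ctx, p.flushQueue())
}

// flushQueue drains the channel-backed queue stand-in used in this sketch.
func (p *sketchProcessor) flushQueue() []Record {
	var out []Record
	for {
		select {
		case r := <-p.queue:
			out = append(out, r)
		default:
			return out
		}
	}
}

// syncExport is a hypothetical stand-in for the bufferExporter's Export
// method, which flushes any pending exports and exports synchronously.
func syncExport(_ context.Context, records []Record) error {
	_ = records // a real implementation would call the configured Exporter here
	return nil
}
```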

Enabled

This updates the Enabled method to return false when the processor has been shut down.
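Continuing the same sketch, the Enabled check can reduce to a single load of the shutdown flag (illustrative only; the real method may consider more state):

```go
package sketch

import "context"

// Enabled reports whether the processor will process records. Once the
// processor has been shut down it returns false.
func (p *sketchProcessor) Enabled(context.Context, Record) bool {
	return !p.stopped.Load()
}
```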

Follow up tasks

Tracked in #5063

  • Pool export payloads
  • Add option for users to configure the exporter buffer
  • Add custom implementation of "container/ring"
  • Add benchmarks
  • Ensure a BatchingProcessor created without using NewBatchingProcessor does not panic

@MrAlias MrAlias added the area:logs (Part of OpenTelemetry logs) and Skip Changelog (PRs that do not require a CHANGELOG.md entry) labels on Mar 21, 2024

codecov bot commented Mar 21, 2024

Codecov Report

Attention: Patch coverage is 98.75000% with 1 line in your changes missing coverage. Please review.

Project coverage is 84.5%. Comparing base (fe3de70) to head (47b77c6).

Additional details and impacted files


@@           Coverage Diff           @@
##            main   #5093     +/-   ##
=======================================
+ Coverage   84.3%   84.5%   +0.1%     
=======================================
  Files        258     258             
  Lines      17042   17110     +68     
=======================================
+ Hits       14382   14466     +84     
+ Misses      2361    2345     -16     
  Partials     299     299             
Files Coverage Δ
sdk/log/exporter.go 98.1% <100.0%> (+0.1%) ⬆️
sdk/log/batch.go 99.4% <98.5%> (+16.0%) ⬆️

... and 1 file with indirect coverage changes

@MrAlias MrAlias force-pushed the impl-batch branch 3 times, most recently from 3604420 to 5ce16bf, on March 22, 2024 14:55
@MrAlias MrAlias mentioned this pull request Mar 25, 2024
MrAlias added a commit to MrAlias/opentelemetry-go that referenced this pull request Mar 25, 2024
The batching log processor will generate records from 4 different
locations (polling, OnEmit, ForceFlush, Shutdown). In order to ensure an
Exporter is called serially, as is required by the interface, this
function will be used in the processor to centralize the interaction
with its Exporter.

Part of open-telemetry#5063.

See open-telemetry#5093 for the implementation use.
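The commit above describes centralizing all interaction with an Exporter so it is only ever called serially. An assumed sketch of that pattern, reusing the Record placeholder from the earlier sketches, is a single goroutine draining a channel of payloads; this is not the SDK's actual bufferExporter code.

```go
package sketch

// serialExporter funnels export payloads from many producers through a single
// goroutine so the underlying export function is never called concurrently.
type serialExporter struct {
	input chan []Record
	done  chan struct{}
}

// newSerialExporter starts the export goroutine. The export parameter stands
// in for an Exporter's Export call.
func newSerialExporter(export func([]Record)) *serialExporter {
	s := &serialExporter{
		input: make(chan []Record, 64),
		done:  make(chan struct{}),
	}
	go func() {
		defer close(s.done)
		for batch := range s.input {
			export(batch) // one payload at a time, in submission order
		}
	}()
	return s
}

// Enqueue hands a payload to the export goroutine; producers only block on the
// channel send, never on the export itself.
func (s *serialExporter) Enqueue(batch []Record) { s.input <- batch }

// Close stops accepting payloads and waits for in-flight exports to finish.
func (s *serialExporter) Close() {
	close(s.input)
	<-s.done
}
```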
@MrAlias MrAlias mentioned this pull request Mar 25, 2024
MrAlias added a commit to MrAlias/opentelemetry-go that referenced this pull request Mar 25, 2024
The BatchingProcessor is not expected to ultimately contain
configuration fields for queue size or export parameters (see open-telemetry#5093).
This will break TestNewBatchingProcessorConfiguration which tests the
configuration by evaluating the BatchingProcessor directly.

Instead, test the batchingConfig and rename the test to
TestNewBatchingConfig to match what is being tested.
MrAlias added 3 commits March 25, 2024 10:45
@pellared pellared dismissed their stale review April 9, 2024 19:03

All my blocking comments are resolved. I will do my best to do another look on Thursday.

MrAlias and others added 2 commits April 9, 2024 12:19
@MrAlias (Contributor, Author) commented Apr 11, 2024

Plan is to wait until Monday to merge. @dashpole and @MadVikingGod are planning to take a look at this.

@MrAlias (Contributor, Author) commented Apr 15, 2024

@dashpole does this look good to you?

@MrAlias (Contributor, Author) commented Apr 18, 2024

@dashpole: planning to merge given this is blocking other work.

@MrAlias MrAlias merged commit 4af9c20 into open-telemetry:main Apr 18, 2024
27 checks passed
@MrAlias MrAlias deleted the impl-batch branch April 18, 2024 14:48
@MrAlias MrAlias added this to the v1.26.0 milestone Apr 23, 2024