Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collecting events from more threads does not work #94

Closed
mladedav opened this issue Feb 17, 2024 · 0 comments
Closed

Collecting events from more threads does not work #94

mladedav opened this issue Feb 17, 2024 · 0 comments

Comments

@mladedav
Copy link
Contributor

mladedav commented Feb 17, 2024

Bug Report

Version

Current HEAD in git: 101dc89 (origin/v0.1.x, origin/HEAD) Honor explicit parent spans for events (#92)

For completeness sake:

tracing-opentelemetry v0.22.0 (/...)
├── tracing v0.1.40
│   ├── tracing-attributes v0.1.27 (proc-macro)
│   └── tracing-core v0.1.32
├── tracing-core v0.1.32 (*)
├── tracing-log v0.2.0
│   └── tracing-core v0.1.32 (*)
└── tracing-subscriber v0.3.18
    └── tracing-core v0.1.32 (*)
│   │       │   │   │   │   └── tracing v0.1.40 (*)
│   │       │   │   │   └── tracing v0.1.40 (*)
│   │       │   │   ├── tracing v0.1.40 (*)
│   │       │   │   └── tracing v0.1.40 (*)
│   │       └── tracing v0.1.40 (*)
├── tracing v0.1.40 (*)
└── tracing-subscriber v0.3.18 (*)

Platform

Linux silver 6.1.71-1-MANJARO #1 SMP PREEMPT_DYNAMIC Fri Jan 5 17:36:36 UTC 2024 x86_64 GNU/Linux

Description

When events are emitted for the same span from multiple threads, some of them may be lost due to race conditions.

Repro is here

This is caused by the OpenTelemetryLayer taking the OtelData from extensions, mutating the events and then replacing it back. If some other thread tries to log an event in the meantime and doesn't find OtelData in the extensions, that event will be lost.

Important context is #59 which is also the reason we cannot simply take a unique reference to the metadata because it might cause a deadlock. More context concerning that:

jtescher pushed a commit that referenced this issue Feb 26, 2024
## Motivation

#94 - Mutli-threaded tracing drops most of the events

## Solution

I have reorganized the code so that the lock on extensions is held, but
not while `record` is called which should solve the deadlock originally
solved in #59

### Benchmark

I've used the benchmark from #93, please ignore the "filtered" vs
"non_filtered" distinction, for the purpose of this PR they should be
identical cases.

See results in later comment as the code got changed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant