Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EventCounters] fix deadlock #1260

Merged
merged 5 commits into from
Jul 10, 2023
Merged

Conversation

Yun-Ting
Copy link
Contributor

@Yun-Ting Yun-Ting commented Jul 5, 2023

Mitigates #1024.

Changes

Below is an example of when the deadlock can occur:

Thread 1 Thread 2 locks acquired at this timestamp
Time call ctor of EventCountersMetrics
lock preInitEventSources lock on preInitEventSources object by thread 1
iterate through the HashSet of EventSources AND enable them lock on EventSource by thread 1
EventSourceA got enabled by thread 1
OnEventWritten callback fired on receiving an event
EventCountersInstrumentationEventSource got lazy-init lock on EventSource by thread 2
OnEventSourceCreated callback fired
lock preInitEventSouces lock on preInitEventSouces by thread 2

Discussion

  1. Better way to mitigate this issue could be, use an assistive dictionary to keep track of whether a particular eventSource should be enabled in a lock, and a flag (for example, shouldEnable) before releasing the lock. Outside of the lock, when the flag was set to true earlier, enable that eventSource. This approach should be more performant as compared to the current implementation. For example: https://github.com/dotnet/runtime/blob/84fd85932fd3b371cad69cc2412404ea50158c63/src/libraries/Common/tests/System/Diagnostics/Tracing/TestEventListener.cs#L84C1-L93C14
  2. The locks on preInitEventSources were added in this PR to address the issue -
    "As OnEventSourceCreated can run before the constructor is done, you can run into a situation where you fail to enable an EventSource instance."
    @MihaZupan: Instead of the current locking mechanism, Is it possible in OnEventSourceCreated, we use Spinlock to wait until this.options was set to ensure that OnEventSourceCreated will be triggered only after the ctor got executed? Too hacky?

For significant contributions please make sure you have completed the following items:

  • Appropriate CHANGELOG.md updated for non-trivial changes
  • Design discussion issue #
  • Changes in public API reviewed

@Yun-Ting Yun-Ting requested a review from a team July 5, 2023 18:05
@github-actions github-actions bot requested review from hananiel and mic-max July 5, 2023 18:05
Copy link

@besseb05 besseb05 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - although I am not familiar with the codebase

Copy link
Member

@reyang reyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a changelog entry.

@utpilla utpilla merged commit e602753 into open-telemetry:main Jul 10, 2023
@Yun-Ting Yun-Ting deleted the yunl/fixDeadlock branch July 10, 2023 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants