Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TracerProvider allows calling Tracer() while it's shutting down #3924

Merged

Conversation

ash2k
Copy link
Contributor

@ash2k ash2k commented Mar 23, 2023

Fixes #3890.

I decided to introduce a separate mutex to decouple Tracer() from ALL other methods. I think this is way more robust than trying to release locks early, etc. Releasing mu early would have unintended, potentially undesirable effect of allowing Register/Unregister vs Shutdown to be unblocked earlier, while a concurrent call is still running. It may be hard to use the tracer provider in some scenarios if it works like that. It may be fine, but I think this is very subtle and can be undesirable behavior.

I've also moved logging out of the protected section just to be sure no external code can be called while holding the lock to avoid any chance of deadlock.

@codecov
Copy link

codecov bot commented Mar 23, 2023

Codecov Report

Merging #3924 (f07c478) into main (ae90c44) will decrease coverage by 0.1%.
The diff coverage is 73.9%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #3924     +/-   ##
=======================================
- Coverage   81.8%   81.7%   -0.1%     
=======================================
  Files        170     170             
  Lines      12911   12943     +32     
=======================================
+ Hits       10563   10577     +14     
- Misses      2129    2143     +14     
- Partials     219     223      +4     
Impacted Files Coverage Δ
sdk/trace/provider.go 84.2% <73.9%> (-3.0%) ⬇️

... and 2 files with indirect coverage changes

@Aneurysm9
Copy link
Member

I think the first step in Tracer() needs to be to check whether the provider has been shut down and to return a no-op if it has:

After the call to Shutdown, subsequent attempts to get a Tracer are not allowed. SDKs SHOULD return a valid no-op Tracer for these calls, if possible.

In that case there would not be a way for a SpanProcessor's Shutdown() to enter protected sections of Tracer() as the flag indicating the provider is shut down would already be set. I think that puts Tracer() back in a position of needing to take a lock on the mutex guarding that flag, though, and not just the tracer map so maybe the flag should be changed to an atomic.Bool?

@ash2k ash2k force-pushed the ash2k/decouple-tracer-and-shutdown branch 2 times, most recently from 2416e32 to ff9fee7 Compare March 23, 2023 09:13
@ash2k
Copy link
Contributor Author

ash2k commented Mar 23, 2023

@Aneurysm9 This makes sense, thanks! I've changed the code accordingly.

sdk/trace/provider.go Show resolved Hide resolved
sdk/trace/provider.go Show resolved Hide resolved
sdk/trace/provider.go Outdated Show resolved Hide resolved
Copy link
Member

@pellared pellared left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to avoid deadlocks, I believe the double-checked locking is needed in all exported methods (similarly to Tracer()) as all of them can be used within Shutdown() (or UnregisterSpanProcessor()) via external code (sp.Shudown() and otel.Handle())

@pellared pellared dismissed their stale review March 23, 2023 11:27

Leaving the decission for #3924 (comment) to the maitainers

@ash2k ash2k force-pushed the ash2k/decouple-tracer-and-shutdown branch from ff9fee7 to 133cd0d Compare March 24, 2023 07:32
sdk/trace/provider.go Show resolved Hide resolved
@ash2k ash2k force-pushed the ash2k/decouple-tracer-and-shutdown branch from 133cd0d to 9e712cf Compare March 24, 2023 08:22
sdk/trace/provider.go Outdated Show resolved Hide resolved
@ash2k ash2k force-pushed the ash2k/decouple-tracer-and-shutdown branch from 9e712cf to 3a8d297 Compare March 24, 2023 23:18
sdk/trace/provider.go Outdated Show resolved Hide resolved
sdk/trace/provider.go Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@ash2k ash2k force-pushed the ash2k/decouple-tracer-and-shutdown branch from 3a8d297 to a44fa69 Compare March 27, 2023 22:54
@ash2k ash2k force-pushed the ash2k/decouple-tracer-and-shutdown branch from a44fa69 to f07c478 Compare March 27, 2023 22:58
@MrAlias MrAlias merged commit c4940f3 into open-telemetry:main Mar 28, 2023
@ash2k ash2k deleted the ash2k/decouple-tracer-and-shutdown branch March 28, 2023 00:39
@MrAlias MrAlias mentioned this pull request Apr 27, 2023
@XSAM XSAM added this to the Old Untracked PRs milestone Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deadlock in trace SDK shutdown
6 participants