DoStackSnapshot (async) deadlock #32286
Comments
Hi @iskiselev, I am not super surprised you are seeing this sort of issue; the asynchronous mode, which pauses a thread in an arbitrary state, is pretty dangerous. Here are the ways I know to try and alleviate this:
In coreclr 3.0 we added the ICorProfilerInfo10::SuspendRuntime API so that profilers can sample on Linux, but it also solves this problem. The runtime stops itself with the same internal suspension mechanism it uses for GCs, so all managed threads are in a safe state to sample. Since we control the suspension, you never have to worry about threads holding locks outside of the CLR's control. And since the GC already walks stacks during suspensions, you are also guaranteed not to run into any internal deadlocks.
Closing issue as it looks like the question has been answered. Please let us know if this is not correct and we will re-activate.
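As a rough illustration of the suspend, sample, resume pattern described in this comment, here is a minimal C++ sketch. It does not use the real profiling headers: RuntimeControl is a hypothetical stand-in for the small slice of the profiler interface involved (the real ICorProfilerInfo10::SuspendRuntime and ResumeRuntime return HRESULTs, and the real DoStackSnapshot takes a callback), and ManagedThreads, SnapshotStack, and FakeRuntime are names invented for this example. The point is only that the resume call is tied to scope exit, so the runtime is never left suspended.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using ThreadId = std::uintptr_t;

// Hypothetical stand-in for the profiler-facing runtime interface.
// The real API lives in corprof.h and has different signatures.
struct RuntimeControl {
    virtual ~RuntimeControl() = default;
    virtual bool SuspendRuntime() = 0;
    virtual bool ResumeRuntime() = 0;
    virtual std::vector<ThreadId> ManagedThreads() = 0;
    virtual bool SnapshotStack(ThreadId thread,
                               std::vector<std::uintptr_t>& framesOut) = 0;
};

// RAII guard: resumes the runtime on scope exit if suspension succeeded.
class SuspensionGuard {
public:
    explicit SuspensionGuard(RuntimeControl& rt)
        : rt_(rt), suspended_(rt.SuspendRuntime()) {}
    ~SuspensionGuard() {
        if (suspended_) rt_.ResumeRuntime();
    }
    SuspensionGuard(const SuspensionGuard&) = delete;
    SuspensionGuard& operator=(const SuspensionGuard&) = delete;
    bool suspended() const { return suspended_; }

private:
    RuntimeControl& rt_;
    bool suspended_;
};

struct Sample {
    ThreadId thread;
    std::vector<std::uintptr_t> frames;
};

// Walks every managed thread while the runtime is suspended.
std::vector<Sample> SampleAllThreads(RuntimeControl& rt) {
    std::vector<Sample> samples;
    SuspensionGuard guard(rt);
    if (!guard.suspended()) return samples;  // could not suspend: take no samples
    for (ThreadId t : rt.ManagedThreads()) {
        Sample s{t, {}};
        if (rt.SnapshotStack(t, s.frames)) samples.push_back(std::move(s));
    }
    return samples;  // guard's destructor resumes the runtime here
}

// Test double used only to exercise the sketch without a real runtime.
struct FakeRuntime : RuntimeControl {
    bool suspended = false;
    int resumeCalls = 0;
    bool SuspendRuntime() override { suspended = true; return true; }
    bool ResumeRuntime() override { suspended = false; ++resumeCalls; return true; }
    std::vector<ThreadId> ManagedThreads() override { return {1, 2}; }
    bool SnapshotStack(ThreadId thread,
                       std::vector<std::uintptr_t>& framesOut) override {
        assert(suspended);  // sampling is only valid while suspended
        framesOut = {0x1000 + thread, 0x2000 + thread};
        return true;
    }
};
```

In a real profiler the guard would wrap the actual SuspendRuntime and ResumeRuntime calls, and the loop body would call DoStackSnapshot for each managed thread.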
We were affected by a deadlock caused by the profiler's asynchronous call to DoStackSnapshot. We have seen it on .NET Framework, but it should apply to any version of .NET Core on Windows too (only Windows, as other operating systems do not support asynchronous DoStackSnapshot).
Here are the conditions. The profiler thread suspended an application thread before calling DoStackSnapshot.
The application thread has the following call stack:
The profiler thread has the following call stack:
The application thread holds the ntdll!LdrpLoaderLock critical section, and it looks like it also holds an SRW lock that was probably acquired in ntdll!RtlpxLookupFunctionTable. The profiler thread tries to acquire the same SRW lock in ntdll!RtlpxLookupFunctionTable. In our case the application thread had acquired some more locks, resulting in a full deadlock of all application threads.
It looks like the application calls CreateFromCertFile often, which calls KERNELBASE!LoadLibraryExW internally on each attempt. So this application reveals the issue easily, while for most other applications a deadlock through library loading will be low-probability, since libraries are usually loaded only a finite number of times.
We are discussing possible workarounds for this problem. Is it possible to predict whether it is safe to call DoStackSnapshot at a particular moment? Would it be safe to move the DoStackSnapshot call into a separate thread and kill that thread if no progress has been made after some time? If that is not safe, could it introduce data corruption or additional locks, or only some small memory leaks?