Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

puffin_egui deadlock if new_frame is called from another thread #197

Open
lunixbochs opened this issue Feb 21, 2024 · 1 comment
Open

puffin_egui deadlock if new_frame is called from another thread #197

lunixbochs opened this issue Feb 21, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@lunixbochs
Copy link

lunixbochs commented Feb 21, 2024

I'm running into a deadlock in puffin_egui if new_frame() is called from a thread other than the UI thread.
I built a sampling profiler that runs in a single thread, manually creates ThreadProfiler instances for all threads running in the process, and samples their stack traces at intervals, so I'm calling new_frame from that thread rather than my UI thread.

Here puffin takes the GlobalFrameView lock and holds it across the UI function call:

pub fn window(&mut self, ctx: &egui::Context) -> bool {
let mut frame_view = self.global_frame_view.lock();
self.profiler_ui
.window(ctx, &mut MaybeMutRef::MutRef(&mut frame_view))
}
/// Show the profiler.
///
/// Call this from within an [`egui::Window`], or use [`Self::window`] instead.
pub fn ui(&mut self, ui: &mut egui::Ui) {
let mut frame_view = self.global_frame_view.lock();
self.profiler_ui
.ui(ui, &mut MaybeMutRef::MutRef(&mut frame_view));
}

The profiler ui call itself contains profiling calls, which calls ThreadProfiler::end_scope internally, which tries to take the GlobalProfiler lock. So you have a nested GlobalFrameView lock -> GlobalProfiler lock acquisition in the UI.

GlobalProfiler::lock().new_frame() has the GlobalProfiler locked and calls the sink function here, which tries to lock the GlobalFrameView:

view_clone.lock().add_frame(frame);

This mismatched lock order triggers a deadlock.

@lunixbochs lunixbochs added the bug Something isn't working label Feb 21, 2024
@v0x0g
Copy link
Contributor

v0x0g commented May 18, 2024

This seems like an interesting issue (also your sampling profiler sounds very cool).

I'm a bit confused on the exact order of these locks and accesses - I'm not familiar with puffin internals - could you possibly explain it again if my explanation is wrong? It seems like we have one thread doing GlobalFrameView::lock() -> GlobalProfiler::lock(), with another doing the reverse order GlobalProfiler::lock() -> GlobalFrameView::lock().

I assume that using a re-entrant Mutex would not fix this problem, because they are executing on different thread. Perhaps we could try using a proxy GlobalProfiler instance, so that the GlobalProfiler::SINGLETON writes to the profiler UI's proxy profiler, and the profiler UI only ever reads the proxy profiler?

A better alternative might be to try to lock the profiler, read all the data from it, and then unlock and call the profiler UI using the data we read, instead of accessing the GlobalProfiler. This would mean the reading of data couldn't be orofiler, but would allow the profiler UI to contain profiling calls as usual.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants