CRASH from race on detach in LOG debug-build code #4641
derekbruening added a commit that referenced this issue on Dec 29, 2020
Eliminates a race on detach where the detaching thread removes DR's SIGSEGV handler while a now-native detached thread is in the middle of having a signal delivered (natively) and invokes a TLS magic field safe read, whose SIGSEGV then goes to the application.

The detaching thread is the one doing all the real cleanup, so we simply avoid any safe reads or TLS for detaching threads by recording the detacher's ID when we start the detach process. This var is not cleared until re-init, so we have no race with the end of detach.

Tested on api.detach_signal with the forthcoming signal mask checks, which trigger when the handler is invoked for a DR signal instead of an app-generated signal. Without this fix, the test fails easily: about 1 in 5 runs in debug build. With this fix, it succeeds 200x in a loop.

I still see one type of crash in debug build, a rare race where d_r_stats is set to NULL in between the check and use of a LOG(), but that is limited to debug and is beyond the scope of this fix and is much lower priority: I filed it as #4641.

Fixes #3535
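The "record the detacher's ID" idea in the commit message can be sketched roughly as follows. This is not DynamoRIO's code: `thread_id_t`, `INVALID_THREAD_ID`, `detach_begin`, `detach_in_progress`, `is_detacher`, and `detach_reinit` are all hypothetical names chosen for illustration, and the sketch assumes the caller can obtain its own thread ID without touching TLS (for example from a system call).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical thread ID type; 0 is assumed to never be a valid ID. */
typedef long thread_id_t;
#define INVALID_THREAD_ID ((thread_id_t)0)

/* Written once when detach starts, cleared only on re-initialization. */
static _Atomic(thread_id_t) detacher_tid = INVALID_THREAD_ID;

/* Called by the detaching thread before it begins tearing anything down. */
void
detach_begin(thread_id_t self)
{
    assert(self != INVALID_THREAD_ID);
    atomic_store(&detacher_tid, self);
}

/* True from the start of detach until the next re-init. */
bool
detach_in_progress(void)
{
    return atomic_load(&detacher_tid) != INVALID_THREAD_ID;
}

/* True only for the thread that recorded itself in detach_begin(). */
bool
is_detacher(thread_id_t self)
{
    return self != INVALID_THREAD_ID && atomic_load(&detacher_tid) == self;
}

/* Called on re-init; the ID is deliberately not cleared at the end of detach. */
void
detach_reinit(void)
{
    atomic_store(&detacher_tid, INVALID_THREAD_ID);
}
```

A signal handler could consult `detach_in_progress()` and `is_detacher()` before deciding whether a safe read or TLS access is acceptable. The commit message does not spell out that decision logic, so it is left to the caller here. Clearing the ID only in `detach_reinit()` mirrors the statement that the variable is not cleared until re-init.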
derekbruening added a commit that referenced this issue on Dec 30, 2020
The best way to solve this may be to leave the useful LOG calls but move |
While testing a fix for #3535 in a debug build, I hit this crash about once in 100 runs:
This must be a race where d_r_stats is set to NULL in between the check for NULL and the de-ref of loglevel.
I was not planning to fix it, because that would require eliminating all LOG calls on detach paths, which are useful for debugging, and because it is limited to debug build.
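The check-then-use pattern behind this kind of crash can be illustrated with a minimal sketch. This is not DynamoRIO's LOG macro: `stats_t`, `stats_ptr`, and the three functions below are hypothetical stand-ins. The sketch contrasts a racy form that loads the global pointer twice with a form that loads it once into a local.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the stats structure; only loglevel matters here. */
typedef struct {
    unsigned int loglevel;
} stats_t;

/* Hypothetical global that a detaching thread may set to NULL at any time. */
static _Atomic(stats_t *) stats_ptr;

/* Racy form: the pointer is loaded twice, so another thread can store NULL
 * between the NULL check and the dereference. */
int
should_log_racy(unsigned int level)
{
    return atomic_load(&stats_ptr) != NULL &&
        atomic_load(&stats_ptr)->loglevel >= level;
}

/* Snapshot form: one load into a local, then check and use the local. */
int
should_log_snapshot(unsigned int level)
{
    stats_t *snap = atomic_load(&stats_ptr);
    return snap != NULL && snap->loglevel >= level;
}

/* Returns 1 if the message was emitted, 0 if logging was skipped. */
int
log_if_enabled(unsigned int level, const char *msg)
{
    assert(msg != NULL);
    if (!should_log_snapshot(level))
        return 0;
    fprintf(stderr, "%s\n", msg);
    return 1;
}
```

The snapshot form only removes the NULL dereference. If the memory behind the pointer is also freed or unmapped during detach, the local copy can still dangle, so this alone would not necessarily be a complete fix.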