Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i#1921 native sig: Handle signals during DR initialization #4662

Merged
merged 3 commits into from
Jan 9, 2021

Conversation

derekbruening
Copy link
Contributor

Puts in place 6 fixes for handling signals during DR initialization,
typically in a start/stop setup where other threads are alive.

  1. Copy the app handler at init time for delivering native signals
    during init.

  2. Reorder signal_arch_init(), which obtains the signal frame size, to
    run before DR installs its handler.

  3. Obtains the app handler before installing DR's handler, eliminating
    a (narrow) race window.

  4. Until DR starts executing the app, continues delivering native
    signals and using the globally recorded app handler, to match how DR
    init works.

  5. Set detacher_tid between init and setup to avoid races like in
    DR's handler at the end of init that were under safe_read magic unstable during detach from threaded app. #3535

  6. Handle a race where the init thread has set the global try_except,
    causing master_signal_handler_C to think an app thread's signal is
    DR's. We add a global_try_tid and check the thread id to solve this.

Augments api.detach_signal with signals sent during init.

Also tested on a large proprietary application.

Issue: #1921, #3535

Puts in place 6 fixes for handling signals during DR initialization,
typically in a start/stop setup where other threads are alive.

1) Copy the app handler at init time for delivering native signals
during init.

2) Reorder signal_arch_init(), which obtains the signal frame size, to
run before DR installs its handler.

3) Obtains the app handler before installing DR's handler, eliminating
a (narrow) race window.

4) Until DR starts executing the app, continues delivering native
signals and using the globally recorded app handler, to match how DR
init works.

5) Set detacher_tid between init and setup to avoid races like in
DR's handler at the end of init that were under #3535

6) Handle a race where the init thread has set the global try_except,
causing master_signal_handler_C to think an app thread's signal is
DR's.  We add a global_try_tid and check the thread id to solve this.

Augments api.detach_signal with signals sent during init.

Also tested on a large proprietary application.

Issue: #1921, #3535
@derekbruening
Copy link
Contributor Author

Hmm, detach_signal failed on 32-bit x86 with no output. But locally 32-bit x86 succeeds 100x in a row.

@derekbruening
Copy link
Contributor Author

Hmm, detach_signal failed on 32-bit x86 with no output. But locally 32-bit x86 succeeds 100x in a row.

Finally reproduced it in an Ubuntu 16.04.7 LTS VM, and after a lot of debugging (very confusing as to what happened initially), finally figured it out: it's the vsyscall hook not handling native threads in general. So it is not a regression of any kind: it's just that my expanded test here hits that always-broken case. I filed #4664 for that and for now in this PR I'm going to make the test 64-bit only.

@derekbruening derekbruening merged commit 568d7a6 into master Jan 9, 2021
@derekbruening derekbruening deleted the i1921-init-time-signals branch January 9, 2021 05:24
derekbruening added a commit that referenced this pull request Jan 11, 2021
…has-dcontext-but-during-init case from PR #4662 (issue #1921) we need to use the store app sigstack and not query the kernel and find DR's
derekbruening added a commit that referenced this pull request Jan 11, 2021
A fault in the executable that has a statically-linked client and DR
is currently reported as a client crash, even though we have no way to
distinguish client code from app code in such a situation.  We change
that here to invoke execute_native_handler(), which solves problems
where the application has a fault handler in place and a tool or other
mechanism that generates faults during DR init or other points.

Updates the has-dcontext-but-during-init case from PR #4662 (issue #1921)
to use the stored app sigstack and not query the kernel and find DR's sigstack.

The api.static_crash exercises this path and we updated its template here,
but it uses -unsafe_crash_process to trigger it.
I tried to reproduce an incoming signal trigger by making a static version of
api.detach_signal but it requires a fault-generating allocator
replacement (hit by droption and other places that use the system
allocator) or something similar, which is non-trivial to replicate in
a small test.  I did test on the original proprietary application.

Fixes #4640
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants