Fix SIGHUP/SIGCONT sometimes reaching the child due to tty quirks #174
👍 idea seems good. We talked through some of the scenarios offline including duplicate SIGHUP / SIGCONT signals -- the nice thing is that just eating one of these signals is fine as the kernel / user signals here are indistinguishable (kernel / user send
Should be ready for final review / merge now. Seems like the ppc64le tests are failing though, and not producing a log, so I wonder if something is wrong with the Travis ppc64le infrastructure? :( @ghatwala any idea why we'd be seeing that with the PPC tests?
otherwise looks good!
dumb-init.c (outdated)
    if (signum == SIGCHLD) {

    if (signal_temporary_ignores[signum] == 1) {
        DEBUG("Ignoring signal %d during its first receive.\n", signum);
maybe "Ignoring tty hand-off signal..." so it's more clear why this is happening here
I like this wording better, changed to that.
This is due to Yelp#175.
Fixes #136
To summarize, the issue in #136 is that when dumb-init calls TIOCNOTTY, the kernel will send a SIGHUP and SIGCONT to the dumb-init process if dumb-init is the session leader. Quoth man tty:
When this happens, there's a race with two possible outcomes:
Typically (1) happens, but some environments cause (2) to happen more frequently, probably due to differences in how the scheduler behaves.
This is a fairly non-intrusive fix: it just ignores the first SIGHUP/SIGCONT. At some point I'd still like to see if there's a better fix (can we hand the session to the child when we're the session leader rather than creating a new one?), but this should be fairly safe.
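A minimal sketch of the "ignore the first SIGHUP/SIGCONT" idea, for illustration only. The array name signal_temporary_ignores comes from the diff under review; MAXSIG, forward_signal, forwarded_count, and arm_tty_handoff_ignores are hypothetical names introduced here. It assumes handle_signal is called synchronously from a signal-wait loop rather than installed as an asynchronous handler.

```c
#include <assert.h>
#include <signal.h>

/* Assumption: all signals of interest are numbered 1..MAXSIG. */
#define MAXSIG 32

/* 1 means "swallow the next delivery of this signal". */
static int signal_temporary_ignores[MAXSIG + 1];

/* Hypothetical stand-in for forwarding to the child; it only counts calls. */
static int forwarded_count;

static void forward_signal(int signum) {
    (void)signum;
    forwarded_count++;
}

/* Call once before detaching from the tty, so the SIGHUP/SIGCONT that the
 * kernel may send during the hand-off is swallowed instead of forwarded. */
static void arm_tty_handoff_ignores(void) {
    signal_temporary_ignores[SIGHUP] = 1;
    signal_temporary_ignores[SIGCONT] = 1;
}

static void handle_signal(int signum) {
    assert(signum > 0 && signum <= MAXSIG);
    if (signal_temporary_ignores[signum] == 1) {
        /* Ignoring tty hand-off signal: clear the flag so that only the
         * first occurrence is dropped. */
        signal_temporary_ignores[signum] = 0;
        return;
    }
    forward_signal(signum);
}
```

Because a kernel-sent and a user-sent SIGHUP look the same to the process, the sketch drops whichever arrives first and forwards every later one, which matches the trade-off discussed in the review comments.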
Still working on writing a test. Unfortunately it's hard to test the actual functionality (hard to force the race without patching dumb-init's code) so it may end up just asserting debug messages :(