-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Add more unix signal handlers #2486
Conversation
4b5a93a
to
9de3875
Compare
Pushed a couple of updates based on the CI failures from earlier. The current version of the PR should pass CI now. |
Added SIGHUP, SIGPIPE, and SIGTERM to the signal handlers that trigger lightouse to shutdown gracefully
d357df7
to
3fa865b
Compare
Thanks @mstallmo, this is looking great! We'll get to a review soon. New PRs are blocked until we release v1.5.0 (probably next week) and we've been busy with the Pyrmont fork. I just wanted to let you know that it's on our radar and not going stale :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
I spent some time digging into why the pipe commands from the original issue don't seem to work and I think they are actually working correctly, but we just can't see the output.
If tee
receives the SIGINT
from the shell, then it immediately shuts down and closes its stdin
file descriptor which lighthouse
was writing to. That means no more terminal output after tee
exits, because both stderr
and stdout
from Lighthouse are redirected to tee
. Lighthouse also receives the SIGINT
that knocked out tee
, and runs its own signal handler to begin shutting down. Anything that Lighthouse logs from this point on is lost, because it has nowhere to write it to. It exits with a 0
exit code, indicating that the process exited successfully (you need to echo ${PIPESTATUS[0]}
in bash
to see it). Further, if you modify Lighthouse's signal handler so that it panics, then Lighthouse exits it with a 101
exit code (the standard panic exit code), demonstrating that the signal handler does indeed still run. Even better, if you run tee
with -i
to stop it from exiting then it will record Lighthouse's shutdown messages before exiting.
I stumbled around testing things for a while before realising this, and I found this article incredibly helpful: https://www.cons.org/cracauer/sigint.html
Thanks again!
bors r+
## Issue Addressed Resolves #2114 Swapped out the ctrlc crate for tokio signals to hook register handlers for SIGPIPE and SIGHUP along with SIGTERM and SIGINT. ## Proposed Changes - Swap out the ctrlc crate for tokio signals for unix signal handing - Register signals for SIGPIPE and SHIGUP that trigger the same shutdown procedure as SIGTERM and SIGINT ## Additional Info I tested these changes against the examples in the original issue and noticed some interesting behavior on my machine. When running `lighthouse bn --network pyrmont |& tee -a pyrmont_bn.log` or `lighthouse bn --network pyrmont 2>&1 | tee -a pyrmont_bn.log` none of the above signals are sent to the lighthouse program in a way I was able to observe. The only time it seems that the signal gets sent to the lighthouse program is if there is no redirection of stderr to stdout. I'm not as familiar with the details of how unix signals work in linux with a redirect like that so I'm not sure if this is a bug in the program or expected behavior. Signals are correctly received without the redirection and if the above signals are sent directly to the program with something like `kill`.
Pull request successfully merged into unstable. Build succeeded: |
## Proposed Changes Remove the SIGPIPE handler added in #2486. We saw some of the testnet nodes running under `systemd` being stopped due to `journald` restarts. The systemd docs state: > If systemd-journald.service is stopped, the stream connections associated with all services are terminated. Further writes to those streams by the service will result in EPIPE errors. In order to react gracefully in this case it is recommended that programs logging to standard output/error ignore such errors. If the SIGPIPE UNIX signal handler is not blocked or turned off, such write attempts will also result in such process signals being generated, see signal(7). From https://www.freedesktop.org/software/systemd/man/systemd-journald.service.html ## Additional Info It turned out that the issue described in #2114 was due to `tee`'s behaviour rather than Lighthouse's, so the SIGPIPE handler isn't required for any current use case. An alternative to disabling it all together would be to exit with a non-zero code so that systemd knows to restart the process, but it seems more desirable to tolerate journald glitches than to restart frequently.
Issue Addressed
Resolves #2114
Swapped out the ctrlc crate for tokio signals to hook register handlers for SIGPIPE and SIGHUP along with SIGTERM and SIGINT.
Proposed Changes
Additional Info
I tested these changes against the examples in the original issue and noticed some interesting behavior on my machine. When running
lighthouse bn --network pyrmont |& tee -a pyrmont_bn.log
orlighthouse bn --network pyrmont 2>&1 | tee -a pyrmont_bn.log
none of the above signals are sent to the lighthouse program in a way I was able to observe.The only time it seems that the signal gets sent to the lighthouse program is if there is no redirection of stderr to stdout. I'm not as familiar with the details of how unix signals work in linux with a redirect like that so I'm not sure if this is a bug in the program or expected behavior.
Signals are correctly received without the redirection and if the above signals are sent directly to the program with something like
kill
.