[Merged by Bors] - Add more unix signal handlers #2486

mstallmo · 2021-07-31T21:27:04Z

Issue Addressed

Resolves #2114

Swapped out the ctrlc crate for tokio signals to register handlers for SIGPIPE and SIGHUP along with SIGTERM and SIGINT.

Proposed Changes

- Swap out the ctrlc crate for tokio signals for Unix signal handling
- Register handlers for SIGPIPE and SIGHUP that trigger the same shutdown procedure as SIGTERM and SIGINT

Additional Info

I tested these changes against the examples in the original issue and noticed some interesting behavior on my machine. When running `lighthouse bn --network pyrmont |& tee -a pyrmont_bn.log` or `lighthouse bn --network pyrmont 2>&1 | tee -a pyrmont_bn.log`, none of the above signals are sent to the lighthouse program in a way I was able to observe.

The only time it seems that the signal gets sent to the lighthouse program is if there is no redirection of stderr to stdout. I'm not as familiar with the details of how unix signals work in linux with a redirect like that so I'm not sure if this is a bug in the program or expected behavior.

Signals are correctly received without the redirection and if the above signals are sent directly to the program with something like `kill`.
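The direct-delivery case can be reproduced outside Lighthouse with a small shell sketch. It is only an analogy for the behaviour described above, not the tokio code in this PR: `send_and_report` and `shutdown` are hypothetical names, and a child `bash` stands in for the node. Each trapped signal funnels into one shutdown routine, and the child then signals itself much as `kill` would from another terminal.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a node process; not Lighthouse code.
# Usage: send_and_report SIGNAL_NAME   (e.g. TERM, HUP)
send_and_report() {
  bash -c '
    # One shutdown routine shared by every handled signal.
    shutdown() { echo "graceful shutdown on SIG$1"; exit 0; }
    for s in INT TERM HUP PIPE; do
      trap "shutdown $s" "$s"
    done
    # Deliver the signal directly to this child shell, as kill would.
    kill -s "$1" $$
    # Only reached if the trap did not fire.
    echo "signal was not handled"
    exit 1
  ' _ "$1"
}
```

Assuming the signal is not already ignored by the calling environment, `send_and_report HUP` should print `graceful shutdown on SIGHUP` and return 0.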

CLAassistant · 2021-07-31T21:27:09Z

All committers have signed the CLA.

mstallmo · 2021-08-03T06:34:52Z

Pushed a couple of updates based on the CI failures from earlier. The current version of the PR should pass CI now.

lighthouse/environment/src/lib.rs

Added SIGHUP, SIGPIPE, and SIGTERM to the signal handlers that trigger Lighthouse to shut down gracefully

paulhauner · 2021-08-20T01:36:37Z

Thanks @mstallmo, this is looking great! We'll get to a review soon. New PRs are blocked until we release v1.5.0 (probably next week) and we've been busy with the Pyrmont fork.

I just wanted to let you know that it's on our radar and not going stale :)

michaelsproul

Great work!

I spent some time digging into why the pipe commands from the original issue don't seem to work and I think they are actually working correctly, but we just can't see the output.

If tee receives the SIGINT from the shell, then it immediately shuts down and closes its stdin file descriptor which lighthouse was writing to. That means no more terminal output after tee exits, because both stderr and stdout from Lighthouse are redirected to tee. Lighthouse also receives the SIGINT that knocked out tee, and runs its own signal handler to begin shutting down. Anything that Lighthouse logs from this point on is lost, because it has nowhere to write it to. It exits with a 0 exit code, indicating that the process exited successfully (you need to echo ${PIPESTATUS[0]} in bash to see it). Further, if you modify Lighthouse's signal handler so that it panics, then Lighthouse exits with a 101 exit code (the standard panic exit code), demonstrating that the signal handler does indeed still run. Even better, if you run tee with -i to stop it from exiting then it will record Lighthouse's shutdown messages before exiting.
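The exit-code check mentioned above can be sketched in bash. `fake_node` and `pipeline_status` are hypothetical helpers standing in for Lighthouse; the point is that `$?` after a pipeline reports the last command (`tee`), so the first command's status has to be read from `PIPESTATUS[0]` straight after the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for lighthouse: logs one line, then exits with
# the status given as $1 (e.g. 0 for a clean exit, 101 for a Rust panic).
fake_node() {
  echo "shutting down" >&2
  return "$1"
}

# Usage: pipeline_status EXIT_CODE LOG_FILE
# Runs the stand-in through the same kind of pipeline as the issue and
# prints the exit status of the first command in that pipeline.
pipeline_status() {
  fake_node "$1" 2>&1 | tee -a "$2" >/dev/null
  # PIPESTATUS must be read immediately; any later command overwrites it.
  echo "${PIPESTATUS[0]}"
}
```

For example, `pipeline_status 101 node.log` reports the stand-in's status rather than `tee`'s, while the logged line still ends up in `node.log`.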

I stumbled around testing things for a while before realising this, and I found this article incredibly helpful: https://www.cons.org/cracauer/sigint.html

Thanks again!

bors r+

bors · 2021-08-30T06:41:24Z

Pull request successfully merged into unstable.

Build succeeded:

## Proposed Changes

Remove the SIGPIPE handler added in #2486.

We saw some of the testnet nodes running under `systemd` being stopped due to `journald` restarts. The systemd docs state:

> If systemd-journald.service is stopped, the stream connections associated with all services are terminated. Further writes to those streams by the service will result in EPIPE errors. In order to react gracefully in this case it is recommended that programs logging to standard output/error ignore such errors. If the SIGPIPE UNIX signal handler is not blocked or turned off, such write attempts will also result in such process signals being generated, see signal(7).

From https://www.freedesktop.org/software/systemd/man/systemd-journald.service.html

## Additional Info

It turned out that the issue described in #2114 was due to `tee`'s behaviour rather than Lighthouse's, so the SIGPIPE handler isn't required for any current use case.

An alternative to disabling it altogether would be to exit with a non-zero code so that systemd knows to restart the process, but it seems more desirable to tolerate journald glitches than to restart frequently.
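The difference between the signal and the EPIPE error described in the systemd docs can be seen with a short bash sketch. `writer_status` is a hypothetical helper, and `yes | head` merely stands in for a service whose log reader goes away; it is not how Lighthouse or journald are wired up. With SIGPIPE left at its default the writer is killed by the signal, whereas ignoring it turns the failed write into an ordinary error the writer can handle.

```shell
#!/usr/bin/env bash
# Usage: writer_status default|ignore
# Prints the exit status of a writer whose reader closes the pipe early.
writer_status() {
  (
    if [ "$1" = ignore ]; then
      # An ignored signal is inherited by child processes, so writes to
      # the closed pipe fail with EPIPE instead of raising SIGPIPE.
      trap '' PIPE
    fi
    yes 2>/dev/null | head -n 1 >/dev/null
    echo "${PIPESTATUS[0]}"
  )
}
```

On a typical Linux system `writer_status default` reports 141 (128 plus signal number 13), meaning the writer died from SIGPIPE, while `writer_status ignore` reports the writer's own non-zero error status instead.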

paulhauner added ready-for-review The code is ready for review v1.5.1 To be included in the v1.5.1 release labels Aug 2, 2021

mstallmo force-pushed the fix_graceful_shutdown branch from 4b5a93a to 9de3875 Compare August 3, 2021 05:21

film42 reviewed Aug 9, 2021

lighthouse/environment/src/lib.rs (outdated)

mstallmo added 5 commits August 17, 2021 22:20

Add more unix signal handlers

47d01a7

Added SIGHUP, SIGPIPE, and SIGTERM to the signal handlers that trigger Lighthouse to shut down gracefully

Fix clippy lint

299abd2

Add back non-unix shutdown signal handling

5d75c2b

Fix unused deps and platform specific dependencies

b619735

Relax target cfg from linux to unix

3fa865b

mstallmo force-pushed the fix_graceful_shutdown branch from d357df7 to 3fa865b Compare August 18, 2021 05:40

michaelsproul added v1.5.2 The release after v1.5.1 v2.0.0 Altair on mainnet release (v2.0.0) and removed v1.5.1 To be included in the v1.5.1 release v1.5.2 The release after v1.5.1 labels Aug 26, 2021

michaelsproul self-requested a review August 30, 2021 00:48

michaelsproul approved these changes Aug 30, 2021

michaelsproul added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Aug 30, 2021

michaelsproul mentioned this pull request Aug 30, 2021

Shutdown gracefully on SIGPIPE, SIGHUP #2114

Closed

bors bot changed the title ~~Add more unix signal handlers~~ [Merged by Bors] - Add more unix signal handlers Aug 30, 2021

bors bot closed this Aug 30, 2021

michaelsproul mentioned this pull request Sep 2, 2021

[Merged by Bors] - Remove SIGPIPE handler #2558

Closed

mstallmo deleted the fix_graceful_shutdown branch September 11, 2021 04:07
