Shutdown gracefully on SIGPIPE, SIGHUP #2114

michaelsproul · 2020-12-22T07:18:41Z

Description

When running Lighthouse in a shell pipeline it fails to exit gracefully because of the lack of signal handler for SIGPIPE. I noticed that my Lighthouse nodes wouldn't shutdown gracefully from my runner script, and it took a bit of digging to find the cause.

This command reproduces the issue (bash assumed):

$ lighthouse bn --network pyrmont |& tee -a pyrmont_bn.log

Or equivalently:

$ lighthouse bn --network pyrmont 2>&1 | tee -a pyrmont_bn.log

Both of these commands redirect lighthouse's stderr (2) to its stdout (&1), and then send stdout to tee for writing to the terminal and a file simultaneously (a common UNIX pattern). If you hit ^C while running either of these commands Lighthouse will exit immediately, without running the shutdown handlers, running drop on the BeaconChain, or printing these messages:

Dec 22 18:09:01.939 INFO Shutting down..
Dec 22 18:09:07.463 INFO Saved beacon chain to disk              service: beacon

Version

Lighthouse v1.0.4

Steps to resolve

Currently we use the ctrlc crate to register a signal handler for SIGINT and SIGTERM. Unfortunately it seems this crate doesn't allow us to register handlers for other signals.

lighthouse/lighthouse/environment/src/lib.rs

Lines 382 to 392 in 1abc70e

    
           ctrlc::set_handler(move || { 
        
               if let Some(ctrlc_send) = ctrlc_send_c.try_borrow_mut().unwrap().take() { 
        
                   if let Err(e) = ctrlc_send.send(()) { 
        
                       error!( 
        
                           log, 
        
                           "Error sending ctrl-c message"; 
        
                           "error" => e 
        
                       ); 
        
                   } 
        
               } 
        
           })

Using another crate (or maybe tokio::signal), we should ensure that this handler runs for all of the signals we can reasonably handle. Based on the list here I think that should just be SIGINT, SIGTERM, plus:

SIGPIPE: The SIGPIPE signal is sent to a process when it attempts to write to a pipe without a process connected to the other end.
SIGHUP: The SIGHUP signal is sent to a process when its controlling terminal is closed.

The text was updated successfully, but these errors were encountered:

mstallmo · 2021-07-23T04:15:38Z

Would love to take a stab at resolving this if no one is working on it already!

michaelsproul · 2021-07-23T04:24:55Z

@mstallmo That would be awesome! There's nobody working on it currently, it's all yours!

## Issue Addressed Resolves #2114 Swapped out the ctrlc crate for tokio signals to hook register handlers for SIGPIPE and SIGHUP along with SIGTERM and SIGINT. ## Proposed Changes - Swap out the ctrlc crate for tokio signals for unix signal handing - Register signals for SIGPIPE and SHIGUP that trigger the same shutdown procedure as SIGTERM and SIGINT ## Additional Info I tested these changes against the examples in the original issue and noticed some interesting behavior on my machine. When running `lighthouse bn --network pyrmont |& tee -a pyrmont_bn.log` or `lighthouse bn --network pyrmont 2>&1 | tee -a pyrmont_bn.log` none of the above signals are sent to the lighthouse program in a way I was able to observe. The only time it seems that the signal gets sent to the lighthouse program is if there is no redirection of stderr to stdout. I'm not as familiar with the details of how unix signals work in linux with a redirect like that so I'm not sure if this is a bug in the program or expected behavior. Signals are correctly received without the redirection and if the above signals are sent directly to the program with something like `kill`.

michaelsproul · 2021-08-30T05:22:15Z

Finally got to the bottom of this here: #2486 (review)

For anyone wanting to see the shutdown output in tee, run with tee -i to stop tee from dying on interrupt

## Proposed Changes Remove the SIGPIPE handler added in #2486. We saw some of the testnet nodes running under `systemd` being stopped due to `journald` restarts. The systemd docs state: > If systemd-journald.service is stopped, the stream connections associated with all services are terminated. Further writes to those streams by the service will result in EPIPE errors. In order to react gracefully in this case it is recommended that programs logging to standard output/error ignore such errors. If the SIGPIPE UNIX signal handler is not blocked or turned off, such write attempts will also result in such process signals being generated, see signal(7). From https://www.freedesktop.org/software/systemd/man/systemd-journald.service.html ## Additional Info It turned out that the issue described in #2114 was due to `tee`'s behaviour rather than Lighthouse's, so the SIGPIPE handler isn't required for any current use case. An alternative to disabling it all together would be to exit with a non-zero code so that systemd knows to restart the process, but it seems more desirable to tolerate journald glitches than to restart frequently.

michaelsproul added UX-and-logs A1 labels Dec 22, 2020

michaelsproul added the good first issue Good for newcomers label Jul 8, 2021

mstallmo mentioned this issue Jul 31, 2021

[Merged by Bors] - Add more unix signal handlers #2486

Closed

paulhauner added the v1.5.1 To be included in the v1.5.1 relase label Aug 2, 2021

michaelsproul added v1.5.2 The release after v1.5.1 v2.0.0 Altair on mainnet release (v2.0.0) and removed v1.5.1 To be included in the v1.5.1 relase v1.5.2 The release after v1.5.1 labels Aug 26, 2021

michaelsproul closed this as completed Aug 30, 2021

michaelsproul mentioned this issue Sep 2, 2021

[Merged by Bors] - Remove SIGPIPE handler #2558

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Shutdown gracefully on SIGPIPE, SIGHUP #2114

Shutdown gracefully on SIGPIPE, SIGHUP #2114

michaelsproul commented Dec 22, 2020

mstallmo commented Jul 23, 2021

michaelsproul commented Jul 23, 2021

michaelsproul commented Aug 30, 2021

Shutdown gracefully on SIGPIPE, SIGHUP #2114

Shutdown gracefully on SIGPIPE, SIGHUP #2114

Comments

michaelsproul commented Dec 22, 2020

Description

Version

Steps to resolve

mstallmo commented Jul 23, 2021

michaelsproul commented Jul 23, 2021

michaelsproul commented Aug 30, 2021