From 36db64fc43ab457ce76935fff32707fa6a462bcd Mon Sep 17 00:00:00 2001
From: Jim Garlick
Date: Sun, 1 Dec 2019 13:27:42 -0800
Subject: [PATCH] libflux/ev_flux: handle EV_ERROR on io watcher

Problem: ev_flux ignores events on its internal io watcher,
so when libev raises EV_ERROR on it because something went
wrong internally (like #2554), nothing happens.

The internal io watcher is only used for its side effect of
unblocking the reactor when pollevents edge triggers from
"no events" to "some events".  The prep/check/idle watchers
do the heavy lifting.

Add a check for pending EV_ERROR events on the io watcher
in the check callback.  If found, notify the user via the
ev_flux watcher callback.

Note: in libev 4.25, a failure of epoll_ctl() would internally
call fd_kill(), which would call ev_feed_event() to raise
EV_ERROR on the watcher.  With this patch, EV_ERROR is caught;
however, only after a one minute delay, which has not been
explained.  libev 4.27 turns that error into an assertion.
Since other error paths in libev call fd_kill(), this change
may be useful for other as yet unseen problems.
---
 src/common/libflux/ev_flux.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/common/libflux/ev_flux.c b/src/common/libflux/ev_flux.c
index a679a66d4097..48bc5cd3968a 100644
--- a/src/common/libflux/ev_flux.c
+++ b/src/common/libflux/ev_flux.c
@@ -49,6 +49,12 @@ static void check_cb (struct ev_loop *loop, ev_check *w, int revents)
 {
     struct ev_flux *fw = (struct ev_flux *)((char *)w
                             - offsetof (struct ev_flux, check_w));
+
+    if (ev_is_pending (&fw->io_w)
+            && ev_clear_pending (loop, &fw->io_w) & EV_ERROR) {
+        fw->cb (loop, fw, EV_ERROR);
+        return;
+    }
     int events = get_pollevents (fw->h);
 
     ev_io_stop (loop, &fw->io_w);
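
The control flow added to check_cb() can be tried out without libev using the minimal sketch below. Everything prefixed mock_ or MOCK_, plus record_cb, last_revents, and polled, is a hypothetical stand-in invented for illustration. None of it is the libev or flux-core API, and the flag values are arbitrary. The sketch assumes only what the patch shows: the enclosing struct is recovered from the embedded check watcher with offsetof, a pending event mask on the io watcher is cleared and tested for the error bit, and on error the user callback runs and the normal path is skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for libev flags; values are arbitrary. */
enum { MOCK_READ = 0x01, MOCK_ERROR = 0x80 };

/* Hypothetical stand-in for a libev watcher with a queued event. */
struct mock_watcher {
    int pending;    /* nonzero while an event is queued */
    int revents;    /* mask of the queued event */
};

struct mock_flux;
typedef void (*mock_flux_cb) (struct mock_flux *fw, int revents);

/* Composite watcher, shaped loosely like struct ev_flux. */
struct mock_flux {
    struct mock_watcher io_w;
    struct mock_watcher check_w;
    mock_flux_cb cb;
    int last_revents;   /* set by record_cb, for inspection */
    int polled;         /* counts runs of the normal path */
};

/* Return the queued mask (0 if none) and clear it. */
static int mock_clear_pending (struct mock_watcher *w)
{
    int revents = w->pending ? w->revents : 0;
    w->pending = 0;
    w->revents = 0;
    return revents;
}

/* Example user callback: remember what was delivered. */
static void record_cb (struct mock_flux *fw, int revents)
{
    fw->last_revents = revents;
}

/* Same shape as the patched check callback. */
static void mock_check_cb (struct mock_watcher *w)
{
    /* Recover the enclosing struct from its embedded member. */
    struct mock_flux *fw = (struct mock_flux *)((char *)w
                            - offsetof (struct mock_flux, check_w));
    assert (fw->cb != NULL);

    /* '&' binds tighter than '&&'; the parentheses only make
     * that grouping explicit. */
    if (fw->io_w.pending
            && (mock_clear_pending (&fw->io_w) & MOCK_ERROR)) {
        fw->cb (fw, MOCK_ERROR);
        return;
    }
    fw->polled++;   /* normal path, e.g. polling for events */
}
```

In this sketch, a pending event without the error bit is also cleared by the test and then falls through to the normal path, which matches how the condition in the patch reads.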