Handling dead workers in future.apply #732
I want to parallelize a model-fitting process across many dozens of cores/workers, but the fitting code reliably, yet randomly, crashes the R session. I was unable to locate the cause of the crash, as it is not my own code and the crash is not systematic: a participant whose fit crashes in one attempt is modeled perfectly fine in the next. I therefore want to work around the issue: let a worker die, resurrect the dead worker, and then run the same participant's data on the same worker node again. However, as of now, it seems that there is no option to do so, as …

Or so it seems. I have noticed that after such an error, the function seems to have stopped running and my R console is free again, so I can type commands and work normally, yet logs from my fitting function continue to be written, although with decreasing frequency, until at some point that stops as well. It seems that … Does this behaviour mean that I could use the … Catching the error of …

My current code looks something like this:

future::plan(future::multisession, workers = 4)
future.apply::future_lapply(unique(data$participant), function(x) myfitfunc(data %>% dplyr::filter(participant == x), additional_args = 'many'))

Thanks for any help
Replies: 1 comment 3 replies
Only got a few minutes, but future.callr::callr works similarly to multisession, except that a fresh R process is spun up for each future. That means that, regardless of whether the future finished successfully or not, there is no R process left behind. That'll take you the first step. You still have to handle the FutureError errors thrown by each future that fails because its R process crashed. Such errors are considered so severe that they are not handled by higher-level APIs, e.g. future.apply, furrr, and doFuture. Instead, you need to roll your own, e.g.

f <- future(<expr>)
v <- tryCatch(value(f), FutureError = identity)
if (inherits(v, "FutureError")) {
  <do something>
}
...

FWIW, the mult…
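To make the "roll your own" suggestion in the reply a bit more concrete, here is a minimal sketch of a retry loop built around the same `tryCatch(value(f), FutureError = identity)` pattern. The helper name `future_lapply_retry` and its `max_tries` argument are hypothetical (they are not part of future, future.apply, or future.callr), and the sketch assumes a crashed worker surfaces as a `FutureError` when `value()` is called, as described in the reply.

```r
library(future)

# Hypothetical helper, not part of any package: evaluate fun(x) for each
# element of xs in its own future, and re-launch only those elements whose
# future signalled a FutureError (e.g. because its R process died).
# Ordinary errors raised inside fun() are NOT retried; they propagate as usual.
future_lapply_retry <- function(xs, fun, max_tries = 3L) {
  results <- vector("list", length(xs))
  pending <- seq_along(xs)   # indices that still need a result

  for (attempt in seq_len(max_tries)) {
    if (length(pending) == 0L) break

    # Launch one future per pending element; seed = TRUE requests
    # parallel-safe random numbers in case fun() uses the RNG.
    fs <- lapply(xs[pending], function(x) future(fun(x), seed = TRUE))

    # Collect the values; a FutureError is returned as an object
    # instead of being thrown.
    vs <- lapply(fs, function(f) tryCatch(value(f), FutureError = identity))

    failed <- vapply(vs, inherits, logical(1), what = "FutureError")
    results[pending[!failed]] <- vs[!failed]
    pending <- pending[failed]
  }

  if (length(pending) > 0L) {
    warning("Elements still failing after ", max_tries, " tries: ",
            paste(pending, collapse = ", "))
  }
  results
}
```

With the backend from the reply selected via `plan(future.callr::callr, workers = 4)`, a call along the lines of `future_lapply_retry(unique(data$participant), function(x) myfitfunc(...))` would then re-run only the participants whose R process crashed. Note that each retry runs in a fresh process rather than on "the same worker", and elements that keep failing are left as `NULL` in the result.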