finer control over future_lapply() #60

Open
3 tasks
MLopez-Ibanez opened this issue Jun 24, 2020 · 1 comment


@MLopez-Ibanez

I'd like to implement the following using futures, but it doesn't seem possible yet?

  • Apply a function over a list of objects and get a list of futures. The tasks will start running immediately up to the number of workers and, if there are more tasks than workers, the remainder are queued for running using load-balancing.
  • Be able to iterate over the list of futures and check if the future is resolved, running, or queued.
  • Be able to cancel futures that are running or queued. When a future is cancelled, the next one in the queue starts executing.
@HenrikBengtsson
Collaborator

Apply a function over a list of objects and get a list of futures. ...

This is by design. The future.apply API mimics the base R "apply" API as far as possible - neither more nor less than that. So from the "outside", the only difference the developer sees is that the function names start with a future_ prefix. This way there are no surprises about what the future.apply package is meant to do.
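To illustrate the "same API, different prefix" point, here is a minimal sketch. It assumes the future.apply package is installed; the input X and the squaring function are made up for illustration, and the sequential backend is used only to keep the example simple.

```r
library(future.apply)  # also attaches the future package

# Assumption: sequential backend, just so the example runs anywhere.
plan(sequential)

X <- 1:3  # hypothetical input

# Base R version
y_base <- lapply(X, FUN = function(x) x^2)

# future.apply version: same arguments, same kind of return value (a list of
# values, not a list of futures)
y_future <- future_lapply(X, FUN = function(x) x^2)
```

Both calls return a plain list of values, which is why there is no handle left over for inspecting or cancelling individual tasks.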

Now, I do mention in the README under 'Roadmap' that:

  1. Consider additional future_*apply() functions and features that fit in this package but don't necessarily have a corresponding function in base R. Examples of this may be "apply" functions that return futures rather than values, mechanisms for benchmarking, and richer control over load balancing.

This is also touched upon in Issue #32 and Issue #44, and possibly elsewhere too. However, it's far from obvious what such an API should look like and what it should support or not. It might also be better suited for another package. There's a risk in opening up the current API with features that don't exist in base R, e.g. it might be confusing and the existing API might be used in the wrong way. I already see this with just future()/value() and %<-%, where people attempt to do y %<-% future(...) and end up in a trial'n'error mess.
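For reference, the two styles that tend to get mixed up look roughly like this. This is only a sketch with a toy expression; the variable names are made up, and the sequential backend is an assumption to keep it simple.

```r
library(future)

# Assumption: sequential backend, just so the example runs anywhere.
plan(sequential)

# Explicit style: future() returns a Future object, value() retrieves the result.
f <- future(sqrt(16))
y1 <- value(f)

# Implicit style: %<-% creates the future and the result is available as 'y2'.
y2 %<-% sqrt(16)

# Mixing the two, as in 'y %<-% future(...)', wraps one future in another, so
# (as far as I understand it) 'y' ends up holding a Future object, not the value.
```

Picking one style and sticking to it avoids the confusion; the explicit style is the one that gives you a Future object you can pass to resolved().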

You can always do:

fs <- lapply(X, FUN = function(x) future({
  ...
}))

to create your own futures. This wouldn't give you chunking ("load balancing") - you'd get one future per element in X. You could hack together some approach where you use chunks <- future_lapply(seq_along(X), FUN = function(idxs) { ... }) to figure out what the chunks are and what .Random.seed each element should have, but that's rather tedious.
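A slightly fuller sketch of the one-future-per-element approach, including a non-blocking status check with resolved(), might look like the following. The multisession backend with two workers, the input X, and the slow_square() helper are all assumptions made for illustration. Note that this only distinguishes "resolved" from "not resolved"; it does not tell you whether an unresolved future is running or queued.

```r
library(future)

# Assumption: a local multisession backend with two workers.
plan(multisession, workers = 2)

X <- 1:4  # hypothetical input

# slow_square() is a hypothetical stand-in for the real per-element task.
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# One future per element, no chunking. With this kind of backend, future()
# blocks while all workers are busy, so the loop itself waits once more than
# two futures are in flight.
fs <- lapply(X, FUN = function(x) future(slow_square(x)))

# Non-blocking check of which futures have finished so far.
done <- vapply(fs, FUN = resolved, FUN.VALUE = logical(1))

# value() blocks until the future is resolved and then returns its result.
results <- lapply(fs, FUN = value)

plan(sequential)  # shut down the background workers
```

Because creating a future blocks when no worker is free, this is not a true queue: the "remainder are queued" behaviour asked for in the original post would need something on top of this.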

Building your own map-reduce functions for futures will be much easier once the future.chunks package is available. This is mentioned in Issue #59. But it'll be a while before I get some solid time to work on that.

Be able to cancel futures ...

Termination of futures is currently not supported by the Future API. This is something that needs to be implemented in the future package before anything can be done higher up. Getting a consistent API for terminating futures is not easy because it depends on the backend used. Such a feature will most likely have to be optional, i.e. it might or might not work depending on backend and context. This further complicates how it can be used in cases like the one you propose. See futureverse/future#93 for more details.
