
Parallel Processing vignette #21

Open · topepo opened this issue Jul 31, 2020 · 4 comments
@topepo (Collaborator) commented Jul 31, 2020

I would suggest recommending the standard foreach parallelism when using tune, and the model-specific threading methods when just parsnip is being used to fit.

Generally, parallelizing across the resamples is faster than parallelizing the individual models (see the xgboost example). We always try to parallelize the longest-running loop.
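
A minimal sketch of the two approaches for the vignette might look like this (xgboost is shown since it has a documented nthread argument; the worker and thread counts are just examples):

# (a) Tuning with tune: register a foreach backend so that tune_grid()
#     parallelizes across the resamples.
library(doParallel)
cl <- parallel::makePSOCKcluster(4)
registerDoParallel(cl)

# (b) Fitting with parsnip alone: use the engine's own threading.
library(parsnip)
spec <- boost_tree(trees = 500) %>%
  set_mode("regression") %>%
  set_engine("xgboost", nthread = 4)

With a backend registered, tune_grid() picks it up automatically; without one, setting nthread in set_engine() keeps a single fit() fast.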

@Athospd (Member) commented Aug 2, 2020

Hey Max, glad to see you here. I was writing about forking and then decided to run a benchmark to enrich the vignette. I expected to corroborate your findings, but I ended up with counter-intuitive results.

tl;dr: neither pure forking nor pure threading was best; 2 threads with 4 workers was the fastest setup.

see here https://curso-r.github.io/treesnip/articles/threading-forking-benchmark.html
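
For reference, a sketch of the kind of mixed setup I mean (assuming a Unix machine where forking is available, and an engine such as xgboost that accepts a thread count through set_engine()):

library(doParallel)
registerDoParallel(cores = 4)  # 4 forked workers for the resample loop

library(parsnip)
spec <- boost_tree(trees = 500) %>%
  set_mode("regression") %>%
  set_engine("xgboost", nthread = 2)  # 2 threads inside each worker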

Do you think it is worth considering these combinations, or is it better to stick with the simple rule of thumb (tune -> forking; fit -> threading)?

@topepo (Collaborator, Author) commented Aug 3, 2020

That's really interesting! TBH, I'm surprised that a combination like that works at all. Can you make a plot with the speed-up (sequential time / parallel time) on the x-axis?
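
Something along these lines, assuming the benchmark results live in a data frame with setup and time_sec columns (names made up here), with exactly one sequential row:

library(dplyr)
library(ggplot2)

seq_time <- results$time_sec[results$setup == "sequential"]

results %>%
  mutate(speed_up = seq_time / time_sec) %>%   # speed-up = seq time / par time
  ggplot(aes(x = speed_up, y = setup)) +
  geom_col()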

I might run some of these locally this weekend, too.

@Athospd (Member) commented Aug 9, 2020

@topepo I'm running more benchmarks and I think I've spotted a potential issue you might want to check yourself to confirm: when I set vfold_cv(v = 3), only 3 workers were used, even with tune_grid() set to fit lots of different models. When I set vfold_cv(v = 8), I watched all 8 of my cores running at 100%. My hypothesis is that tune_grid() is forking only over the folds loop.
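
In case it helps, a sketch of the kind of setup that shows the behaviour (the data, workflow, and grid size are placeholders):

library(tidymodels)

folds <- vfold_cv(mtcars, v = 3)              # only 3 resamples
grid  <- grid_regular(trees(), levels = 20)   # but 20 candidate models

# tune_grid(wf, resamples = folds, grid = grid)
# Observed: at most 3 workers busy, even though 3 x 20 fits are queued,
# which suggests the parallel loop is over the folds only.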

@gregleleu commented

Hi,
I'm using doFuture/doRNG parallel processing for my tidymodels workflows (for tuning) with other engines (apparently I need to load doFuture before using doRNG, but I'm still trying to confirm that):

library(doFuture)
registerDoFuture()   # use futures as the foreach backend
plan(multisession)   # background R sessions (no forking needed)

doRNG::registerDoRNG()  # reproducible parallel random numbers

It fails when using treesnip with catboost; I get an error: Error in pkg_list[[1]]: subscript out of bounds.
This is because catboost and treesnip are not loaded on the workers (I can't fork because of RStudio, and there is a consensus that you shouldn't fork from RStudio).
It works when I "register" the dependencies manually (see tidymodels/tune#205):

# set_dependency() is from parsnip; it tells tune which packages to load
# on the workers for this model/engine combination.
parsnip::set_dependency("boost_tree", eng = "catboost", "catboost")
parsnip::set_dependency("boost_tree", eng = "catboost", "treesnip")

It could be useful to document that somewhere, or maybe there is a place in the package where the set_dependency() calls could be included.
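
One possible place, sketched here on the assumption that treesnip registers its models at load time, would be the package's .onLoad() hook:

# Hypothetical: run the registration when treesnip is loaded, so freshly
# spawned workers pick up the dependencies automatically.
.onLoad <- function(libname, pkgname) {
  parsnip::set_dependency("boost_tree", eng = "catboost", "catboost")
  parsnip::set_dependency("boost_tree", eng = "catboost", "treesnip")
}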
