Parallel Processing vignette #21
Comments
Hey, Max, glad to see you here. I was writing about forking and then decided to run a benchmark to enrich the vignette. tl;dr: neither pure forking nor pure threading was the best; 2 threads with 4 workers was the fastest setup. See here: https://curso-r.github.io/treesnip/articles/threading-forking-benchmark.html. Do you think it is worth considering these combinations, or is it better to stick with the simple rule of thumb (tune -> forking; fit -> threading)?
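For illustration, the mixed "2 threads with 4 workers" setup might look like the sketch below. This is a minimal sketch, not the benchmark's exact code; the doParallel backend, the fork cluster, and the specific worker/thread counts are assumptions.

```r
# Minimal sketch of a mixed setup: 4 forked workers across the resamples,
# 2 threads inside each xgboost fit. Counts and backend are assumptions.
library(tidymodels)
library(doParallel)

# Forked workers handle the resample loop in tune_grid() (fork is Unix-only)
cl <- parallel::makeForkCluster(4)
registerDoParallel(cl)

# Each individual model fit uses 2 threads internally
spec <- boost_tree(trees = tune(), tree_depth = tune()) %>%
  set_engine("xgboost", nthread = 2) %>%
  set_mode("regression")

folds <- vfold_cv(mtcars, v = 8)
res <- tune_grid(spec, mpg ~ ., resamples = folds, grid = 10)

parallel::stopCluster(cl)
```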
That's really interesting! TBH I'm surprised that a combination like that works at all. Can you make a plot with the x-axis as the speed-up (sequential time / parallel time)? I might run some of these locally this weekend too.
@topepo I'm running more benchmarks here and I think I spotted a potential issue that you might want to check yourself to confirm: when I set vfold_cv(v = 3), only 3 workers were used, even with tune_grid() set to fit lots of different models. And when I set vfold_cv(v = 8), I watched all 8 of my cores hit 100%. My hypothesis is that tune_grid() is forking only over the folds loop.
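A sketch of that observation, reusing the spec from the earlier snippet. Whether control_grid(parallel_over = ...) was available at the time of this thread is an assumption; it exists in more recent versions of tune.

```r
# With v = 3, the outer foreach loop has only 3 iterations, so only
# 3 workers stay busy regardless of how large the tuning grid is.
folds_3 <- vfold_cv(mtcars, v = 3)
res_3   <- tune_grid(spec, mpg ~ ., resamples = folds_3, grid = 20)

# Recent versions of tune can parallelize over folds x candidate models:
ctrl  <- control_grid(parallel_over = "everything")
res_a <- tune_grid(spec, mpg ~ ., resamples = folds_3, grid = 20,
                   control = ctrl)
```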
Hi,
It fails when using treesnip with catboost. I get an error:
It could be useful to either document that somewhere for people, or maybe there is a place where you can include the set_dependency() calls.
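Those calls might look like the sketch below, assuming parsnip's set_dependency() and the engine name treesnip registers for catboost; the exact arguments are a guess.

```r
# Hypothetical sketch: declare that the catboost engine needs these packages,
# so parallel workers load them before fitting. Engine/package names are
# assumptions based on treesnip's setup.
library(parsnip)
set_dependency("boost_tree", eng = "catboost", pkg = "catboost")
set_dependency("boost_tree", eng = "catboost", pkg = "treesnip")
```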
I would suggest that, when using `tune`, the standard `foreach` parallelism be suggested, and that the model-specific threading methods be used if just `parsnip` is being used to fit. Generally, parallelizing the resamples is faster than parallelizing the individual models (see the `xgboost` example). We always try to parallelize the longest-running loop.