Add multithreading to LHCoptim #17
Ok, so this seems to only work on Julia > v1.5 due to changes in threading.
Tests are now passing for all versions (apart from AppVeyor, which seems to be broken).
Hej! Thanks, this looks great. I tested the new implementation on 1, 2, 4, 6, ..., 18, 20 threads on a machine with 20 physical threads and got these results. I also tried it on a 10x larger plan. I think we can drop support for Julia 1.0 in favour of making it easier to maintain in the future, what do you think? Thanks a lot for this PR, great performance increase, especially on large plans!
Great to see that it scales reasonably well, at least for larger plans! It might be that other parts of the optimization loop are the bottleneck for high thread counts on large plans, and I wouldn't be surprised if it's possible to squeeze out a bit more performance by threading these parts as well. On the other hand, I guess that doing so would worsen the performance on smaller plans as we introduce more threading overhead, so there is probably a trade-off for which type of plan to prioritize. But that should probably be the scope of a possible future PR :) I don't have any strong opinions on whether to drop support for 1.0 or not. I guess it would make sense to keep it for now and drop it when the next LTS is out?
We could do some further benchmarking with the threading in place and see what the next low-hanging fruit is. I will keep 1.0 support for now and merge it in. Thanks!
Hej!
I had a stab at adding multithreading to `LHCoptim`. Profiling showed that most of the time was spent in `_AudzeEglaisDist`, followed by `_fixedcross!`, so these are the two portions of `LHCoptim` that are threaded.
It's worth noting that a single global RNG is still used (instead of each thread having its own). This does not ensure consistent results between different numbers of threads, nor repeatable results on more than one thread.
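For reference, the alternative of giving each unit of work its own RNG could look roughly like the sketch below. It is not code from this PR: `chunked_random_sums`, its arguments and the `seed + c` seeding scheme are all hypothetical, and the loop body is a stand-in for real work. The idea is to tie each RNG to a fixed chunk of work rather than to a thread, so that the output should not depend on how many threads happen to run.

```julia
using Random

# Hypothetical helper (not part of the package): sum `n` random numbers,
# split into `nchunks` fixed chunks, each with its own seeded RNG.
function chunked_random_sums(n::Int, nchunks::Int; seed::Int = 1234)
    # One RNG per chunk (not per thread). Seeding with `seed + c` is an
    # assumption made for brevity, not a recommendation.
    rngs = [MersenneTwister(seed + c) for c in 1:nchunks]
    sums = zeros(nchunks)
    Threads.@threads for c in 1:nchunks
        rng = rngs[c]
        lo = div((c - 1) * n, nchunks) + 1   # first index of chunk c
        hi = div(c * n, nchunks)             # last index of chunk c
        acc = 0.0
        for k in lo:hi
            acc += rand(rng)
        end
        sums[c] = acc                        # each chunk writes its own slot
    end
    return sum(sums)                         # fixed summation order
end
```

Because the chunk boundaries and seeds are fixed, two calls with the same arguments should give the same value regardless of `Threads.nthreads()`. The cost is that the number of chunks, not the number of threads, then determines the random stream.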
To make threading opt-in even if multiple threads are available, there is a macro called `@maybe_threaded`, which unfortunately uses a non-exported function from `Base.Threads`, but this was the only way I could make it work since `Threads.@threads` doesn't compose well within another macro.
Performance-wise, I have only been able to test on my 8-year-old laptop with two glorious cores, but the results look quite promising.
It might be a good idea to benchmark on a beefier machine too before merging :)
/Emil
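To illustrate the opt-in threading idea from the description, here is a minimal sketch of a macro that chooses between a threaded and a plain loop at run time. It is not the `@maybe_threaded` from this PR (which relies on a non-exported function from `Base.Threads`), and its two-argument signature is an assumption. The sketch escapes the whole generated expression, so `Threads.@threads` receives the bare `for` loop instead of an escaped one. Whether this fits the constraints the PR ran into has not been checked here. `squares!` is a hypothetical example function.

```julia
# Hypothetical stand-in, NOT the macro from this PR.
# `flag` is a Bool expression, `loop` must be a `for` loop.
macro maybe_threaded(flag, loop)
    # Escaping the whole expression means @threads sees the plain loop,
    # and all names resolve in the caller's scope.
    esc(quote
        if $flag
            Base.Threads.@threads $loop
        else
            $loop
        end
    end)
end

# Hypothetical usage: fill `out` with squares, threaded only on request.
function squares!(out::Vector{Float64}, threading::Bool)
    @maybe_threaded threading for i in eachindex(out)
        out[i] = i^2
    end
    return out
end
```

Both `squares!(zeros(4), true)` and `squares!(zeros(4), false)` run the same loop body; only the scheduling differs. One drawback of this approach is that the loop body is duplicated in the generated code.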