
Add multithreading to LHCoptim #17

Merged
merged 7 commits into from
Mar 29, 2021
Conversation

eljungsk
Contributor

@eljungsk eljungsk commented Mar 14, 2021

Hej!

I had a stab at adding multithreading to LHCoptim. Profiling showed that most of the time was spent in _AudzeEglaisDist followed by _fixedcross!, so these are the two portions of LHCoptim that are threaded.

It's worth noting that a single global RNG is still used (instead of each thread having its own), which means the results are neither consistent across different thread counts nor repeatable when running on more than one thread.
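For reference, the usual way to make threaded stochastic code repeatable for a fixed thread count is to give each thread its own seeded RNG. A minimal sketch of that pattern (illustration only, not this PR's code; all names are made up) could look like:

```julia
using Random

# Illustrative sketch, not part of this PR: one seeded RNG per thread
# makes a threaded stochastic loop repeatable for a fixed thread count.
# Indexing by threadid() assumes tasks do not migrate between threads,
# which holds for Threads.@threads on the Julia versions discussed here.
function threaded_rand_sum(n; seed = 1234)
    rngs = [MersenneTwister(seed + i) for i in 1:Threads.nthreads()]
    partial = zeros(Threads.nthreads())
    Threads.@threads for i in 1:n
        tid = Threads.threadid()
        partial[tid] += rand(rngs[tid])
    end
    return sum(partial)
end
```

Note that even with per-thread seeding, the results would still differ between different thread counts, since the work is partitioned differently.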

To make threading opt-in even when multiple threads are available, there is a macro called @maybe_threaded, which unfortunately uses a non-exported function from Base.Threads. This was the only way I could make it work, since Threads.@threads doesn't compose well inside another macro.
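As an illustration of the opt-in idea (this is not the PR's actual @maybe_threaded macro, which relies on a non-exported Base.Threads function), one way to sketch runtime-selectable threading using only documented API is to branch on the flag and partition the work across Threads.@spawn tasks:

```julia
# Hypothetical illustration only, not the PR's implementation: opt-in
# parallelism via Threads.@spawn, which composes more easily than
# Threads.@threads inside other macros. `threading` and
# `maybe_threaded_sum` are made-up names for this sketch.
function maybe_threaded_sum(f, n; threading = false)
    if threading && Threads.nthreads() > 1
        # Split 1:n into one contiguous chunk per thread
        chunks = Iterators.partition(1:n, cld(n, Threads.nthreads()))
        tasks = [Threads.@spawn sum(f, c) for c in chunks]
        return sum(fetch.(tasks))
    else
        return sum(f, 1:n)
    end
end
```

With `threading = false` the serial path runs unchanged, so single-threaded users pay no overhead, which matches the opt-in behaviour described above.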

Performance-wise, I have only been able to test on my 8-year-old laptop with two glorious cores, but the results look quite promising:

julia> @benchmark LHCoptim(100,3,1000, threading = false)
BenchmarkTools.Trial: 
  memory estimate:  578.03 MiB
  allocs estimate:  1242768
  --------------
  minimum time:     3.639 s (0.83% GC)
  median time:      3.672 s (0.83% GC)
  mean time:        3.672 s (0.83% GC)
  maximum time:     3.705 s (0.84% GC)
  --------------
  samples:          2
  evals/sample:     1

julia> @benchmark LHCoptim(100,3,1000, threading = true)
BenchmarkTools.Trial: 
  memory estimate:  566.63 MiB
  allocs estimate:  727455
  --------------
  minimum time:     2.121 s (1.28% GC)
  median time:      2.128 s (1.31% GC)
  mean time:        2.133 s (1.33% GC)
  maximum time:     2.148 s (1.30% GC)
  --------------
  samples:          3
  evals/sample:     1

It might be a good idea to benchmark on a beefier machine too before merging :)

/Emil

@eljungsk
Contributor Author

OK, so this seems to work only on Julia > v1.5 due to changes in the threading internals.

@eljungsk
Contributor Author

eljungsk commented Mar 16, 2021

Tests are now passing for all versions (apart from AppVeyor, which seems to be broken).

@MrUrq
Owner

MrUrq commented Mar 19, 2021

Hej!

Thanks, this looks great. I tested the new implementation on 1, 2, 4, 6, ..., 18, 20 threads on a machine with 20 physical threads and got these results:
@benchmark LHCoptim(100,3,1000; threading=true) seconds=60

[Figure: benchmark performance vs. number of threads]

I also tried it on a 10x larger plan, @benchmark LHCoptim(1000,3,100; threading=true) seconds=60, for 1, 2, 4, 12 and 20 threads, which showed performance increases all the way up to 20 threads (a 10x improvement at 20 threads). I also compared it to the previous version and found no performance regression relative to using 1 thread.

I think we can drop support for Julia 1.0 in favour of making the package easier to maintain in the future. What do you think?

Thanks a lot for this PR; it's a great performance increase, especially on large plans!

@eljungsk
Contributor Author

Great to see that it scales reasonably well, at least for larger plans!

It might be that other parts of the optimization loop are the bottleneck for high thread counts on large plans, and I wouldn't be surprised if it's possible to squeeze out a bit more performance by threading these parts as well. On the other hand, I guess that doing so would worsen the performance on smaller plans as we introduce more threading overhead, so there is probably a trade-off for which type of plan to prioritize. But that should probably be the scope of a possible future PR :)
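The overhead trade-off described above is easy to observe directly. A quick hypothetical micro-check (made-up functions, unrelated to the PR's code) is to benchmark a tiny workload both serially and threaded:

```julia
using BenchmarkTools

# Hypothetical micro-check: on very small workloads, the cost of task
# scheduling can exceed the work itself, which is the trade-off
# discussed above when deciding how much of the loop to thread.
serial_work(n) = sum(sqrt(i) for i in 1:n)

function threaded_work(n)
    partial = zeros(Threads.nthreads())
    Threads.@threads for i in 1:n
        partial[Threads.threadid()] += sqrt(i)
    end
    return sum(partial)
end

# @btime serial_work(100)    # typically far faster than the threaded
# @btime threaded_work(100)  # version here: scheduling overhead dominates
```

For large n the ordering flips, which is why threading only the hot spots (_AudzeEglaisDist and _fixedcross!) is a sensible default.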

I don't have any strong opinions on whether to drop support for 1.0 or not. I guess it would make sense to keep it for now and drop it when the next LTS is out?

@MrUrq
Owner

MrUrq commented Mar 29, 2021

Could do some further benchmarking with the threading in place and see what the next low-hanging fruit is.

I will keep 1.0 support for now and merge it in. Thanks!

@MrUrq MrUrq merged commit b3d5ce8 into MrUrq:master Mar 29, 2021