In this assignment you'll parallelize your solver with `std::jthread`s. You can use any threading paradigm you'd like, but the number of threads spawned must be constant regardless of simulation time--you may not spawn threads on each iteration. Functionality and performance requirements are the same as they were for the first optimization assignment--given 8 threads on a whole m9 node, `wavesolve_thread` must run in 20 seconds on `2d-medium-in.wo`, and checkpointing must still work.
The environment variable `SOLVER_NUM_THREADS` will control how many threads are used. If it is not set or isn't a positive integer, one thread should be used.
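A minimal sketch of reading that variable is shown here. The helper names `parse_thread_count` and `thread_count_from_env` are made up for illustration; only `SOLVER_NUM_THREADS` and the fall-back-to-one rule come from the spec above.

```cpp
#include <charconv>
#include <cstdlib>
#include <string_view>
#include <system_error>

// Hypothetical helper: returns the parsed count, or 1 if `text` is not
// entirely a positive integer (empty, signed, trailing junk, zero, overflow).
inline unsigned parse_thread_count(std::string_view text) {
    unsigned value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last || value == 0) {
        return 1;
    }
    return value;
}

// Hypothetical helper: reads SOLVER_NUM_THREADS, defaulting to 1 when unset.
inline unsigned thread_count_from_env() {
    const char* raw = std::getenv("SOLVER_NUM_THREADS");
    return raw != nullptr ? parse_thread_count(raw) : 1;
}
```

Parsing with `std::from_chars` and checking that the whole string was consumed rejects values such as `4x`, which `std::atoi` would silently accept as 4.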
The first choice you'll have to make is how to split work among threads. The example code divides work into evenly-sized chunks and assigns each thread one such chunk. Some people find it easier to split work into chunks, put these chunks on a shared queue, and have a pool of worker threads that pull chunks from said queue on each iteration. Choose whichever method makes the most sense to you--performance will be comparable assuming sane work splitting.
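For the evenly-sized-chunks approach, the index arithmetic might look like the sketch below. `chunk_bounds` is a hypothetical name, and splitting by row index is an assumption about how your grid is laid out; the example code may divide work differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical helper: half-open range [begin, end) of the items assigned to
// thread `tid` out of `num_threads`. When `total` does not divide evenly, the
// first `total % num_threads` threads each take one extra item, so chunk
// sizes differ by at most one.
inline std::pair<std::size_t, std::size_t>
chunk_bounds(std::size_t total, std::size_t num_threads, std::size_t tid) {
    const std::size_t base = total / num_threads;
    const std::size_t extra = total % num_threads;
    const std::size_t begin = tid * base + std::min(tid, extra);
    const std::size_t end = begin + base + (tid < extra ? 1 : 0);
    return {begin, end};
}
```

With 10 rows and 4 threads this yields the ranges [0, 3), [3, 6), [6, 8), and [8, 10).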
Update your `CMakeLists.txt` to create `wavesolve_thread` (making sure to compile with C++ threads), develop in a branch named `phase6` or tag the commit you'd like me to grade from `phase6`, and push it.
This phase is worth 20 points. The following deductions, up to 20 points total, will apply for a program that doesn't work as laid out by the spec:
Defect | Deduction |
---|---|
Failure to compile `wavesolve_thread` | 4 points |
Failure of `wavesolve_thread` to work on each of 3 test files | 4 points each |
Failure of `wavesolve_thread` to checkpoint correctly | 4 points |
Failure of `wavesolve_thread` to run on `2d-medium-in.wo` on m9 with 8 threads in 20 seconds | 4 points |
...in 60 seconds | 4 points |
`wavesolve_thread` isn't a threaded program, or doesn't distribute work evenly between threads | 1-20 points |
`wavesolve_thread` spawns more than a small, constant number of threads | 10 points |