At @jvanulde’s suggestion, I would like to share our experience (anecdote?) from trying to parallelize consequence calculations for the Canada earthquake scenarios that @tieganh and @jeremyrimando were working on.
While observing a scripts/run_OQStandard.sh run, I noticed that OpenQuake itself would happily use all available CPU cores to do calculations in parallel (which is awesome!) and would finish the calculation in about 2 hours, but some other processing steps are single-threaded: for example, each of the two runs of python3 scripts/consequences-v3.10.0.py could take over 12 hours. See more info at:
Since @jeremyrimando needed to complete ~50 scenario calculations within a month or two, I tried my hand at the Python multiprocessing package:
Use the Python multiprocessing package to take advantage of multiple CPU cores, processing multiple realizations simultaneously. This would greatly reduce the total run time.
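The approach can be sketched roughly as follows. This is a minimal illustration, not the actual OpenDRR code: process_realization and the realization IDs are made-up stand-ins for the real per-realization consequence processing.

```python
import multiprocessing as mp

def process_realization(rlz_id):
    # Hypothetical stand-in for the per-realization consequence
    # processing; the real work would read OpenQuake outputs for one
    # realization and write its consequence tables.
    return rlz_id, sum(i * i for i in range(1000))

def main():
    realizations = list(range(8))  # e.g. one task per realization
    # One worker per CPU core: each realization runs in its own
    # process, so the otherwise single-threaded steps overlap.
    with mp.Pool(processes=mp.cpu_count()) as pool:
        for rlz_id, result in pool.imap_unordered(process_realization, realizations):
            print(f"realization {rlz_id} done: {result}")

if __name__ == "__main__":
    main()
```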
All was good, or so I thought, until we noticed that we were getting mysterious, randomly inconsistent results with multiprocessing enabled; see OpenDRR/earthquake-scenarios#58 (comment).
My guess is that there might be bugs in Python itself, or in NumPy’s OpenBLAS-backed dot multiplication, that caused memory corruption or pollution. I must admit that I don't know how to test or prove that claim, let alone debug and resolve it.
Luckily, we came across GNU parallel, which we used to run multiple copies of Python simultaneously, and enjoyed the same reduction in calculation time with no data corruption:
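The shape of the approach is roughly as follows (a sketch with made-up realization numbers; the real invocation is in OpenDRR/earthquake-scenarios#61). Here xargs -P stands in for GNU parallel so the runnable line works even where parallel is not installed:

```shell
# Sketch: fan out one consequences run per realization, 4 at a time.
# With GNU parallel (as we actually used), the shape is:
#   parallel -j 4 python3 scripts/consequences-v3.10.0.py {} ::: 0 1 2 3 4 5 6 7
# Each task is a completely separate Python process, so there is no
# shared interpreter state to corrupt. A portable stand-in via xargs -P:
seq 0 7 | xargs -n 1 -P 4 echo "would process realization"
```

Because every task is an independent OS process, this sidesteps whatever went wrong with in-process multiprocessing.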
There is an art to using Python multiprocessing: the defaults will not work; one needs to create the Pool in spawn mode. This is probably the reason why you had problems. The recommended way to run multiple calculations with the engine programmatically is to use the pair of functions create_jobs/run_jobs in the engine/engine.py module. However, if GNU parallel works for you, by all means keep using it ;-)
BTW, GitHub is NOT the right place to give feedback; next time, send an email to the user mailing list (if you want to make your feedback public) or to [email protected] (then only GEM people will read it).
PS: of course your feedback is very much appreciated!
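The spawn-mode advice above can be sketched as follows (a minimal sketch; the square worker is a made-up example, and see the Python multiprocessing documentation on start methods for details):

```python
import multiprocessing as mp

def square(x):
    # The worker must be defined at module top level so the 'spawn'
    # start method can import and pickle it in the child process.
    return x * x

def parallel_map(func, values, workers=2):
    # Request the 'spawn' start method explicitly instead of the
    # platform default ('fork' on Linux). Spawned children start from
    # a fresh interpreter, avoiding the inherited thread state (e.g.
    # from OpenBLAS) that 'fork' can leave in an inconsistent state.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=workers) as pool:
        return pool.map(func, values)

if __name__ == "__main__":
    print(parallel_map(square, [1, 2, 3, 4]))  # [1, 4, 9, 16]
```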
Hello @raoanirudh, @micheles et al.,
For reference, my multiprocessing attempt was: Parallelize calculations in consequences-v3.10.0.py OpenDRR/earthquake-scenarios#58
The actual relevant commit is here: OpenDRR/earthquake-scenarios@f675c83
Credit: https://gist.github.com/EdwinChan/3c13d3a746bb3ec5082f and https://github.com/tqdm/tqdm/blob/master/examples/parallel_bars.py
Use GNU parallel to run consequences processing in parallel OpenDRR/earthquake-scenarios#61
Note that the above was quite a while ago, from mid-2022, tested with Python 3.8 and OQ 3.11.
@jvanulde commented in May 2022:
though I must have missed his message somehow, and/or procrastinated and forgotten about it, until Sep 2023 when I replied:
And now, 3½ months later, I am finally submitting this issue here. Sorry for the delay!
Thanks again for your wonderful work on OpenQuake, making earthquake modelling possible!