Using MPI across multiple sims #1178
Yup, this was already implemented 11 years ago in 81dcccb.
Call n = meep.divide_parallel_processes(N) to divide the MPI processes into N subgroups; it returns the index n (from 0 to N-1) of the subgroup that the current process belongs to. That is, you have one run script, and the script only creates one simulation object (typically), depending on the value of n.
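A minimal sketch of the call pattern (a full working script appears further down in this thread):

```python
import meep as mp

N = 2                                # number of independent simulations/subgroups
n = mp.divide_parallel_processes(N)  # index (0..N-1) of this process's subgroup
fcen = 1.0 if n == 0 else 0.5        # pick a per-subgroup parameter based on n
# ...build and run the single Simulation object for this subgroup...
```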
And if you need to do some global communications among all processes, just do:

```python
meep.begin_global_communications()
# ...do stuff...
meep.end_global_communications()
```
Note that the statement in https://meep.readthedocs.io/en/latest/Parallel_Meep/#different-forms-of-parallelization that "Meep provides no explicit support for this mode of operation" is actually untrue.
For example, if you want to synchronize an array between all the processes, not just a subgroup, you could have each process allocate an array to store the results, initialized to zero, write its portion of the data into the array, and then call:

```python
meep.begin_global_communications()
meep.sum_to_all(input_array, summed_array)
meep.end_global_communications()
```

However, for this to work we will also need to add SWIG typemaps for the array arguments of sum_to_all.
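A minimal sketch of that recipe, assuming an array-accepting sum_to_all is exposed to Python (per the caveat above this still requires SWIG typemaps, so treat the call signature as hypothetical):

```python
import numpy as np
import meep as mp

num_groups = 2
group_index = mp.divide_parallel_processes(num_groups)

# ...run this subgroup's simulation and compute a scalar result...
local_result = 0.0  # placeholder for this subgroup's result

# Zero-initialized array with one slot per subgroup; each subgroup writes only
# its own slot. If a subgroup has several processes, have just one rank per
# subgroup write (or divide by the subgroup size) to avoid double counting.
input_array = np.zeros(num_groups)
input_array[group_index] = local_result
summed_array = np.zeros(num_groups)

mp.begin_global_communications()
mp.sum_to_all(input_array, summed_array)  # global sum fills every slot
mp.end_global_communications()
```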
The script:

```python
import meep as mp

resolution = 20

sxy = 4
dpml = 1
cell = mp.Vector3(sxy+2*dpml,sxy+2*dpml,0)

pml_layers = [mp.PML(dpml)]

n = mp.divide_parallel_processes(2)
fcen = 1.0 if n == 0 else 0.5

sources = [mp.Source(src=mp.GaussianSource(fcen,fwidth=0.2*fcen),
                     center=mp.Vector3(),
                     component=mp.Ez)]

symmetries = [mp.Mirror(mp.X),
              mp.Mirror(mp.Y)]

sim = mp.Simulation(cell_size=cell,
                    resolution=resolution,
                    sources=sources,
                    symmetries=symmetries,
                    boundary_layers=pml_layers)

flux_box = sim.add_flux(fcen, 0, 1,
                        mp.FluxRegion(mp.Vector3(y=0.5*sxy), size=mp.Vector3(sxy)),
                        mp.FluxRegion(mp.Vector3(y=-0.5*sxy), size=mp.Vector3(sxy), weight=-1),
                        mp.FluxRegion(mp.Vector3(0.5*sxy), size=mp.Vector3(y=sxy)),
                        mp.FluxRegion(mp.Vector3(-0.5*sxy), size=mp.Vector3(y=sxy), weight=-1))

sim.run(until_after_sources=mp.stop_when_fields_decayed(50, mp.Ez, mp.Vector3(), 1e-6))

tot_flux = mp.get_fluxes(flux_box)[0]

print("flux:, {}, {:.4f}, {:.6f}".format(n,fcen,tot_flux))
```
There is no indication that the results for the other frequency (fcen = 0.5) are printed.
Doesn't the print function only operate for the master process now? I think a better way to test is with a
So you'll need to do some global communications to send the results back to the master process.
Note that you can do global communications directly with mpi4py.
Using mpi4py:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

tot_fluxes = [0, 0]
if rank == 0:
    comm.send(tot_flux, dest=1, tag=11)
    tot_fluxes[0] = tot_flux
    tot_fluxes[1] = comm.recv(source=1, tag=11)
elif rank == 1:
    comm.send(tot_flux, dest=0, tag=11)
    tot_fluxes[1] = tot_flux
    tot_fluxes[0] = comm.recv(source=0, tag=11)

print(tot_fluxes)
```
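As an aside, for exchanging a single scalar among all ranks, mpi4py's collective allgather can replace the explicit send/recv pairs. A sketch assuming the same 2-process run (one rank per subgroup) and the tot_flux variable from the script above:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
# Every rank contributes its value; the result is a list ordered by rank,
# so with one rank per subgroup this is [flux of subgroup 0, flux of subgroup 1].
tot_fluxes = comm.allgather(tot_flux)
print(tot_fluxes)
```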
It would be nice to have a high-level interface for storing/printing the output from the different subgroups rather than have the user call the low-level mpi4py routines directly.
Might be nice to have a function merged = merge_subgroup_data(data) that takes a numpy array data from each subgroup and returns an array merged that combines the data from all of the subgroups. Under the hood, it would call sum_to_all between begin_global_communications and end_global_communications.
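A rough sketch of how such a helper could be implemented with mpi4py (the name merge_subgroup_data and its intent are taken from the suggestion above; the num_groups/group_index arguments and the Split-based bookkeeping are assumptions, not an actual Meep API):

```python
import numpy as np
from mpi4py import MPI

def merge_subgroup_data(data, num_groups, group_index):
    """Combine one numpy array per subgroup into a single array on every process.

    Assumes `data` has the same shape/dtype on all ranks and is identical
    within a subgroup. Returns an array of shape (num_groups,) + data.shape.
    """
    comm = MPI.COMM_WORLD
    data = np.asarray(data)
    contribution = np.zeros((num_groups,) + data.shape, dtype=data.dtype)

    # Let only one rank per subgroup contribute, so the global sum reproduces
    # each subgroup's data exactly once.
    subgroup = comm.Split(color=group_index, key=comm.Get_rank())
    if subgroup.Get_rank() == 0:
        contribution[group_index] = data

    merged = np.zeros_like(contribution)
    comm.Allreduce(contribution, merged, op=MPI.SUM)
    return merged
```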
Hi @smartalecH , I want to make sure that I am using this correctly.
Looks good to me! I'm assuming you aren't getting the results you are hoping for?
Thank you! I got errors when running my optimization tasks using this approach.
@smartalecH could you please tell me how to correctly choose the number of processes np?
If you have N nodes and M cpus per node, then normally you would want to choose at most N*M processes.
@stevengj @smartalecH (fragments of a C++ libmeep script): f.output_hdf5(Dielectric, v.surroundings()); ... double freq = 0.3, fwidth = 0.1; ... f.output_hdf5(Hz, v.surroundings()); ... return 0;
Typically if I have a python script with one simulation object, I can easily parallelize it using mpi: mpirun -np 16 python script.py
Now let's say my script has four simulation objects and I want to run all of them in parallel from the same script, each with multiple processes. Using the example above with 16 allocated processes, is there a clever way to assign 4 processes to each of my simulation objects so that they all run concurrently (and only communicate within their own "subgroups")?
This would be especially useful for multiobjective adjoint optimization.
I realize I could do this with some clever bash scripting but it would be nice to keep everything "in the loop" of a single python script.
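For reference, a minimal sketch of the pattern suggested in the replies above, assuming the script is launched with mpirun -np 16 so that the 4 subgroups get 4 processes each; build_sim(n) is a hypothetical helper standing in for the construction of the n-th simulation:

```python
import meep as mp

def build_sim(n):
    # Hypothetical helper: build the n-th of the four simulations (in a real
    # script the geometry, sources, etc. would depend on n).
    return mp.Simulation(cell_size=mp.Vector3(6, 6, 0),
                         resolution=20,
                         boundary_layers=[mp.PML(1.0)],
                         sources=[mp.Source(mp.GaussianSource(1.0 + 0.1*n, fwidth=0.2),
                                            component=mp.Ez,
                                            center=mp.Vector3())])

# Split the 16 MPI processes into 4 subgroups of 4; each subgroup runs its own
# simulation and only communicates within itself.
n = mp.divide_parallel_processes(4)
sim = build_sim(n)
sim.run(until=50)
```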