Wrapper for MPI-3 shared memory #350

lukashuebner · 2022-06-27T13:50:54Z

The MPI-3 standard introduced functions which enable multiple ranks on the same node / NUMA-domain to use shared local memory to communicate.

This can be used, for example, to share input data.
It is faster than RDMA put and get, even on a single node.

Partitioning by NUMA domain is currently not standardized.
The shared memory region has a different virtual address in each MPI process. There are some caveats, and even more caveats.

Supporting this functionality in KaMPIng could be a unique selling point and very useful. There are probably multiple levels of support:

1. Wrap the MPI calls to provide a shmalloc().
2. Implement a C++ allocator and offset_ptr. These might not work with the STL, but possibly with the Boost containers created specifically for shared memory.
3. Provide faster communication using (a) shared send/recv buffers with parallel serialization/deserialization and fewer messages per node, and (b) faster intra-node communication (MPI seems to have some problems with intra-node communication according to early experiments done by @mschimek). All of this would, however, just be a way of avoiding MPI+OpenMP that makes it simpler to claim that you're using hybrid parallelization.
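To illustrate the offset_ptr idea from the second level: a rough sketch of a self-relative pointer, assuming the pointer object and its target both live in the same shared region. Instead of an absolute address it stores the distance from itself to the target, so it stays valid when the region is mapped at a different virtual address. This `offset_ptr` and the `Node` payload are hypothetical and heavily simplified; Boost.Interprocess provides a complete `boost::interprocess::offset_ptr`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, minimal self-relative pointer (illustration only).
template <typename T>
class offset_ptr {
public:
    offset_ptr() = default;
    offset_ptr(T* p) { set(p); }
    // Copies must recompute the offset because the copy lives at another address.
    offset_ptr(offset_ptr const& other) { set(other.get()); }
    offset_ptr& operator=(offset_ptr const& other) {
        set(other.get());
        return *this;
    }
    offset_ptr& operator=(T* p) {
        set(p);
        return *this;
    }

    // Rebuild the absolute address from this object's own address plus the offset.
    T* get() const {
        if (offset_ == null_offset) {
            return nullptr;
        }
        auto const self = reinterpret_cast<std::uintptr_t>(this);
        return reinterpret_cast<T*>(self + static_cast<std::uintptr_t>(offset_));
    }
    T& operator*() const {
        assert(get() != nullptr);
        return *get();
    }
    T* operator->() const { return get(); }
    explicit operator bool() const { return offset_ != null_offset; }

private:
    // An offset of 1 would point into the pointer object itself, so it is
    // never a real target and can serve as the null marker.
    static constexpr std::intptr_t null_offset = 1;

    void set(T* p) {
        if (p == nullptr) {
            offset_ = null_offset;
            return;
        }
        auto const self   = reinterpret_cast<std::uintptr_t>(this);
        auto const target = reinterpret_cast<std::uintptr_t>(p);
        offset_           = static_cast<std::intptr_t>(target - self);
    }

    std::intptr_t offset_ = null_offset;
};

// Example payload that would be placed entirely inside the shared region.
struct Node {
    int              value;
    offset_ptr<Node> next;
};
```

A list built from such `Node`s inside the shared region can be traversed by every rank on the node, whatever base address that rank sees. Standard containers store raw pointers internally unless they are given an allocator whose `pointer` type is such a fancy pointer, which is likely why the Boost containers are the safer bet here.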

For the sake of completeness: It seems as if one could also remap the shared memory region to another virtual address. On a 64-bit system, there might even be a large enough block of virtual addresses that is available on all ranks, and we'd thus be able to map the shared memory region there and use raw pointers again.


mschimek · 2022-06-27T16:53:34Z

I think this is a very interesting proposal and I share Lukas's view that this could be a beneficial feature for KaMPIng.
(@lukashuebner you mean MPI+OpenMP, don't you?)

lukashuebner · 2022-06-28T06:46:32Z

> (@lukashuebner you mean MPI+OpenMP, don't you?)

Yes, of course 🙊

lukashuebner added the discussion and feature labels Jun 27, 2022

niklas-uhl added the low-priority label Aug 11, 2023
