Skip to content

Distributed diffusion equation C/C++ OpenMP one-sided communication GASPI (GPI2/PGAS)

Notifications You must be signed in to change notification settings

neudinger/equadiffGASNET

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Distributed diffusion equation C/C++ OpenMP one-sided communication GASPI (GPI2/PGAS)

Linux C++ C

GPI2 HWLOC OMP

  • Horizontal split

Hybrid computation with multiple parallel computation level :

(Cluster level) -> (Machine level) -> (CPU Level) -> (Core Level)

  • Distributed parallel operations layer with GASPI domain splitting.
  • Multithreaded parallel operations layer with OpenMP.
  • Vectorized parallel operations layer with cached blocked loops.

Usage of HWLOC to gather hierarchical topology and specified thread core process binding.

Boost was used for program_options. Compatible with infiniband. The cmake will download and build program_options only if boost is not found. If GPI2 is not found cmake will download and build it for this project. Only this library will be linked to reduce the library loading overhead.

Horizontal split

Build :

cmake -B build \
-DPRINT_PERF:BOOL=TRUE \
-DCMAKE_BUILD_TYPE=Release \
-DOPENMP:BOOL=TRUE \
-S . && \
cmake --build build 

Run :

On slurm :

sbatch \
--account "" \
--mem-per-cpu=100000 \
--nodes=1 \
--ntasks=${worker} \
--ntasks-per-node=1 \
--cpus-per-task=20 \
--partition=cpu_dist \
--time=24:00:00 \
${here}/run-batch.sh 80000 1000

On any preallocated resource cluter :

gaspi_run.slurm ${NUMA_AWARE} \
--nodes ${NODES} \
--machinefile ${MFILE} \
./build/bin/stencil \
--ompthread_nbr ${OMP_NUM_THREADS} \
--nbr_of_column ${1} \
--nbr_of_row ${1} \
--nbr_iters ${2} \
--energy_init 1

On one machine (localhost) :

gaspi_run -m machines.txt -n 4 build/bin/stencil \
--nbr_of_column 20000 \
--nbr_of_row 20000 \
--nbr_iters 40 \
--ompthread_nbr 0 \
--energy_init 1

C++ 17 was used due to usage of string_view and initializer_list.
GPI2 (PGAS) for distributed layer.
OpenMP for multithreading purpose.

This project was built with slurm as cluster node resource scheduler.