Code for porting and running code from grid-based clusters in MPI environments. mpist
allows unmodified binaries to be run on POSIX compliant clusters using MPI. lib/mpigrid.h
is a thin shim that can add support for MPI-based parallelism to an already functioning program; see the example in example
An MPI wrapper to allow the execution of many instances of a single-threaded program with a varying environmental variable. Requires the presence of execv() and fork() syscalls and so will not function in Blue Gene (e.g. bg/q) environments.
mpist [-h] [-G] -C ENVVAR -t n-m EXE [ARGS]
Executes EXE [ARGS] with the environmental variable specified by ENVVAR set to integers in the range [n,m] inclusive. If there are insufficient MPI ranks available, the top of the range is silently truncated.
-h displays this message. -G prevents actually exec-ing and just displays status.
Notes:
- It is required that m > n
- The environmental variable, ENVVAR, must be shorter than 127 characters.
A simple call to make
will produce the binary. If xl compilers are available, they will be used.
One might submit a cobalt job with mpist as:
% qsub -n 512 -t 20 mpist -C TASK_ID -t 1-8192 /path/to/my/single-thread/program [args...]
In environments where mpist
will not work (no execv(3) or fork(3)), mpigrid provides a drop-in, bolt-on shim to use MPI.
int getTaskID(const char f, int * argc, char ** argv)
Parses command line arguments and returns a task ID based on the MPI rank. Expects an argument of the form -f n-m to specify tasks to be completed. Initializes MPI functions and arranges for MPI_Finalize to be called at exit.
If an '-f' option is present, getTaskID will shift the contents of argv and change the count in argc to hide this elements from future invocations of getopt(3).
If a '-f' option is present and the range cannot be parsed, the program will emit an error and terminate.
If there are more tasks than ranks, rank 0 will emit a warning. If there are more ranks than tasks, the superfluous ranks will be killed silently.
int task = getTaskID('f', &argc, argv);
Parses the command-line arguments, looking for a flag like -f n-m where n-m is the range to execute over. A unique int in [n,m] is returned for each rank.
See examples/example.c
and make example
for more.