Data Staging
Stage-in is the ability to load external datasets into Hermes. Stage-out is the ability to export data out of Hermes.
Currently, stage-in and stage-out can be applied to POSIX files. The ability to stage in and stage out individual HDF5 datasets (as opposed to an entire HDF5 file) is under development. Stage-in can load directories, specific files, or portions of files into Hermes to be processed by the application.
The stage-in / stage-out utility scripts provide the following API:
./stage_in [url] [offset] [size] [dpe]
./stage_out [url]
For now, the [url] parameter is just a POSIX path (e.g., "/home/user/hi.txt"). In the future, it could represent an HDF5 dataset using a different schema (e.g., "hdf5::/[dataset-group1]/[dataset-name1]"). The [offset] and [size] parameters select the portion of the file to stage in; when [size] is 0, the size of the file is determined automatically. The [dpe] parameter selects the placement engine (e.g., kRoundRobin).
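The handling of [offset] and [size] can be sketched as follows. This is an illustration only, not the Hermes implementation: the StageRange type and both helper functions are hypothetical, and the sketch assumes that a [size] of 0 means "from [offset] to the end of the file".

```cpp
#include <cstdint>
#include <filesystem>
#include <stdexcept>
#include <string>

// Hypothetical type: the range of a file that a stage-in request covers.
struct StageRange {
  std::uint64_t offset;
  std::uint64_t size;
};

// Resolve [offset] and [size] against a known file size.
// Assumption: size == 0 means "everything from offset to the end of the file".
inline StageRange ResolveRange(std::uint64_t file_size, std::uint64_t offset,
                               std::uint64_t size) {
  if (offset > file_size) {
    throw std::out_of_range("offset is past the end of the file");
  }
  const std::uint64_t remaining = file_size - offset;
  if (size == 0) {
    return {offset, remaining};
  }
  if (size > remaining) {
    throw std::out_of_range("range extends past the end of the file");
  }
  return {offset, size};
}

// Convenience wrapper that looks up the size of a POSIX file on disk.
inline StageRange ResolveRangeForFile(const std::string &path,
                                      std::uint64_t offset,
                                      std::uint64_t size) {
  return ResolveRange(std::filesystem::file_size(path), offset, size);
}
```

Under these assumptions, a call such as ResolveRangeForFile("/tmp/hi.txt", 0, 0) would cover the whole file, which matches the "0 0" arguments used in the workflow on this page.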
An example of a typical stage-in / stage-out workflow is as follows:
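# Start the Hermes daemon
mpirun -n 1 ${HERMES_INSTALL}/hermes_daemon
# Create an 8GB file with IOR (-w: write, -k: keep the file afterwards)
mpirun -n 4 ior -w -k -o /tmp/hi.txt
# Stage the file into Hermes (offset 0, size 0 = determined automatically)
mpirun -n 4 ${HERMES_INSTALL}/stage_in /tmp/hi.txt 0 0 kRoundRobin
# Read the file back with IOR
mpirun -n 4 -genv HERMES_CONF=${HERMES_CONF} ior -r -o /tmp/hi.txt
# Stage the data back out of Hermes
mpirun -n 4 ${HERMES_INSTALL}/stage_out /tmp/hi.txt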
Stage-in / stage-out can also be applied in a native Hermes program.
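#include <string>
#include <hermes/staging.h>
int main(int argc, char **argv) {
  if (argc < 2) return 1;
  // Assumption: the path to stage is passed as the first argument
  std::string url = argv[1];
  // Select a stager for this URL, then load the data into Hermes
  auto stager = DataStagerFactory::Get(url);
  stager->StageIn(url, PlacementEngine::kRoundRobin);
  // Write the data back out of Hermes
  stager->StageOut(url);
}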