1. Getting Started
There are several ways to obtain a working Hermes installation. Information on dependencies can be found in the README.
- Docker: we maintain Dockerfiles for Hermes development and Hermes dependencies.
- CMake: instructions can be found in the README.
- Spack: instructions can be found in the README.
If you get stuck, the root of the repository contains a ci folder where we keep the scripts we use to build and test Hermes in a GitHub Actions workflow.
The workflow file itself is here.
Hermes is an application extension. Storage resources are deployed under Hermes control by:
- Configuring Hermes for your system and application
- Making your application "Hermes-aware"
An application can be made aware of Hermes in at least three different ways:
- Through Hermes adapters, LD_PRELOAD-able shared libraries which intercept common I/O middleware calls such as UNIX STDIO, POSIX, and MPI-IO
- Through an HDF5 virtual file driver (VFD)
- By directly targeting the Hermes native API
These options represent different use cases and trade-offs, for example, with respect to expected performance gains and required code change.
When using the STDIO adapter (intercepting fopen, fwrite, etc.) and the POSIX adapter (intercepting open, write, etc.), there are multiple ways to deploy Hermes with an existing application.
NOTE: The MPI-IO adapter is still experimental, and only supports MPICH at this time.
If your application is meant to be run as a single process, this is the recommended approach. It doesn't require spawning any daemons. You simply LD_PRELOAD the appropriate adapter and provide a path to a Hermes configuration file through the HERMES_CONF environment variable.
# POSIX adapter
LD_PRELOAD=${HERMES_INSTALL_DIR}/lib/libhermes_posix.so \
HERMES_CONF=/path/to/hermes.conf \
./my_app
# STDIO adapter
LD_PRELOAD=${HERMES_INSTALL_DIR}/lib/libhermes_stdio.so \
HERMES_CONF=/path/to/hermes.conf \
./my_app
IMPORTANT: The adapters don't currently support relative paths or symbolic links. This applies both to the path to the Hermes configuration file and to the paths of any files your application may open or fopen. This will be fixed soon. See #179 and #180.
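Until relative paths and symbolic links are supported, one possible workaround is to resolve paths before they reach the adapter. The following is a minimal sketch, assuming GNU coreutils realpath is installed; hermes_abs_path is a hypothetical helper name and is not part of Hermes. It only helps with paths passed on the command line or through the environment, not with relative paths hard-coded inside the application.

```shell
# Hypothetical helper (not part of Hermes): print an absolute path with
# symbolic links resolved. Assumes GNU coreutils realpath, which by default
# requires every path component except the last one to exist.
hermes_abs_path() {
  realpath -- "$1"
}

# Example usage (commented out so that sourcing this snippet runs nothing):
# LD_PRELOAD=${HERMES_INSTALL_DIR}/lib/libhermes_posix.so \
# HERMES_CONF=$(hermes_abs_path ./hermes.conf) \
# ./my_app "$(hermes_abs_path ./output.dat)"
```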
IMPORTANT: Currently, even a single-process application is required to call MPI_Init and MPI_Finalize. This will be fixed soon. See #148.
If your app is an MPI application that runs with 2 or more ranks, then you must spawn a Hermes daemon before launching your app. Here's an example of running an app with four ranks on two nodes, two ranks per node:
# We need to start one and only one Hermes daemon on each node. I start this job
# in the background so I can launch the application in the same terminal.
mpirun -n 2 -ppn 1 \
-genv HERMES_CONF /path/to/hermes.conf \
${HERMES_INSTALL_DIR}/bin/hermes_daemon &
# Now we can start our application
mpirun -n 4 -ppn 2 \
-genv LD_PRELOAD ${HERMES_INSTALL_DIR}/lib/libhermes_posix.so \
-genv HERMES_CONF /path/to/hermes.conf \
./my_app
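The rank counts in the two commands above are related: the daemon job uses one rank per node, while the application job uses nodes times ranks-per-node ranks in total. As an illustration only, the sketch below prints both command lines for a given layout without running them. hermes_print_launch is a hypothetical helper, not part of Hermes, and it assumes HERMES_INSTALL_DIR and HERMES_CONF are already set and that mpirun accepts the same -n, -ppn, and -genv options used above.

```shell
# Hypothetical helper (not part of Hermes): print, but do not execute, the
# daemon and application mpirun command lines.
# Usage: hermes_print_launch NODES RANKS_PER_NODE APP
hermes_print_launch() {
  nodes=$1
  ppn=$2
  app=$3
  total=$((nodes * ppn))
  # Exactly one daemon per node: NODES ranks in total, 1 rank per node.
  printf 'mpirun -n %d -ppn 1 -genv HERMES_CONF %s %s/bin/hermes_daemon &\n' \
    "$nodes" "$HERMES_CONF" "$HERMES_INSTALL_DIR"
  # Application: NODES * RANKS_PER_NODE ranks, with the POSIX adapter preloaded.
  printf 'mpirun -n %d -ppn %d -genv LD_PRELOAD %s/lib/libhermes_posix.so -genv HERMES_CONF %s %s\n' \
    "$total" "$ppn" "$HERMES_INSTALL_DIR" "$HERMES_CONF" "$app"
}
```

Calling hermes_print_launch 2 2 ./my_app reproduces the four-ranks-on-two-nodes layout shown above.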
By default, when the application finishes it will also shut down the Hermes daemon. However, it is sometimes desirable to keep the daemon alive so your data remains buffered and available for consumption by a second application. We can achieve this via the HERMES_STOP_DAEMON environment variable. Here is an example of a checkpoint/restart workflow.
# Start a daemon
HERMES_CONF=/path/to/hermes.conf \
${HERMES_INSTALL_DIR}/bin/hermes_daemon &
# Run an app that writes a checkpoint file.
# HERMES_STOP_DAEMON=0 keeps the daemon alive after the app exits.
# ADAPTER_MODE=SCRATCH doesn't persist buffered data to the final destination.
LD_PRELOAD=${HERMES_INSTALL_DIR}/lib/libhermes_posix.so \
HERMES_CONF=/path/to/hermes.conf \
HERMES_STOP_DAEMON=0 \
ADAPTER_MODE=SCRATCH \
./my_producer ${PFS}/checkpoint.txt
# Run an app that reads the buffered data and performs some computation
LD_PRELOAD=${HERMES_INSTALL_DIR}/lib/libhermes_posix.so \
HERMES_CONF=/path/to/hermes.conf \
./my_consumer ${PFS}/checkpoint.txt
Normally, the producer would write a checkpoint to a parallel file system, and the consumer would read it back. But when we use Hermes we can buffer the data in faster, local media. The key settings here are HERMES_STOP_DAEMON=0 and ADAPTER_MODE=SCRATCH. The default ADAPTER_MODE persists the buffered data to the "final destination" (${PFS}/checkpoint.txt in this case), but in SCRATCH mode we keep it buffered.
IMPORTANT: Currently the adapters keep buffered data by reference counting open files. Once all open handles are closed, Hermes deletes the buffered data. In the future we will add ways to retain buffered data even after all open handles are closed (see #258). As a temporary workaround for this limitation, just allow your program to exit without explicitly calling close or fclose. The OS will clean up these handles after the app disconnects from the daemon.
MPI is used in Hermes for launching processes and doing some synchronization in startup and finalization. Beyond that, no MPI calls are made and no data is transferred over MPI. Data transfer and communication are done via remote procedure calls. That said, MPI is fully available to applications using the native Hermes API. We have wrappers for the most common functionality:
namespace hermes::api {
void Hermes::AppBarrier(); // collective
bool Hermes::IsFirstRankOnNode();
int Hermes::GetProcessRank();
int Hermes::GetNodeId();
int Hermes::GetNumProcesses();
}
If you require more complex MPI usage, you can get access to the MPI_Comm like this:
namespace hapi = hermes::api;
std::shared_ptr<hapi::Hermes> hermes = hapi::InitHermes();
MPI_Comm *app_comm = (MPI_Comm *)hermes->GetAppCommunicator();
// Now you can pass the dereferenced app_comm to any MPI function that accepts
// an MPI_Comm. For example:
int size;
MPI_Comm_size(*app_comm, &size);
Note that what we call the "App communicator" is distinct from the "Hermes communicator." This document assumes all user code using the native Hermes API is contained within a block such as:
if (hermes->IsApplicationCore()) {
// User code.
} else {
// Hermes core. No user code here.
}
See the end-to-end test for an example.
Here we will walk through an entire example of using Hermes with IOR. IOR supports several I/O APIs (-a option), including POSIX, MPI-IO, and HDF5. Hermes has adapters for POSIX and MPI-IO, and, with a minor code modification (we need to add a non-default HDF5 file access property list), we can use the HDF5 Hermes VFD for sequential IOR runs. For this tutorial, we'll focus on POSIX. We assume you already have working Hermes and IOR installations. See the README for Hermes installation details.
| Name | Description | Measured Write Bandwidth |
|---|---|---|
| PFS | OrangeFS running on 8 server nodes, backed by HDDs | 536 MiB/s |
| NVMe | Node-local NVMe-attached SSDs | 1,918 MiB/s |
| RAM | Node-local DRAM | 79,061 MiB/s |
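
To put the measured bandwidths in perspective, the sketch below estimates how long a single write would take on each tier. It is an idealized calculation (size divided by bandwidth) that ignores latency, contention, and metadata overhead. The 16 GiB checkpoint size and the write_seconds helper are assumptions made for illustration and do not come from the measurements above.

```shell
# Idealized time in seconds to write SIZE_MIB mebibytes at BW_MIB_S MiB/s.
# Usage: write_seconds SIZE_MIB BW_MIB_S
write_seconds() {
  awk -v size="$1" -v bw="$2" 'BEGIN { printf "%.1f\n", size / bw }'
}

# Hypothetical 16 GiB (16384 MiB) checkpoint on each tier from the table.
while read -r name bw; do
  printf '%-5s %s s\n' "$name" "$(write_seconds 16384 "$bw")"
done <<EOF
PFS 536
NVMe 1918
RAM 79061
EOF
```

Under these assumptions the estimate is about 30.6 s for the PFS, 8.5 s for NVMe, and 0.2 s for RAM.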
## Hermes configuration