Building and running with Trilinos
On Vortex, you can learn more about using the machine by running less /opt/VORTEX_INTRO after logging in.
- To grab the current selection of modules/Trilinos (with RDC required):
source /projects/empire/installs/vortex/CUDA-10.1.243_GNU-7.3.1_SPMPI-ROLLING-RELEASE-CUDA-STATIC/trilinos/latest/load_matching_env.sh
- This is the build script I use for basic builds (a usage example follows the script):
#!/usr/bin/env bash
# Exit on error and echo each command as it runs.
set -ex
# First argument (required): path to the EMPIRE source directory.
empire=$1
if test $# -eq 0
then
    echo "usage: $0 <empire-dir> [ <trace-enabled=0> ] [ <build-type=Release> ]"
    exit 1
fi
# Second argument (optional): enable vt tracing; defaults to 0.
if test $# -gt 1
then
    trace=$2
else
    trace=0
fi
# Third argument (optional): CMake build type; defaults to Release.
if test $# -gt 2
then
    build_type=$3
else
    build_type=Release
fi
cmake -GNinja -DCMAKE_EXPORT_COMPILE_COMMANDS=true -DEMPIRE_ENABLE_WERROR=OFF -DEMPIRE_ENABLE_PIC=ON -Dvt_trace_enabled=${trace} -DCMAKE_BUILD_TYPE=${build_type} ${empire}
ninja EMPIRE_PIC.exe
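As a usage example, suppose the script above is saved as build_empire.sh (a placeholder name) and your EMPIRE checkout lives at ~/src/empire (also a placeholder); run it from an empty build directory:
# Release build with tracing off (the defaults)
./build_empire.sh ~/src/empire
# Debug build with vt tracing enabled
./build_empire.sh ~/src/empire 1 Debug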
To run an interactive job on Vortex with a proper shell, run:
bsub -nnodes 16 -Is bash
The scheduler is [IBM LSF](https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_users_guide/chap_jobs_lsf.html).
To schedule a batch job:
bsub -N -nnodes 16 -W <time_limit> -C 1000000000 -o <stdout_file> -e <stderr_file> <run_script>
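If you don't already have a run script, here is a minimal sketch; it assumes jsrun is the parallel launcher (as on other Sierra-class LSF machines), and the resource-set layout, build directory, and input deck are placeholders to adapt:
#!/usr/bin/env bash
# Hypothetical run script: load the matching environment, then launch.
source /projects/empire/installs/vortex/CUDA-10.1.243_GNU-7.3.1_SPMPI-ROLLING-RELEASE-CUDA-STATIC/trilinos/latest/load_matching_env.sh
cd /path/to/your/build                 # placeholder build directory
# 16 nodes x 4 resource sets per node, 1 MPI rank and 1 GPU per resource set
jsrun -n 64 -a 1 -g 1 ./EMPIRE_PIC.exe input.yaml   # input.yaml is a placeholder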
The output and error files will only appear after the job has terminated. If you want to know what's happening sooner:
bpeek <job_id>
To see my jobs (both running and pending) summarized, I use:
bjobs -o "user: stat: jobid: job_name:25 submit_time: start_time: run_time: time_left: estimated_start_time:"
To see all jobs, add -u all to the end. If you want to know how wide the running jobs are (how many nodes each one uses), it's best to just use:
bjobs -u all
If you schedule multiple jobs and decide not to run them in the order they were submitted, you can move a specific job to the top of your list using:
btop <job_id>
To kill a job, running or pending:
bkill <job_id>
To put a job on hold or release it:
bstop <job_id>
bresume <job_id>
On Mutrino, you can learn more about using the machine by running less /opt/MUTRINO_INTRO after logging in.
- To grab the current selection of modules/Trilinos (with RDC required):
module swap intel/19.0.4 intel/18.0.5
module unload cray-libsci/19.02.1
source /projects/empire/installs/mutrino/INTEL-18.0.5_MPICH-7.7.6-RELEASE-OPENMP-STATIC/trilinos/latest/load_matching_env.sh
module unload cmake/3.9.0
module load cmake/3.14.6
- This is the build script I use for basic builds:
#!/usr/bin/env bash
# Exit on error and echo each command as it runs.
set -ex
# Arguments are the same as in the Vortex script above:
# <empire-dir> [ <trace-enabled=0> ] [ <build-type=Release> ]
empire=$1
if test $# -eq 0
then
    echo "usage: $0 <empire-dir> [ <trace-enabled=0> ] [ <build-type=Release> ]"
    exit 1
fi
if test $# -gt 1
then
    trace=$2
else
    trace=0
fi
if test $# -gt 2
then
    build_type=$3
else
    build_type=Release
fi
# Both the configure and the build run on a compute node via srun.
srun cmake -DUSE_STANDARD_LINKER=ON -DCMAKE_EXPORT_COMPILE_COMMANDS=true -DEMPIRE_ENABLE_PIC=ON -Dvt_trace_enabled=${trace} -DCMAKE_BUILD_TYPE=${build_type} ${empire}
srun make -j32 EMPIRE_PIC.exe
Note that the srun before make will build on a compute node, which has the benefit of allowing you to schedule execution as soon as the build job successfully completes:
sbatch -d afterok:<make_job_id> <run_script>
If your make command fails, the run job will get stuck in the queue, so point its dependency at the new build job using:
scontrol update Dependency=afterok:<new_make_job_id> <run_job_id>
or remove the dependency manually when it's finally built:
scontrol update Dependency= <run_job_id>
Note the space between the equal sign and the next argument.
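If you'd rather submit the build itself as a batch job instead of an interactive srun, sbatch --parsable prints the new job's ID on stdout, which makes wiring up the afterok dependency easy to script. A minimal sketch, with build.sh and run_script.sh as placeholder names:
# Submit the build and capture its job ID, then queue the run behind it.
build_job_id=$(sbatch --parsable build.sh)
sbatch -d afterok:${build_job_id} run_script.sh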
If you want to build on the head node instead, remove srun from before the make command, but not from the cmake command.
To run an interactive job on Mutrino:
salloc -C haswell -N 64 -t <time_limit> /bin/bash
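Once the allocation starts, launch the executable from the resulting shell with srun; a minimal sketch, with the rank count and input deck left as placeholders:
# <num_ranks> and <input_file> are placeholders; adjust for your problem.
srun -n <num_ranks> ./EMPIRE_PIC.exe <input_file>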
On Stria, to grab the current selection of modules/Trilinos:
source /projects/empire/installs/stria/ARM-20.0_OPENMPI-4.0.2-RELEASE-OPENMP-STATIC/trilinos/latest/load_matching_env.sh