Skip to content

childmindresearch/slurm_testing

Repository files navigation

Regression Testing on SLURM Clusters (Bridges-2)

This Github repo allows C-PAC developers to run full regression tests on Bridges-2, a cluster under Pittsburgh Computing Center. The scripts in this repo allow you to submit sbatch jobs on Bridges-2.

Note: this repository is currently in the process of transitioning to a CI-initiated flow from a manually initiated flow and contains code and documentation for both during the transition period.

GitHub Actions initiated

Launch a 'lite' regression test run

Set up a GitHub Actions workflow configuration file to call .github/scripts/launch_regtest_lite.SLURM. Use contexts, secrets and environment variables to pass the required variables to the SLURM script from GitHub Actions.

Required variables

  • HOME_DIR: Home directory of the machine user on the remote server (e.g., /ocean/projects/med####p/${USERNAME}).
  • IMAGE: The name and tag of the image to test (e.g., ghcr.io/fcp-indi/c-pac:nightly).
  • OWNER: The owner of the C-PAC repository being tested (e.g., FCP-INDI).
  • PATH_EXTRA: Any paths that need to be added to PATH to run these scripts on the remote server as the machine user.
  • REPO: The name of the C-PAC repository being tested (e.g., C-PAC).
  • SHA: The SHA of the commit or name of the branch or tag being tested.

Steps for launch from GitHub Actions

  1. Get branch to test (e.g.).
  2. Configure action to authenticate to remote server (e.g.).
  3. Initiate GitHub Check for test runs as "pending" (e.g).
  4. Launch the test on the remote server. We're currently using the GitHub Actions Marketplace Action SSH Remote Commands by Bo-Yi Wu to facilitate this step (e.g.).
  5. Remove the configuration from step 2 above (e.g.)
Example call to launch script
sbatch \
  --export="HOME_DIR=${HOME_DIR},IMAGE=${IMAGE},OWNER=${OWNER},PATH_EXTRA=${PATH_EXTRA},REPO=${REPO},SHA=${SHA}" \
  --output="${WORK_DIR}/logs/${SHA}/out.log" \
  --error="${WORK_DIR}/logs/${SHA}/error.log" \
  slurm_testing/.github/scripts/launch_regtest_lite.SLURM

Once launched, the code from this repository will orchestrate the launches (and eventually the correlations and reporting).

See :octocat: FCP-INDI/C-PAC/.github/workflows/regression_test_lite.yml for an example GitHub Actions workflow configuration file that calls this script.

What this repository does once launched

graph TB

launch_regtest_lite --> build_image{build_image}
build_image --"success"--> regtest_lite --> status_regtest_lite_success
build_image --"failure"--> status_regtest_lite_failure
regtest_lite --> launch_jobs
subgraph launch_jobs["for PIPELINE in ${PRECONFIGS} for DATA in ${DATA_SOURCE}"]
  reglite{"reglite_${IMAGE_NAME}_${PIPELINE}_${DATA}"}
  gh_initiate_check["Initiate Check\n(GitHub Check for specific run)"]
end

subgraph status_regtest_lite["status_regtest_lite\n(Update GitHub Check status for launch)"]
  status_regtest_lite_success[[success]]
  status_regtest_lite_failure[[failure]]
end

status_regtest_lite_failure --> push_to_github["push_to_github\n(push logs to GitHub)"]

reglite --> FULL_SUCCESS_DEPENDENCIES --> push_to_github
reglite --"success"--> correlate_regtest_lite
reglite --"failure"--> failed_regtest_lite["failed_regtest_lite\n(Update GitHub Check status for specific run)"]

correlate_regtest_lite -. "correlations not connected yet,\nbut will feed into dependencies\nonce they are" .-> FULL_SUCCESS_DEPENDENCIES

subgraph FULL_SUCCESS_DEPENDENCIES["${FULL_SUCCESS_DEPENDENCIES}\n(SLURM job statuses)"]
  
end
Loading

Manually initiated

Installation Guide

  1. SSH into Bridges-2

  2. In your project home directory (typically /ocean/projects/med####p/{username}), clone this repo

    git clone [email protected]:amygutierrez/slurm_testing.git
  3. You're ready to start testing! 🧑‍💻

What does a C-PAC Regression Test entail?

Regression testing for C-PAC means that certain pipelines and certain datasets will be used for testing. Full regression testing requires ALOT of computaional resources, so will need to run this on a cluster.

Regression testing pipelines tested:

  • default
  • benchmark-FNIRT
  • fmriprep-options
  • ndmg
  • fx-options
  • abcd-options
  • ccs-options
  • rodent
  • monkey

Regression testing datasets used:

  • KKI (5 subjects)
  • HNU_1 (5 subjects)
  • Site-CBIC (4 subjects)
  • Site-SI (3 subjects)

Script Details

This script will run the singularity image provided against the pipelines and datasets detailed above. Every pipeline has fixed the random_seed value to 77742777

Arguments:
--username {username}                             Provide your Bridges-2 username
--out_dir {path/to/desired/output/directory}      Provie the absolute path for the regression test outputs. 
--image_dir {path/to/image.sif}                   Provide the absolute path to the singularity image you want to use
EXAMPLE

To run this script on Bridges-2

bash "/${PATH_TO_REPO}/regression_run_scripts/regtest_job_seed.sh" --username "${USERNAME}" \
--out_dir "/ocean/projects/med####p/${USERNAME}/regression_test" \
--image_dir "/ocean/projects/med####p/${USERNAME}/cpac_nightly.sif"

This script will run the singularity image provided with CPAC branch changes against the pipelines and datasets detailed above. Every pipeline has fixed the random_seed value to 77742777

Arguments:
--username {username}                             Provide your Bridges-2 username
--out_dir {path/to/desired/output/directory}      Provie the absolute path for the regression test outputs. 
--image_dir {path/to/image.sif}                   Provide the absolute path to the singularity image you want to use
--cpac_dir {path/to/cpac/directory}               Provide the absolute path to CPAC git repository. Make sure C-PAC points
to desired branch you want to test
EXAMPLE

To run this script on Bridges-2

bash "/${PATH_TO_REPO}/regression_run_scripts/regtest_job_seed.sh" --username "${USERNAME}" \
--out_dir "/ocean/projects/med####p/${USERNAME}/regression_test" \
--image_dir "/ocean/projects/med####p/${USERNAME}/cpac_nightly.sif" \
--cpac_dir "/ocean/projects/med####p/${USERNAME}/C-PAC"

TIPS!

  • When running these scripts, run them inside a temporary directory that you can delete later. This is because bash scripts and slurm.out files will get written into the current directory. Example:
mkdir -p ./temp_reg_runs

cd ./temp_reg_runs

bash "/${PATH_TO_REPO}/regression_run_scripts/regtest_job_seed.sh" --username "${USERNAME}" \
--out_dir "/ocean/projects/med####p/${USERNAME}/regression_test" \
--image_dir "/ocean/projects/med####p/${USERNAME}/cpac_nightly.sif" \
--cpac_dir "/ocean/projects/med####p/${USERNAME}/C-PAC"
  • If you are testing a specific C-PAC branch, make sure that you pull the branch changes in the C-PAC directory before running regtest_job_cpac_code.sh

About

Run C-PAC regression tests via Slurm.

Resources

Stars

Watchers

Forks

Contributors 4

  •  
  •  
  •  
  •  

Languages