buildtest-nersc

This repository contains tests for Cori and Perlmutter using the buildtest framework.

Useful Links

Buildtest References

Setup

To get started, connect to a NERSC system and clone this repo along with buildtest:

git clone https://github.com/buildtesters/buildtest
git clone https://github.com/buildtesters/buildtest-nersc

You will need Python 3.7 or higher to install buildtest. On Cori/Perlmutter you can get it by loading the python module and creating a conda environment as shown below.

module load python
conda create -n buildtest
conda activate buildtest

Now install buildtest. Assuming you cloned buildtest into your $HOME directory, source the setup script. csh users need to source setup.csh instead.

source ~/buildtest/setup.sh

# csh users
source ~/buildtest/setup.csh

Next, navigate to the buildtest-nersc directory and set the environment variable BUILDTEST_CONFIGFILE to point to config.yml, which is the configuration file for the NERSC systems.

cd buildtest-nersc
export BUILDTEST_CONFIGFILE=$(pwd)/config.yml

Make sure the configuration is valid by running the following command, which validates the configuration file against the JSON schema:

buildtest config validate
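
The export and validation steps above can be wrapped in a small helper that refuses to set the variable when config.yml is missing. This is only a minimal sketch: the function name use_nersc_config is hypothetical and is not part of buildtest.

```shell
# Hypothetical helper (not part of buildtest): point BUILDTEST_CONFIGFILE
# at config.yml inside the given clone of buildtest-nersc.
use_nersc_config() {
    repo_dir="${1:-$(pwd)}"
    config="$repo_dir/config.yml"
    # Fail early instead of exporting a path that does not exist.
    if [ ! -f "$config" ]; then
        echo "config.yml not found in $repo_dir" >&2
        return 1
    fi
    export BUILDTEST_CONFIGFILE="$config"
    echo "BUILDTEST_CONFIGFILE=$BUILDTEST_CONFIGFILE"
}
```

A typical call would be use_nersc_config ~/buildtest-nersc && buildtest config validate, so that validation only runs when the file was found.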

Please make sure you are using the tip of the devel branch of buildtest when writing tests. You should sync your local devel branch with upstream; for more details see the contributing guide.

The first time around, you should discover all buildspecs by running buildtest buildspec find. The command below finds and validates all buildspecs in the buildtest-nersc repo and loads them into the buildspec cache. Note that you must pass --root to specify where the buildspecs are located; buildspec_root is not set in the configuration file because there is no central location where this repo resides.

cd buildtest-nersc
buildtest buildspec find --root buildspecs --rebuild -q

The buildspecs are loaded into the buildspec cache file (JSON), which buildtest buildspec find uses when querying the cache. Subsequent runs read from the cache. For more details see the buildspec interface.

Building Tests

Note: All tests are written in YAML using .yml extension

To build tests, use the buildtest build command. For example, we can build all tests in the system directory as follows:

buildtest build -b system/

You can specify multiple buildspecs, either files or directories, via the -b option:

buildtest build -b slurm/partition.yml -b slurmutils/

You can exclude a buildspec via the -x option. It behaves the same way as -b, so you can specify a directory or a file path, either absolute or relative. This is useful when you want to run multiple tests grouped in a directory but exclude a few.

buildtest build -b slurm -x slurm/sinfo.yml

buildtest can run tests by tag, which is useful for grouping tests. To see the list of available tags, run: buildtest buildspec find --tags

For instance if you want to run all lustre tests you can run the following:

buildtest build --tags lustre

For more details on buildtest, please see the buildtest tutorial.

Tags Breakdown

When you write buildspecs, please attach one or more tags to each test so that it gets picked up by one of the CI checks. The tags are summarized below:

  • daily - this tag is used for running daily system checks using GitLab CI. Tests should run relatively quickly.
  • system - this tag is used for classifying all system tests that may include: system configuration, servers, network, cray tests. This tag should be used
  • slurm - this tag is used for Slurm tests, including Slurm utility checks, Slurm controller checks, etc. Do not use this tag for job submission; that is covered by the jobs tag. Tests tagged slurm should be short-running and use a Local Executor.
  • jobs - this tag is used for testing slurm policies by submitting jobs to scheduler.
  • compile - this tag is used for compilation of application (OpenMP, MPI, OpenACC, CUDA, upc, bupc, etc...)
  • e4s - this tag is used for running tests for E4S stack via spack test or E4S Testsuite.
  • module - this tag is used for testing module system
  • benchmark - this tag is used for benchmark tests. This can be application benchmarks, mini-benchmarks, kernels, etc...

You can see a breakdown of tags and a buildspec summary with the following commands:

buildtest buildspec summary
buildtest buildspec find --group-by-tags
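
If you want to run several tag groups one after another, a short loop around buildtest build --tags is enough. The sketch below is an illustration under assumptions: build_tags is a hypothetical helper name, and the loop stops only when the buildtest command itself exits with a non-zero status.

```shell
# Hypothetical helper: run `buildtest build --tags <tag>` for each tag given,
# stopping at the first invocation that exits non-zero.
build_tags() {
    for tag in "$@"; do
        echo "==> building tests tagged '$tag'"
        buildtest build --tags "$tag" || return 1
    done
}
```

For example, build_tags daily system would build the daily tests first and then the system tests.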

Querying Tests

You can use buildtest report and buildtest inspect to query tests. The two commands differ slightly in what they show and how they present it. buildtest report prints output in tabular form and shows only some of the metadata. If you want the entire test record, use buildtest inspect, which displays the content in JSON format. For more details on querying tests see https://buildtest.readthedocs.io/en/devel/gettingstarted/query_test_report.html

CI Setup

Tests run on a schedule, with each schedule corresponding to one GitLab job in .gitlab-ci.yml. The scheduled pipelines are configured at https://software.nersc.gov/NERSC/buildtest-nersc/-/pipeline_schedules. Because there are multiple GitLab jobs, each schedule defines a variable TESTNAME that controls which job runs. In .gitlab-ci.yml, the jobs use conditional rules via the only keyword.

The scheduled jobs run at different intervals (1x/day, 1x/week, etc.) and at different times of day to avoid overloading the system. Most GitLab jobs select tests by tag; some instead run all tests in a directory (buildtest build -b apps). To add a new scheduled job, define a new schedule with an appropriate time, set its target branch to devel, and define the variable TESTNAME in that schedule with a unique value that distinguishes it from the other scheduled jobs. Then create a job in .gitlab-ci.yml that runs when TESTNAME has that value.
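
The way TESTNAME selects what gets built can be pictured as a simple dispatch. The real pipeline expresses this with only rules in .gitlab-ci.yml, so the shell sketch below merely illustrates the idea. The function name build_args_for and the TESTNAME values daily, system, and apps are assumptions chosen for the example, not the names used in the actual schedules.

```shell
# Illustrative only: map a schedule's TESTNAME value to buildtest arguments.
# The TESTNAME values below are hypothetical examples.
build_args_for() {
    case "$1" in
        daily)  echo "--tags daily" ;;
        system) echo "--tags system" ;;
        apps)   echo "-b apps" ;;
        *)      echo "unknown TESTNAME: $1" >&2; return 1 ;;
    esac
}
```

A job script could then call buildtest build $(build_args_for "$TESTNAME"), leaving the substitution unquoted so the two words are passed as separate arguments.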

Gitlab Runner

This project runs CI jobs using the collaboration account bdtest. You can log in as this user from your laptop. We recommend using sshproxy so you can avoid typing a password for every ssh connection.

Once you are logged in, you can check the status of the GitLab runner using systemctl. For instance, to check the status of the runner on Perlmutter, run:

bdtest@perlmutter:login40:~> systemctl --user status perlmutter-bdtest
● perlmutter-bdtest.service - Gitlab runner for bdtest on perlmutter
     Loaded: loaded (/global/homes/b/bdtest/.config/systemd/user/perlmutter-bdtest.service; enabled; vendor preset: disabled)
     Active: active (running) since Tue 2023-05-23 09:20:18 PDT; 34min ago
   Main PID: 60983 (gitlab-runner)
      Tasks: 1231 (limit: 39321)
     Memory: 3.2G
        CPU: 1d 11h 23min 24.701s
     CGroup: /user.slice/user-99914.slice/user@99914.service/app.slice/perlmutter-bdtest.service

The systemd service configuration files (*.service) are located in the directory $HOME/.config/systemd/user, as shown below.

bdtest@perlmutter:login40:~> ls $HOME/.config/systemd/user/*.service
/global/homes/b/bdtest/.config/systemd/user/muller-bdtest.service  /global/homes/b/bdtest/.config/systemd/user/perlmutter-bdtest.service

If you want to start/stop/restart the service you can do the following:

# restart service
systemctl --user restart perlmutter-bdtest

# stop service
systemctl --user stop perlmutter-bdtest

# start service 
systemctl --user start perlmutter-bdtest
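
To avoid restarting a runner that is already healthy, you can check its state first with systemctl is-active. The following is a minimal sketch; ensure_runner is a hypothetical helper name, not something shipped with this repo.

```shell
# Hypothetical helper: start the user service only if it is not active.
ensure_runner() {
    service="$1"
    # `is-active --quiet` prints nothing and exits 0 when the unit is active.
    if systemctl --user is-active --quiet "$service"; then
        echo "$service is already running"
    else
        echo "$service is not running; starting it"
        systemctl --user start "$service"
    fi
}
```

For example, ensure_runner perlmutter-bdtest leaves a running service untouched and starts it otherwise.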

The gitlab runner configuration is stored in $HOME/.gitlab-runner including the jacamar configuration (jacamar.toml).

Integrations

This project is integrated with Slack and posts CI build notifications to the #buildtest-nersc channel in the buildtest Slack workspace. The integrations can be found at https://software.nersc.gov/NERSC/buildtest-nersc/-/settings/integrations.

The GitLab project https://software.nersc.gov/NERSC/buildtest-nersc has a GitHub integration, set up at https://software.nersc.gov/NERSC/buildtest-nersc/-/settings/repository, which is used for pull mirroring and running CI/CD jobs. Pull mirroring is configured on the same page under Mirroring Repositories.

CDASH

buildtest pushes test results to the CDASH server at https://my.cdash.org/index.php?project=buildtest-nersc using the buildtest cdash upload command.

Contributing Guide

Before you contribute, make sure your buildspec is valid. You can do this by running the test manually with buildtest build, or by checking whether the buildspec is valid via buildtest buildspec find. It is also good practice to run your test and confirm it works as expected; you can view test details using buildtest inspect name <testname> or buildtest inspect query <testname>. For more details on querying tests, see https://buildtest.readthedocs.io/en/devel/gettingstarted/query_test_report.html.
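
The build-then-inspect check can be scripted so it is easy to repeat before each contribution. This is a sketch under assumptions: precheck_buildspec is a hypothetical helper, and it relies only on the buildtest build -b and buildtest inspect name commands mentioned in this README.

```shell
# Hypothetical pre-contribution check: build one buildspec, then show the
# record for one of its tests.
precheck_buildspec() {
    buildspec="$1"
    testname="$2"
    if ! buildtest build -b "$buildspec"; then
        echo "build failed for $buildspec" >&2
        return 1
    fi
    buildtest inspect name "$testname"
}
```

Call it with the path to your buildspec and the name of a test defined in it, in that order.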

If you want to contribute your tests, please see CONTRIBUTING.md

Submitting an Issue

Please submit all issues to https://github.com/buildtesters/buildtest-nersc/issues. When creating an issue, please review the available labels and select one or more to categorize it. Use the following labels depending on the type of issue you are reporting