Flywheel HPC Client

The HPC Client is a self-service solution that allows Flywheel jobs and gears to run on a High Performance Computing (HPC) environment. Use on-premise hardware that's already available for highly concurrent scientific workloads!

Project Status: Prototype. You may run into some rough edges, and will need to work in tandem with Flywheel staff.


Architecture

(Architecture diagram: hpc-client-architecture, 2021-07-26)

HPC types

The client, also called Cast, can support several queue mechanisms out of the box. Flywheel, however, currently only provides support for Slurm. If you require assistance with other schedulers, contact Flywheel.

Common name                 Code name
IBM Spectrum LSF            lsf
Oracle / Sun Grid Engine    sge
Slurm                       slurm

If your site uses one of these schedulers, Cast may need only a config file to get running.
Otherwise, some light Python development will be required.
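
To give a sense of what that light Python development might involve, here is a minimal sketch of a hypothetical queue type. The base class, method names, and the PBS/Torque example are illustrative assumptions, not Cast's actual interface; the guide for adding a queue type (step 5 below) describes the real extension points.

```python
# Illustrative sketch only: QueueBase, its method names, and the PBS/Torque
# example are assumptions for demonstration, not Cast's actual interface.
import subprocess


class QueueBase:
    """Hypothetical base class for a scheduler integration."""

    def format_script(self, job):
        """Render the batch script that will run one Flywheel job."""
        raise NotImplementedError

    def submit(self, script_path):
        """Hand the rendered script to the scheduler and return its job ID."""
        raise NotImplementedError


class PbsQueue(QueueBase):
    """Example wiring for a scheduler Cast does not ship with (PBS/Torque)."""

    def format_script(self, job):
        # Wrap the job's command (e.g. a Singularity invocation) in a PBS script.
        return "\n".join([
            "#!/bin/bash",
            f"#PBS -N fw-{job['id']}",
            f"#PBS -l ncpus={job.get('cpus', 1)}",
            job["command"],
        ])

    def submit(self, script_path):
        # qsub prints the new scheduler job ID on success.
        result = subprocess.run(
            ["qsub", script_path], capture_output=True, text=True, check=True
        )
        return result.stdout.strip()
```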

Minimum requirements

Reference this article for the minimum software and computing requirements of the system where the HPC Client will be installed.

Getting started

  1. Before using Cast, you need to decide how it will run on your cluster.
    Choose an integration method and keep it in mind for later. This determines how frequently Cast will look for, pull, and queue HPC jobs from your Flywheel site to your HPC (a purely illustrative polling sketch appears after this list).

  2. It is strongly recommended that you make a private GitHub repo to track your changes.
    This will make Cast much easier to manage.

  3. Perform the initial cluster setup. If you are unfamiliar with
    Singularity, it is recommended that you read, at a minimum, SingularityCE's introduction
    and quick start guides.

  4. Create an authorization token so Singularity and Flywheel can work with each other.

  5. If your queue type is not in the above table, or is sufficiently different, review the guide for adding a queue type.

  6. Collaborate with Flywheel staff to install the engine binaries. They will also configure the hold engine on your Flywheel site to ensure that other engines do not pick up gear jobs tagged with "hpc".

  7. Complete the integration method you chose in step one.
    Confirm Cast is running regularly by monitoring logs/cast.log and the Flywheel user interface.

  8. Run your first HPC job tests in collaboration with Flywheel. It is recommended that you test with MRIQC (non-BIDS version), a gear available from Flywheel's Gear Exchange. Note: as of 11 May 2022, Flywheel will have to change the rootfs-url (the location where the Docker image resides) for any gears installed from the Gear Exchange. For more about how Cast uses a rootfs-url, see the Background/Motivation section of this article.

  9. Enjoy!
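
Regarding the integration method chosen in step 1: the loop below is a purely illustrative stand-in for whatever scheduling mechanism you pick (in practice, something like a cron entry). The entrypoint path and polling interval are assumptions, not part of the documented setup.

```python
# Purely illustrative: stands in for the integration method chosen in step 1.
# The entrypoint path and the five-minute interval are assumptions.
import subprocess
import time

CAST_ENTRYPOINT = ["python3", "src/cast.py"]  # hypothetical path to Cast's entrypoint
POLL_INTERVAL_SECONDS = 300                   # how often to check the Flywheel site

while True:
    # Each run gives Cast a chance to look for, pull, and queue "hpc"-tagged jobs.
    subprocess.run(CAST_ENTRYPOINT, check=False)
    time.sleep(POLL_INTERVAL_SECONDS)
```

In practice you would rely on the cluster's own scheduling facilities (such as cron) rather than a long-running process like this; the point is only that something must trigger Cast on a regular cadence.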
