pysqa


High-performance computing (HPC) does not have to be hard. The aim of the Python Simple Queuing System Adapter (pysqa) is to make submitting tasks from Python to HPC clusters as easy as starting another subprocess locally. This rests on the assumption that, even though modern HPC queuing systems offer a wide range of configuration options, most users submit the majority of their jobs with very similar parameters.

Therefore, pysqa users define submission script templates once and reuse them to submit many different tasks and workflows afterwards. These templates are written in the jinja2 template language, so existing submission scripts can easily be converted to templates. In addition to submitting new tasks to HPC queuing systems, pysqa allows users to track the progress of their tasks, delete them, or enable reservations using the built-in functionality of the queuing system. Finally, pysqa enables remote connections to HPC clusters over SSH, including support for two-factor authentication via pyauthenticator. This allows users to submit tasks from a Python process on their local workstation to remote HPC clusters.
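The "define once, reuse many times" idea can be sketched with a small example. The template below uses jinja2-style `{{ name }}` placeholders; the SLURM directives and placeholder names are illustrative assumptions rather than pysqa defaults. To keep the sketch free of dependencies, a tiny regex-based `render` function (hypothetical, not part of pysqa or jinja2) stands in for the jinja2 engine and handles only plain variable substitution.

```python
import re

# A submission script template in jinja2-style syntax. The directives and
# placeholder names are illustrative assumptions, not pysqa defaults.
TEMPLATE = """#!/bin/bash
#SBATCH --job-name={{job_name}}
#SBATCH --ntasks={{cores}}
#SBATCH --time={{run_time_minutes}}

{{command}}
"""

# Matches {{ name }} with optional whitespace around the name.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, **values) -> str:
    """Minimal stand-in for jinja2: replace each {{ name }} with str(values[name]).

    Raises KeyError if a placeholder has no matching value.
    """
    return _PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), template)


# The same template is reused for two different tasks.
script_a = render(
    TEMPLATE, job_name="relax", cores=4, run_time_minutes=30,
    command="python relax.py",
)
script_b = render(
    TEMPLATE, job_name="md", cores=16, run_time_minutes=120,
    command="python md.py",
)
print(script_a)
```

Only the values change between submissions; the script structure, and with it the cluster-specific knowledge, stays in the template.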

All of this functionality is available from both the Python interface and the command line interface.

Features

The core feature of pysqa is the communication with HPC queuing systems (Flux, LSF, MOAB, SGE, SLURM and TORQUE). This includes:

  • QueueAdapter().submit_job() - Submission of new tasks to the queuing system.
  • QueueAdapter().get_queue_status() - List of calculations currently waiting or running on the queuing system.
  • QueueAdapter().delete_job() - Deletion of calculations that are currently waiting or running on the queuing system.
  • QueueAdapter().queue_list - List of available queue templates created by the user.
  • QueueAdapter().config - Templates restricted to a specific number of cores, run time or other computing resources, with integrated checks that a submitted task respects these restrictions.
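The kind of check described in the last item can be illustrated with a short, self-contained sketch. This is not pysqa's implementation: the `QueueLimits` class, its field names and the `check_request` function are hypothetical and only show how a request might be validated against the limits of a queue template before submission.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QueueLimits:
    """Hypothetical resource limits attached to one queue template."""

    cores_min: int
    cores_max: int
    run_time_max: int  # seconds


def check_request(
    limits: QueueLimits, cores: int, run_time: Optional[int] = None
) -> None:
    """Raise ValueError if the request falls outside the queue's limits."""
    if not limits.cores_min <= cores <= limits.cores_max:
        raise ValueError(
            f"cores={cores} outside allowed range "
            f"[{limits.cores_min}, {limits.cores_max}]"
        )
    if run_time is not None and run_time > limits.run_time_max:
        raise ValueError(
            f"run_time={run_time}s exceeds maximum of {limits.run_time_max}s"
        )


# Illustrative limits: 1 to 40 cores, at most 24 hours.
limits = QueueLimits(cores_min=1, cores_max=40, run_time_max=24 * 3600)

# A request within the limits passes silently.
check_request(limits, cores=8, run_time=3600)

# A request for too many cores is rejected before anything is submitted.
try:
    check_request(limits, cores=64)
except ValueError as err:
    print(f"rejected: {err}")
```

Rejecting a request locally avoids a round trip to the queuing system for a job that would be refused anyway.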

In addition to these core features, pysqa is continuously extended to support more use cases for a larger group of users. These new features include the support for remote queuing systems:

  • Remote connection via the secure shell protocol (SSH) to access remote HPC clusters.
  • Transfer of files to and from remote HPC clusters, based on a predefined mapping of the remote file system into the local file system.
  • Support for both individual and continuous connections, depending on network availability.
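The mapping of the remote file system into the local one can be pictured as a simple translation between two root directories. The sketch below is an assumption about how such a mapping might work, not pysqa's code: the function names and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def to_remote(local_path: str, local_root: str, remote_root: str) -> str:
    """Translate a path under local_root to the matching path under remote_root.

    Raises ValueError if local_path does not lie under local_root.
    """
    relative = PurePosixPath(local_path).relative_to(local_root)
    return str(PurePosixPath(remote_root) / relative)


def to_local(remote_path: str, local_root: str, remote_root: str) -> str:
    """Inverse mapping: translate a remote path back to the local file system."""
    return to_remote(remote_path, local_root=remote_root, remote_root=local_root)


# Hypothetical mapping between a workstation directory and a cluster directory.
LOCAL_ROOT = "/home/alice/projects"
REMOTE_ROOT = "/scratch/alice/projects"

remote_file = to_remote(
    "/home/alice/projects/job_1/input.txt", LOCAL_ROOT, REMOTE_ROOT
)
print(remote_file)
```

With a fixed pair of roots, every file in a working directory has exactly one counterpart on the other side, so files can be copied in either direction without further configuration per job.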

Finally, work is in progress to support combining multiple local and remote queuing systems within pysqa, presented to the user as a single resource.

Documentation

License

pysqa is released under the BSD license. It is a spin-off of the pyiron project; therefore, if you use pysqa for calculations that result in a scientific publication, please cite:

@article{pyiron-paper,
  title = {pyiron: An integrated development environment for computational materials science},
  journal = {Computational Materials Science},
  volume = {163},
  pages = {24--36},
  year = {2019},
  issn = {0927-0256},
  doi = {10.1016/j.commatsci.2018.07.043},
  url = {http://www.sciencedirect.com/science/article/pii/S0927025618304786},
  author = {Jan Janssen and Sudarsan Surendralal and Yury Lysogorskiy and Mira Todorova and Tilmann Hickel and Ralf Drautz and Jörg Neugebauer},
  keywords = {Modelling workflow, Integrated development environment, Complex simulation protocols},
}