Merge pull request #287 from pyiron/binder
Add binder environment
jan-janssen authored Apr 26, 2024
2 parents e84fbd6 + ef882ce commit 5a81da1
Showing 6 changed files with 39 additions and 0 deletions.
1 change: 1 addition & 0 deletions .github/workflows/dependabot.yml
@@ -22,6 +22,7 @@ jobs:
package=$(echo "$PR_TITLE" | awk '{print $2}')
from=$(echo "$PR_TITLE" | awk '{print $4}')
to=$(echo "$PR_TITLE" | awk '{print $6}')
sed -i "/${package}/s/${from}/${to}/g" binder/environment.yml
sed -i "/${package}/s/${from}/${to}/g" .ci_support/environment.yml
sed -i "/${package}/s/${from}/${to}/g" .ci_support/environment-docs.yml
- name: UpdateDependabotPR commit
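The added `sed` line applies the existing version bump to `binder/environment.yml` as well. As a rough illustration of what the `awk` and `sed` calls do together, here is a minimal Python sketch. It assumes a pull request title of the form `Bump pandas from 2.2.1 to 2.2.2`, which is an assumption about the title format and not something shown in this diff; `parse_pr_title` and `bump_pins` are hypothetical helper names.

```python
def parse_pr_title(title: str) -> tuple[str, str, str]:
    """Mirror the three awk calls: fields 2, 4 and 6 of the whitespace-split title."""
    fields = title.split()
    return fields[1], fields[3], fields[5]


def bump_pins(text: str, package: str, old: str, new: str) -> str:
    """Mirror sed "/package/s/old/new/g": rewrite only lines that mention the package.

    Unlike sed, this uses literal string matching, so "." is not a wildcard here.
    """
    lines = []
    for line in text.splitlines():
        if package in line:
            line = line.replace(old, new)
        lines.append(line)
    return "\n".join(lines)


# Assumed example input; the pins are shortened from binder/environment.yml.
environment = "dependencies:\n- pandas =2.2.1\n- pyyaml =6.0.1"
package, old, new = parse_pr_title("Bump pandas from 2.2.1 to 2.2.2")
updated = bump_pins(environment, package, old, new)
print(updated)
```

Because the substitution is restricted to lines containing the package name, a different package pinned to the same version string is left untouched.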
1 change: 1 addition & 0 deletions README.md
@@ -3,6 +3,7 @@
[![Python package](https://github.com/pyiron/pysqa/workflows/Python%20package/badge.svg)](https://github.com/pyiron/pysqa/actions)
[![Documentation Status](https://readthedocs.org/projects/pysqa/badge/?version=latest)](https://pysqa.readthedocs.io/en/latest/?badge=latest)
[![Coverage Status](https://coveralls.io/repos/github/pyiron/pysqa/badge.svg?branch=main)](https://coveralls.io/github/pyiron/pysqa?branch=main)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/pyiron/pysqa/HEAD?labpath=notebooks%2Fexample.ipynb)

High-performance computing (HPC) does not have to be hard. In this context the aim of `pysqa` is to make submitting calculations to an HPC cluster as easy as starting another subprocess locally. This is achieved based on the assumption that even though modern HPC queuing systems offer a wide range of different configuration options, most users submit the majority of their jobs with very similar parameters.

14 changes: 14 additions & 0 deletions binder/environment.yml
@@ -0,0 +1,14 @@
channels:
- conda-forge
dependencies:
- defusedxml =0.7.1
- coverage
- pandas =2.2.2
- pyyaml =6.0.1
- jinja2 =3.1.3
- paramiko =3.4.0
- tqdm =4.66.2
- pympipool =0.8.1
- cloudpickle =3.0.0
- flux-core =0.59.0
- versioneer =0.28
16 changes: 16 additions & 0 deletions binder/kernel.json
@@ -0,0 +1,16 @@
{
"argv": [
"flux",
"start",
"/srv/conda/envs/notebook/bin/python",
"-m",
"ipykernel_launcher",
"-f",
"{connection_file}"
],
"display_name": "Flux",
"language": "python",
"metadata": {
"debugger": true
}
}
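The kernel specification above prefixes the usual `ipykernel_launcher` command with `flux start`, which appears intended to run the notebook kernel inside a Flux instance so that jobs can be submitted from the notebook. The sketch below rebuilds the same structure in Python to make that composition explicit; `make_flux_kernel_spec` is a hypothetical helper, not part of this commit.

```python
import json


def make_flux_kernel_spec(python_executable: str) -> dict:
    """Build a kernel spec whose launch command is wrapped in `flux start`."""
    # Standard ipykernel launch command; Jupyter substitutes {connection_file}.
    launch = [python_executable, "-m", "ipykernel_launcher", "-f", "{connection_file}"]
    return {
        "argv": ["flux", "start", *launch],
        "display_name": "Flux",
        "language": "python",
        "metadata": {"debugger": True},
    }


spec = make_flux_kernel_spec("/srv/conda/envs/notebook/bin/python")
print(json.dumps(spec, indent=2))
```

Keeping the interpreter path as a parameter makes it easy to see that only the first two `argv` entries differ from a plain Python kernel.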
6 changes: 6 additions & 0 deletions binder/postBuild
@@ -0,0 +1,6 @@
# jupyter kernel
mkdir -p /home/jovyan/.local/share/jupyter/kernels/flux
cp binder/kernel.json /home/jovyan/.local/share/jupyter/kernels/flux

# install pysqa from the repository checkout
pip install . --no-deps --no-build-isolation
1 change: 1 addition & 0 deletions notebooks/example.ipynb
@@ -0,0 +1 @@
{"metadata":{"language_info":{"name":"python","version":"3.10.12","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"},"kernelspec":{"name":"flux","display_name":"Flux","language":"python"}},"nbformat_minor":5,"nbformat":4,"cells":[{"cell_type":"markdown","source":"# Python Interface \nThe `pysqa` package primarily defines one class, that is the `QueueAdapter`. It loads the configuration from a configuration directory, initializes the corresponding adapter for the specific queuing system and provides a high level interface for users to interact with the queuing system. The `QueueAdapter` can be imported using:","metadata":{},"id":"097a5f9f-69a2-42ae-a565-e3cdb17da461"},{"cell_type":"code","source":"from pysqa import QueueAdapter","metadata":{"trusted":true,"tags":[]},"execution_count":1,"outputs":[],"id":"04e9d4a2-6161-448b-81cd-1c6f8689867d"},{"cell_type":"markdown","source":"After the initial import the class is initialized using the configuration directory specified by the `directory` parameter which defaults to `\"~/.queues\"`. In this example we load the configuration from the `test` directory: ","metadata":{},"id":"7e3cf646-d4e7-4b1e-ab47-f07342d7a5a2"},{"cell_type":"code","source":"qa = QueueAdapter(directory=\"../tests/config/flux\")","metadata":{"trusted":true,"tags":[]},"execution_count":2,"outputs":[],"id":"7e234eaf-80bc-427e-bd65-9acf70802689"},{"cell_type":"markdown","source":"This directory primarily contains two files, a `queue.yaml` file which contains the meta-data for the queuing system and one or multiple shell script templates. In this example there is one shell script template named `flux.sh`. The configuration files are explained in more detail in the [documentation](https://pysqa.readthedocs.io/en/latest/queue.html#flux). 
","metadata":{},"id":"514a7f2e-04ec-4fed-baa5-a181dace7123"},{"cell_type":"code","source":"!cat ../tests/config/flux/queue.yaml","metadata":{"trusted":true,"tags":[]},"execution_count":3,"outputs":[{"name":"stdout","text":"queue_type: FLUX\nqueue_primary: flux\nqueues:\n flux: {cores_max: 64, cores_min: 1, run_time_max: 172800, script: flux.sh}","output_type":"stream"}],"id":"272e7f10-3ae5-4902-aa30-fe62d8500e1f"},{"cell_type":"code","source":"!cat ../tests/config/flux/flux.sh","metadata":{"trusted":true,"tags":[]},"execution_count":4,"outputs":[{"name":"stdout","text":"#!/bin/bash\n# flux:--job-name={{job_name}}\n# flux: --env=CORES={{cores}}\n# flux: --output=time.out\n# flux: --error=error.out\n# flux: -n {{cores}}\n{%- if run_time_max %}\n# flux: -t {{ [1, run_time_max // 60]|max }}\n{%- endif %}\n\n{{command}}","output_type":"stream"}],"id":"87d12ef6-a34b-40d6-b383-0b9f548a66f3"},{"cell_type":"markdown","source":"The `queue.yaml` files and some templates for the most common queuing systems are defined below. By default `pysqa` supports the following variables for the submission script templates:\n\n* `job_name` - the name of the calculation which appears on the queuing system \n* `working_directory` - the directory on the file system the calculation is executed in \n* `cores` - the number of cores used for the calculation\n* `memory_max` - the amount of memory requested for the total calculation\n* `run_time_max` - the run time requested for a given calculation - typically in seconds \n* `command` - the command which is executed on the queuing system\n\nBeyond these standardized keywords, additional flags can be added to the template which are then available through the Python interface. 
\n","metadata":{},"id":"7d079e96-f919-42bd-b353-32f8c407ef22"},{"cell_type":"markdown","source":"# List available queues \nList available queues as a list of queue names: ","metadata":{},"id":"451180a6-bc70-4053-a67b-57357522da0f"},{"cell_type":"code","source":"qa.queue_list","metadata":{"trusted":true,"tags":[]},"execution_count":5,"outputs":[{"execution_count":5,"output_type":"execute_result","data":{"text/plain":"['flux']"},"metadata":{}}],"id":"88afd81d-08f3-4ba6-9f33-7baed9cc9149"},{"cell_type":"markdown","source":"List available queues in a pandas dataframe - this returns the information stored in the `queue.yaml` file as a `pandas.DataFrame`:","metadata":{},"id":"ff55f03f-3a51-437f-98cd-f6fd6b8afd40"},{"cell_type":"code","source":"qa.queue_view","metadata":{"trusted":true,"tags":[]},"execution_count":6,"outputs":[{"execution_count":6,"output_type":"execute_result","data":{"text/plain":" cores_max cores_min run_time_max memory_max\nflux 64 1 172800 None","text/html":"<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>cores_max</th>\n <th>cores_min</th>\n <th>run_time_max</th>\n <th>memory_max</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>flux</th>\n <td>64</td>\n <td>1</td>\n <td>172800</td>\n <td>None</td>\n </tr>\n </tbody>\n</table>\n</div>"},"metadata":{}}],"id":"16e44b12-5390-4128-b1ca-0fab463b8e9b"},{"cell_type":"markdown","source":"# Submit job to queue\nSubmit a job to the queue - if no queue is specified it is submitted to the default queue defined in the queue configuration:","metadata":{},"id":"42a53d33-2916-461f-86be-3edbe01d3cc7"},{"cell_type":"code","source":"queue_id = qa.submit_job(\n queue=None,\n job_name=None,\n working_directory=\".\",\n cores=None,\n memory_max=None,\n 
run_time_max=None,\n dependency_list=None,\n command='sleep 5',\n)\nqueue_id","metadata":{"trusted":true},"execution_count":7,"outputs":[{"execution_count":7,"output_type":"execute_result","data":{"text/plain":"114152177664"},"metadata":{}}],"id":"a3f2ba3a-0f82-4a0a-aa63-b5e71f8f8b39"},{"cell_type":"markdown","source":"The only required parameter is: \n* `command` the command that is executed as part of the job \n\nAdditional options for the submission of the job are:\n* `queue` the queue the job is submitted to. If this option is not defined the `queue_primary` defined in the configuration is used. \n* `job_name` the name of the job submitted to the queuing system. \n* `working_directory` the working directory the job submitted to the queuing system is executed in.\n* `cores` the number of cores used for the calculation. If the cores are not defined the minimum number of cores defined for the selected queue is used. \n* `memory_max` the memory used for the calculation. \n* `run_time_max` the run time for the calculation. If the run time is not defined the maximum run time defined for the selected queue is used. \n* `dependency_list` other jobs the calculation depends on. 
\n* `**kwargs` allows writing additional parameters to the job submission script if they are available in the corresponding template.\n","metadata":{},"id":"9aa0fdf9-0827-4706-bfed-6b95b95dd061"},{"cell_type":"markdown","source":"# Show jobs in queue \nGet status of all jobs currently handled by the queuing system:","metadata":{},"id":"672854fd-3aaa-4287-b29c-d5370e4adc14"},{"cell_type":"code","source":"qa.get_queue_status()","metadata":{"trusted":true},"execution_count":8,"outputs":[{"execution_count":8,"output_type":"execute_result","data":{"text/plain":" jobid user jobname status\n0 114152177664 jovyan None running","text/html":"<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>jobid</th>\n <th>user</th>\n <th>jobname</th>\n <th>status</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>114152177664</td>\n <td>jovyan</td>\n <td>None</td>\n <td>running</td>\n </tr>\n </tbody>\n</table>\n</div>"},"metadata":{}}],"id":"73518256-faf8-4fea-bc40-9b2198903bf5"},{"cell_type":"markdown","source":"With the additional parameter `user` a specific user can be defined to only list the jobs of this specific user. 
\n\nIn analogy the jobs of the current user can be listed with: ","metadata":{},"id":"9338f32f-b127-4700-8aba-25aded6b548f"},{"cell_type":"code","source":"qa.get_status_of_my_jobs()","metadata":{"trusted":true},"execution_count":9,"outputs":[{"execution_count":9,"output_type":"execute_result","data":{"text/plain":" jobid user jobname status\n0 114152177664 jovyan None running","text/html":"<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>jobid</th>\n <th>user</th>\n <th>jobname</th>\n <th>status</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>114152177664</td>\n <td>jovyan</td>\n <td>None</td>\n <td>running</td>\n </tr>\n </tbody>\n</table>\n</div>"},"metadata":{}}],"id":"cf6e59e8-f117-4d4a-9637-f83ec84c62fa"},{"cell_type":"markdown","source":"Finally, the status of a specific job with the queue id `queue_id` can be received from the queuing system using:","metadata":{},"id":"d2566873-2d30-4801-9d86-287a247fb7c6"},{"cell_type":"code","source":"qa.get_status_of_job(process_id=queue_id)","metadata":{"trusted":true},"execution_count":10,"outputs":[{"execution_count":10,"output_type":"execute_result","data":{"text/plain":"'running'"},"metadata":{}}],"id":"ee8e14db-cc6e-47e7-a1e5-035427ca83a9"},{"cell_type":"markdown","source":"# Delete job from queue \nDelete a job with the queue id `queue_id` from the queuing 
system:","metadata":{},"id":"f89528d3-a3f5-4adb-9f74-7f70270aec12"},{"cell_type":"code","source":"qa.delete_job(process_id=queue_id)","metadata":{"trusted":true,"tags":[]},"execution_count":11,"outputs":[{"execution_count":11,"output_type":"execute_result","data":{"text/plain":"''"},"metadata":{}}],"id":"06e1535b-eafd-4b94-ba33-ba24da088a33"},{"cell_type":"code","source":"","metadata":{},"execution_count":null,"outputs":[],"id":"e7ce1aee-5d8d-46b0-b7ec-44dd21646352"}]}
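The notebook states that unspecified `cores` fall back to the queue minimum and an unspecified `run_time_max` falls back to the queue maximum, and the `flux.sh` template turns the run time into the `-t` value with `[1, run_time_max // 60]|max`. The following is a minimal sketch of that arithmetic in plain Python, using the values from the `queue.yaml` shown in the notebook. It is not the `pysqa` implementation; `resolve_resources` and `flux_time_limit` are hypothetical names, and reading the result as minutes is an assumption based on the "typically in seconds" remark.

```python
# Queue limits copied from the queue.yaml printed in the notebook.
QUEUE = {"cores_max": 64, "cores_min": 1, "run_time_max": 172800}


def resolve_resources(queue: dict, cores=None, run_time_max=None) -> dict:
    """Apply the documented defaults: minimum cores and maximum run time of the queue."""
    if cores is None:
        cores = queue["cores_min"]
    if run_time_max is None:
        run_time_max = queue["run_time_max"]
    return {"cores": cores, "run_time_max": run_time_max}


def flux_time_limit(run_time_max: int) -> int:
    """Same arithmetic as `[1, run_time_max // 60]|max` in flux.sh."""
    return max(1, run_time_max // 60)


resources = resolve_resources(QUEUE)
minutes = flux_time_limit(resources["run_time_max"])
print(resources, minutes)
```

The `max` with 1 means that any run time shorter than 60 is rounded up to 1 instead of being written into the script as 0.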
