Software needs to be smart enough to adapt to the execution environment without needing
much input from the end user. At its core, Executors settles on a small set of high-level
operations e.g., `submit`, `update` (query), and `cancel` that seem to map quite well
across different computing environments. If you need to run your command on a local machine,
Slurm, LSF, or some futuristic execution environment, `executors` can do it!
Just use `pip`

```bash
pip install executors
```
The following job schedulers are supported

- `slurm` - Simple Linux Utility for Resource Management
- `lsf` - IBM Platform LSF
- `pbsubmit` - Martinos Center Torque wrapper
- `local` - Local job executor
The goal is to keep this section as short as possible so that you're up and running quickly. Here we go!
To start using Executors, you need to instantiate an `Executor` object. You can
do this in one of two ways. First, you can use `executors.probe` to
automatically discover the job scheduler present in your environment

```python
import executors

E = executors.probe('partition_name')
```
Alternatively, you can explicitly load an `Executor` using `executors.get`,
passing in the type of scheduler as the first argument

```python
import executors

E = executors.get('local')
```
Note that for schedulers other than `local`, you will need to pass in a second
argument called `partition_name`. This is referred to as a *queue* in some job
schedulers

```python
import executors

E = executors.get('slurm', 'partition_name')
```
Next, you have to create a `Job`. See below for descriptions of the supported
arguments

```python
from executors.models import Job

job = Job(
    name='job',
    command=['echo', 'Hello, World!'],
    memory='100M',
    time='10',
    output='~/job-%j.stdout',
    error='~/job-%j.stderr'
)
```
- `name`: Job name (required)
- `command`: Command to be executed (required)
- `memory`: Amount of memory to reserve e.g., `1000K`, `100M`, `10G`, `1TB` (required)
- `time`: Amount of time to reserve, in minutes (required)
- `cpus`: Number of processors to reserve (default=1)
- `gpus`: Number of GPUs to reserve
- `nodes`: Number of nodes to reserve (default=1)
- `output`: Path to standard output. Any occurrence of `%j` will be replaced with the Job ID.
- `error`: Path to standard error. Any occurrence of `%j` will be replaced with the Job ID.
- `parent`: Parent job object, a.k.a. a job dependency (see the sketch after this list).
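
For instance, here is a minimal sketch of chaining two jobs with `parent`. The
names and commands are made up for illustration, and exactly how (and when) the
dependency is enforced is up to the underlying scheduler:

```python
from executors.models import Job

# first stage: produces something the second stage depends on
job_a = Job(
    name='stage-a',
    command=['echo', 'stage a'],
    memory='100M',
    time='10'
)

# second stage: declared as a child of job_a via the parent argument,
# i.e. it should only run once job_a has completed
job_b = Job(
    name='stage-b',
    command=['echo', 'stage b'],
    memory='100M',
    time='10',
    parent=job_a
)
```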
Now you can submit your job using your executor object `E`. Once the job has
been submitted, the `job.pid` property will be set

```python
E.submit(job)

print(f'the job id is {job.pid}')
```
When the job finishes, you can check its `returncode`

```python
print(f'the job returncode is {job.returncode}')
```
Each `Job` has `active` and `returncode` properties. By default, these are
both set to `None`. These properties will be updated every time you call
`E.update(job)`

```python
E.update(job)

print(f'job {job.pid} has state {job.active} and returncode {job.returncode}')
```
Keep in mind that even though you have submitted a job, you may not be able to
immediately query its state. For this reason, Executors cannot guarantee that
calling `E.update` will update your `Job` state. If you want `E.update` to wait
until a job is able to be queried, add the argument `wait=True`

```python
E.update(job, wait=True)
```
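
Putting these pieces together, one straightforward pattern is to poll
`E.update` in a loop until the job is no longer active. This is only a sketch,
and the sleep interval is arbitrary:

```python
import time

E.submit(job)
E.update(job, wait=True)  # block until the scheduler can report on the job

# poll until the scheduler reports the job as no longer active
while job.active:
    time.sleep(10)  # arbitrary interval; tune for your scheduler
    E.update(job)

print(f'job {job.pid} finished with returncode {job.returncode}')
```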
Some job schedulers offer efficient ways to query the state of multiple jobs.
For that reason, if you have a `list` (or generator) of `Job` objects,
you can pass those to the `update_many` method

```python
E.update_many(jobs)
```
Some `Executor` objects optimize how `update_many` is fulfilled, while others
fall back to querying one job after another serially, which can result in poor
performance.
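
The same polling pattern shown earlier scales to a whole batch. As a sketch,
assuming `jobs` is a list of already-submitted `Job` objects whose states have
become queryable (see the `wait=True` caveat above), with an arbitrary polling
interval:

```python
import time

# refresh every job's state, in one call where the scheduler allows it
E.update_many(jobs)

while any(job.active for job in jobs):
    time.sleep(10)  # arbitrary interval; tune for your scheduler
    E.update_many(jobs)

print([job.returncode for job in jobs])
```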
There are times when you may want to submit several related jobs and control
them as a group. A common need is to cancel all remaining jobs if any single job
has failed. This is precisely what the `JobArray` class is for. Let's take a look
at an example

```python
from executors.models import JobArray

jobarray = JobArray(
    executor=E,
    cancel_on_fail=True
)

jobarray.add(job_a)
jobarray.add(job_b)

jobarray.submit()
jobarray.wait()
```
To impose rate limiting on the number of jobs submitted concurrently, use the
`limit` argument. For example, use `limit=1` to have only 1 job running at a
time

```python
jobarray.submit(limit=1)
```
It's fairly simple to extend Executors as you encounter new schedulers. First,
you must create a new module within the top-level `executors` module, for example

```
executors/awsbatch/__init__.py
```

Next, within this module, you must create a new `Executor` class that extends
`executors.models.AbstractExecutor`

```python
from executors.models import AbstractExecutor

class Executor(AbstractExecutor):
    ...
```

Finally, you need to add your executor to the `probe` and `get` functions within
`executors/__init__.py`.
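
To make the shape of a backend concrete, here is a hypothetical skeleton for
such a module. The method names below mirror the high-level operations
(`submit`, `update`, `cancel`) described at the top of this section, but the
exact abstract interface is an assumption; check
`executors.models.AbstractExecutor` for the methods you actually need to
implement:

```python
# executors/awsbatch/__init__.py -- hypothetical sketch, not a real backend
from executors.models import AbstractExecutor, Job


class Executor(AbstractExecutor):
    # NOTE: the method names and signatures below are assumptions modeled on
    # the high-level operations (submit, update, cancel); consult
    # executors.models.AbstractExecutor for the real interface

    def submit(self, job: Job):
        # translate the Job into a scheduler submission and set job.pid
        raise NotImplementedError

    def update(self, job: Job, wait=False):
        # query the scheduler and refresh job.active and job.returncode
        raise NotImplementedError

    def cancel(self, job: Job):
        # ask the scheduler to terminate the job
        raise NotImplementedError
```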