Customising your Snaketool
I've cookiecuttered the template, now what?
First, install it as a Python module:
cd my_snaketool && pip install -e .
An editable install lets you run and rerun the tool while developing your launcher, without reinstalling after each change.
Next, build your pipeline in the workflow directory.
The example simply uses a Snakefile and config.yaml, but you should add directories for conda environments, rules files, etc., following Snakemake best practices.
Finally, open up __main__.py in your favourite Python IDE and get to work!
Familiarising yourself with the Click command line interface for Python will help you through this guide.
If you have command line options that you want to use in more than one subcommand, you can add them to common_options().
Let's add an option to define a temporary directory:
def common_options(func):
    # Options shared by every subcommand that launches Snakemake
    options = [
        click.option('--output', help='Output directory', type=click.Path(),
                     default='my_snaketool.out', show_default=True),
+       click.option('--temp', help='Directory for temporary files', type=click.Path(),
+                    default='my_snaketool.temp', show_default=True),
        click.option('--configfile', default='config.yaml', help='Custom config file', show_default=True),
        click.option('--threads', help='Number of threads to use', default=1, show_default=True),
        click.option('--use-conda/--no-use-conda', default=True, help='Use conda for Snakemake rules',
                     show_default=True),
        click.option('--conda-frontend',
                     type=click.Choice(['mamba', 'conda'], case_sensitive=True),
                     default='mamba', help='Specify Conda frontend', show_default=True),
        click.option('--conda-prefix', default=snake_base(os.path.join('workflow', 'conda')),
                     help='Custom conda env directory', type=click.Path(), show_default=False),
        click.option('--snake-default', multiple=True,
                     default=['--rerun-incomplete', '--printshellcmds', '--nolock', '--show-failed-logs'],
                     help="Customise Snakemake runtime args", show_default=True),
        click.argument('snake_args', nargs=-1)]
    for option in reversed(options):
        func = option(func)
    return func
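It may not be obvious why common_options() applies the list in reverse. Decorators run bottom-up, so applying the shared options last-to-first gives the same result as stacking them top-to-bottom with @ syntax. The sketch below is a toy model in plain Python: option() and command() are simplified stand-ins written for this illustration, not Click's actual code, and the option names are just labels.

```python
def option(name):
    """Toy stand-in for click.option: record `name` on the decorated function."""
    def decorator(func):
        if not hasattr(func, "params"):
            func.params = []
        func.params.append(name)
        return func
    return decorator


def command(func):
    """Toy stand-in for click.command: decorators ran bottom-up, so flip the list."""
    func.params = list(reversed(getattr(func, "params", [])))
    return func


def common_options(func):
    """Apply a shared list of options, in the same way as the launcher does."""
    options = [option("--output"), option("--temp"), option("--threads")]
    for opt in reversed(options):
        func = opt(func)
    return func


@command
@option("--input")
@common_options
def run():
    pass


@command
@common_options
def install():
    pass


# Both subcommands share the common options, listed in declaration order.
print(run.params)
print(install.params)
```

Here run ends up with --input followed by the three shared options, while install only has the shared ones.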
The subcommand run() launches the main pipeline Snakefile.
This also demonstrates all the available options when calling the run_snakemake() function.
Most of the customisation you will want to do is probably adding command line arguments that are passed on to the configuration.
- Add the new args as click options
- Add them as parameters of the subcommand function
- Add them to the merge_config dictionary
That's it! The new options will be available within the Snakemake config dictionary.
@click.command(epilog=help_message_extra, context_settings={"ignore_unknown_options": True})
@click.option('--input', '_input', help='Input file/directory', type=str, required=True)
+@click.option('--search', type=click.Choice(['fast', 'slow'], case_sensitive=False), default='fast', help='Search setting', show_default=True)
@common_options
-def run(_input, output, **kwargs):
+def run(_input, search, temp, output, **kwargs):
    """Run My Snaketool"""
    merge_config = {
        'input': _input,
        'output': output,
+       'search': search,
+       'temp': temp
    }
    run_snakemake(
        snakefile_path=snake_base(os.path.join('workflow', 'Snakefile')),
        merge_config=merge_config,
        **kwargs
    )
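To make the last step concrete, here is a minimal sketch of how values in merge_config could end up in the Snakemake config dictionary. It assumes that command line values simply override values read from the config file. The helper merge_cli_into_config() and the example config keys are hypothetical and are not taken from the template's run_snakemake() implementation.

```python
import copy


def merge_cli_into_config(file_config, merge_config):
    """Hypothetical helper: command line values override config file values."""
    merged = copy.deepcopy(file_config)
    merged.update(merge_config)
    return merged


# Values as they might be read from config.yaml (made up for illustration)
file_config = {"search": "slow", "min_length": 500}

# Values collected by run() from the command line
merge_config = {
    "input": "reads.fastq",
    "output": "my_snaketool.out",
    "search": "fast",
    "temp": "my_snaketool.temp",
}

config = merge_cli_into_config(file_config, merge_config)

# Inside the Snakefile, rules would then read the values from `config`:
search_mode = config["search"]
temp_dir = config["temp"]
print(search_mode, temp_dir)
```

Keys that only exist in the config file (min_length here) are left untouched, so the file can still carry settings that have no command line option.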
Adding new subcommands is relatively easy. Say we have a super simple Snakemake script for installing the databases, such as this example: https://gist.github.com/beardymcjohnface/9b26614536410addf42fc794dd4cab35
Let's make it available with an install subcommand, e.g. my_snaketool install ...
Use run() as a template and strip out what you don't want.
We will only keep the common options for running Snakemake.
- Create the installation Snakemake script: workflow/install.smk
- Create the new subcommand function, install() (use run() as a template)
- Update the function docstring, which will become the help message for the subcommand
@click.command(epilog=help_message_extra, context_settings={"ignore_unknown_options": True})
-@click.option('--input', '_input', help='Input file/directory', type=str, required=True)
-@click.option('--search', type=click.Choice(['fast', 'slow'], case_sensitive=False), default='fast', help='Search setting', show_default=True)
@common_options
-def run(_input, search, temp, output, **kwargs):
+def install(**kwargs):
-   """Run My Snaketool"""
+   """Install databases"""
-   copy_config(configfile, system_config=snake_base(os.path.join('config', 'config.yaml')))
-
-   merge_config = {
-       'input': _input,
-       'output': output,
-       'search': search,
-       'temp': temp
-   }
    run_snakemake(
-       snakefile_path=snake_base(os.path.join('workflow', 'Snakefile')),
+       snakefile_path=snake_base(os.path.join('workflow', 'install.smk')),
-       merge_config=merge_config,
        **kwargs
    )
Lastly, add this function name to click's list of commands. Note the order of commands is preserved in the click help message.
cli.add_command(run)
+cli.add_command(install)
cli.add_command(config)
You may wish to add groups of targets to your Snakefile for defining different run stages.
For instance, your pipeline might perform preprocessing and assembly.
You can define alternative top-level rules to let users run specific stages of the pipeline.
Note: target_rules and the targetRule decorator are only needed for print_targets.
### In Snakefile ###
target_rules = []
def targetRule(fn):
    assert fn.__name__.startswith('__')
    target_rules.append(fn.__name__[2:])
    return fn
@targetRule
rule all:
    input:
        preprocessing_files,
        assembly_files
+@targetRule
+rule preprocessing:
+   input:
+       preprocessing_files
+
+@targetRule
+rule assembly:
+   input:
+       assembly_files
@targetRule
rule print_targets:
    run:
        print("\nTop level rules are: \n", file=sys.stderr)
        print("* " + "\n* ".join(target_rules) + "\n\n", file=sys.stderr)
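The bookkeeping done by targetRule can be tried outside Snakemake. The sketch below reuses the decorator from the Snakefile and applies it to ordinary functions whose names start with a double underscore. These functions are stand-ins for whatever Snakemake generates from each rule; the naming is an assumption inferred from the assert in the decorator, and __helper is a made-up example of an undecorated rule.

```python
target_rules = []


def targetRule(fn):
    """Record the rule name without its '__' prefix and return fn unchanged."""
    assert fn.__name__.startswith('__')
    target_rules.append(fn.__name__[2:])
    return fn


# Plain functions standing in for decorated rules (assumed naming scheme)
@targetRule
def __all():
    pass


@targetRule
def __preprocessing():
    pass


@targetRule
def __assembly():
    pass


def __helper():
    # Not decorated, so it is not advertised as a top-level target
    pass


@targetRule
def __print_targets():
    pass


print("* " + "\n* ".join(target_rules))
```

Only decorated rules are collected, so intermediate rules stay out of the list that print_targets shows to users.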
You don't need to change anything in the launcher for this new functionality to work.
my_snaketool run ... preprocessing
But you could update the help message to add the new available run stages.
### in __main__.py ###
help_message_extra = """
\b
CLUSTER EXECUTION:
my_snaketool run ... --profile [profile]
For information on Snakemake profiles see:
https://snakemake.readthedocs.io/en/stable/executing/cli.html#profiles
\b
RUN EXAMPLES:
Required:           my_snaketool run --input [file]
Specify threads:    my_snaketool run ... --threads [threads]
Disable conda:      my_snaketool run ... --no-use-conda
Change defaults:    my_snaketool run ... --snake-default="-k --nolock"
Add Snakemake args: my_snaketool run ... --dry-run --keep-going --touch
Specify targets:    my_snaketool run ... all print_targets
Available targets:
    all             Run everything (default)
+   preprocessing   Run preprocessing steps only
+   assembly        Run assembly steps only
    print_targets   List available targets
"""