Template update
FriederikeHanssen committed Dec 16, 2021
2 parents c91ce24 + 17f8ffe commit 6a3d981
Showing 14 changed files with 57 additions and 170 deletions.
7 changes: 5 additions & 2 deletions README.md
@@ -51,11 +51,14 @@ By default, the pipeline currently performs the following:
3. Download the pipeline and test it on a minimal dataset with a single command:

```console
nextflow run nf-core/sarek -profile test,<docker/singularity/podman/shifter/charliecloud/conda/institute>
nextflow run nf-core/sarek -profile test,YOURPROFILE
```

Note that some form of configuration will be needed so that Nextflow knows how to fetch the required software. This is usually done in the form of a config profile (`YOURPROFILE` in the example command above). You can chain multiple config profiles in a comma-separated string.

> * The pipeline comes with config profiles called `docker`, `singularity`, `podman`, `shifter`, `charliecloud` and `conda` which instruct the pipeline to use the named tool for software management. For example, `-profile test,docker`.
> * Please check [nf-core/configs](https://github.com/nf-core/configs#documentation) to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use `-profile <institute>` in your command. This will enable either `docker` or `singularity` and set the appropriate execution settings for your local compute environment.
> * If you are using `singularity` then the pipeline will auto-detect this and attempt to download the Singularity images directly as opposed to performing a conversion from Docker images. If you are persistently observing issues downloading Singularity images directly due to timeout or network issues then please use the `--singularity_pull_docker_container` parameter to pull and convert the Docker image instead. Alternatively, it is highly recommended to use the [`nf-core download`](https://nf-co.re/tools/#downloading-pipelines-for-offline-use) command to pre-download all of the required containers before running the pipeline and to set the [`NXF_SINGULARITY_CACHEDIR` or `singularity.cacheDir`](https://www.nextflow.io/docs/latest/singularity.html?#singularity-docker-hub) Nextflow options to be able to store and re-use the images from a central location for future pipeline runs.
> * If you are using `singularity` and are persistently observing issues downloading Singularity images directly due to timeout or network issues, then you can use the `--singularity_pull_docker_container` parameter to pull and convert the Docker image instead. Alternatively, you can use the [`nf-core download`](https://nf-co.re/tools/#downloading-pipelines-for-offline-use) command to download images first, before running the pipeline. Setting the [`NXF_SINGULARITY_CACHEDIR` or `singularity.cacheDir`](https://www.nextflow.io/docs/latest/singularity.html?#singularity-docker-hub) Nextflow options enables you to store and re-use the images from a central location for future pipeline runs.
> * If you are using `conda`, it is highly recommended to use the [`NXF_CONDA_CACHEDIR` or `conda.cacheDir`](https://www.nextflow.io/docs/latest/conda.html) settings to store the environments in a central location for future pipeline runs (see the config sketch after this list).
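
To make the cache settings above concrete, here is a custom config sketch that sets both central cache locations (an editorial illustration, not part of this commit; paths are hypothetical):

```nextflow
// custom.config — apply with `-c custom.config`; a sketch only
singularity {
    cacheDir = '/shared/containers/singularity'   // images re-used across runs
}

conda {
    cacheDir = '/shared/conda-envs'               // central conda environment cache
}
```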

4. Start running your own analysis!
2 changes: 1 addition & 1 deletion assets/multiqc_config.yaml
@@ -5,7 +5,7 @@ custom_logo_title: 'nf-core/sarek'
report_comment: >
This report has been generated by the <a href="https://github.com/nf-core/sarek" target="_blank">nf-core/sarek</a>
analysis pipeline. For information about how to interpret these results, please see the
<a href="https://github.com/nf-core/sarek" target="_blank">documentation</a>.
<a href="https://nf-co.re/sarek" target="_blank">documentation</a>.
report_section_order:
software_versions:
order: -1000
36 changes: 0 additions & 36 deletions bin/scrape_software_versions.py

This file was deleted.

2 changes: 2 additions & 0 deletions conf/base.config
@@ -62,5 +62,7 @@ process {
}
withName:MULTIQC {
errorStrategy = {task.exitStatus == 143 ? 'retry' : 'ignore'}
withName:CUSTOM_DUMPSOFTWAREVERSIONS {
cache = false
}
}
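
For orientation, the selector added above turns off Nextflow task caching for the version-collection process, so the software versions are re-gathered on every run. In a standalone custom config the same setting would look like this (an editorial sketch, not an excerpt of the commit):

```nextflow
process {
    withName: CUSTOM_DUMPSOFTWAREVERSIONS {
        cache = false   // always re-run so the collected versions reflect the current run
    }
}
```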
7 changes: 4 additions & 3 deletions conf/modules.config
@@ -7,11 +7,10 @@
ext.args2 = Second set of arguments appended to command in module (multi-tool modules).
ext.args3 = Third set of arguments appended to command in module (multi-tool modules).
ext.suffix = File name ext.suffix output files. Not available for nf-core modules
ext.prefix = File name ext.prefix output files.
ext.prefix = File name prefix for output files.
----------------------------------------------------------------------------------------
*/

// Generic process options for all workflows
process {

publishDir = [
@@ -20,9 +19,10 @@ process {
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]

withName: 'CUSTOM_DUMPSOFTWAREVERSIONS' {
withName: CUSTOM_DUMPSOFTWAREVERSIONS {
publishDir = [
path: { "${params.outdir}/pipeline_info" },
mode: 'copy',
pattern: '*_versions.yml'
]
}
@@ -378,6 +378,7 @@ if ((params.tools) && (params.tools.contains('merge'))) {
ext.prefix = {"${meta.id}_snpEff_VEP.ann.vcf"}
}
}

}

// QC_TRIM
4 changes: 2 additions & 2 deletions conf/test.config
@@ -16,8 +16,8 @@ params {

// Limit resources so that this can run on GitHub Actions
max_cpus = 2
max_memory = 6.GB
max_time = 6.h
max_memory = '6.GB'
max_time = '6.h'

// Input data
input = "${baseDir}/tests/csv/3.0/fastq_single.csv"
7 changes: 4 additions & 3 deletions docs/output.md
@@ -762,9 +762,10 @@ Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQ
<details markdown="1">
<summary>Output files</summary>

- `pipeline_info/`
- Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`.
- Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.tsv`.
* `pipeline_info/`
* Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`.
* Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameters are used when running the pipeline.
* Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`.

</details>

36 changes: 0 additions & 36 deletions docs/usage.md
@@ -187,42 +187,6 @@ process {

> **NB:** We specify just the process name i.e. `STAR_ALIGN` in the config file and not the full task name string that is printed to screen in the error message or on the terminal whilst the pipeline is running i.e. `RNASEQ:ALIGN_STAR:STAR_ALIGN`. You may get a warning suggesting that the process selector isn't recognised but you can ignore that if the process name has been specified correctly. This is something that needs to be fixed upstream in core Nextflow.
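
To make that convention concrete, a custom config selects the process like this (an editorial sketch; the resource values are hypothetical):

```nextflow
process {
    withName: STAR_ALIGN {   // short process name, not 'RNASEQ:ALIGN_STAR:STAR_ALIGN'
        cpus   = 8
        memory = '32.GB'
    }
}
```
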
### Tool-specific options

For the ultimate flexibility, we have implemented and are using Nextflow DSL2 modules in a way where it is possible for both developers and users to change tool-specific command-line arguments (e.g. providing an additional command-line argument to the `STAR_ALIGN` process) as well as publishing options (e.g. saving files produced by the `STAR_ALIGN` process that aren't saved by default by the pipeline). In the majority of instances, as a user you won't have to change the default options set by the pipeline developer(s), however, there may be edge cases where creating a simple custom config file can improve the behaviour of the pipeline if for example it is failing due to a weird error that requires setting a tool-specific parameter to deal with smaller / larger genomes.

The command-line arguments passed to STAR in the `STAR_ALIGN` module are a combination of:

* Mandatory arguments or those that need to be evaluated within the scope of the module, as supplied in the [`script`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/modules/nf-core/software/star/align/main.nf#L49-L55) section of the module file.

* An [`options.args`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/modules/nf-core/software/star/align/main.nf#L56) string of non-mandatory parameters that is set to be empty by default in the module but can be overwritten when including the module in the sub-workflow / workflow context via the `addParams` Nextflow option.

The nf-core/rnaseq pipeline has a sub-workflow (see [terminology](https://github.com/nf-core/modules#terminology)) specifically to align reads with STAR and to sort, index and generate some basic stats on the resulting BAM files using SAMtools. At the top of this file we import the `STAR_ALIGN` module via the Nextflow [`include`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/subworkflows/nf-core/align_star.nf#L10) keyword and by default the options passed to the module via the `addParams` option are set as an empty Groovy map [here](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/subworkflows/nf-core/align_star.nf#L5); this in turn means `options.args` will be set to empty by default in the module file too. This is an intentional design choice and allows us to implement well-written sub-workflows composed of a chain of tools that by default run with the bare minimum parameter set for any given tool in order to make it much easier to share across pipelines and to provide the flexibility for users and developers to customise any non-mandatory arguments.

When including the sub-workflow above in the main pipeline workflow we use the same `include` statement, however, we now have the ability to overwrite options for each of the tools in the sub-workflow including the [`align_options`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/workflows/rnaseq.nf#L225) variable that will be used specifically to overwrite the optional arguments passed to the `STAR_ALIGN` module. In this case, the options to be provided to `STAR_ALIGN` have been assigned sensible defaults by the developer(s) in the pipeline's [`modules.config`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/modules.config#L70-L74) and can be accessed and customised in the [workflow context](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/workflows/rnaseq.nf#L201-L204) too before eventually passing them to the sub-workflow as a Groovy map called `star_align_options`. These options will then be propagated from `workflow -> sub-workflow -> module`.

As mentioned at the beginning of this section it may also be necessary for users to overwrite the options passed to modules to be able to customise specific aspects of the way in which a particular tool is executed by the pipeline. Given that all of the default module options are stored in the pipeline's `modules.config` as a [`params` variable](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/modules.config#L24-L25) it is also possible to overwrite any of these options via a custom config file.

Say for example we want to append an additional, non-mandatory parameter (i.e. `--outFilterMismatchNmax 16`) to the arguments passed to the `STAR_ALIGN` module. Firstly, we need to copy across the default `args` specified in the [`modules.config`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/modules.config#L71) and create a custom config file that is a composite of the default `args` as well as the additional options you would like to provide. This is very important because Nextflow will overwrite the default value of `args` that you provide via the custom config.

As you will see in the example below, we have:

* appended `--outFilterMismatchNmax 16` to the default `args` used by the module.
* changed the default `publish_dir` value to where the files will eventually be published in the main results directory.
* appended `'bam':''` to the default value of `publish_files` so that the BAM files generated by the process will also be saved in the top-level results directory for the module. Note: `'out':'log'` means any file/directory ending in `out` will now be saved in a separate directory called `my_star_directory/log/`.

```nextflow
params {
modules {
'star_align' {
args = "--quantMode TranscriptomeSAM --twopassMode Basic --outSAMtype BAM Unsorted --readFilesCommand zcat --runRNGseed 0 --outFilterMultimapNmax 20 --alignSJDBoverhangMin 1 --outSAMattributes NH HI AS NM MD --quantTranscriptomeBan Singleend --outFilterMismatchNmax 16"
publish_dir = "my_star_directory"
publish_files = ['out':'log', 'tab':'log', 'bam':'']
}
}
}
```

### Updating containers

The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. If for some reason you need to use a different version of a particular tool with the pipeline then you just need to identify the `process` name and override the Nextflow `container` definition for that process using the `withName` declaration. For example, in the [nf-core/viralrecon](https://nf-co.re/viralrecon) pipeline a tool called [Pangolin](https://github.com/cov-lineages/pangolin) has been used during the COVID-19 pandemic to assign lineages to SARS-CoV-2 genome sequenced samples. Given that the lineage assignments change quite frequently it doesn't make sense to re-release nf-core/viralrecon every time a new version of Pangolin has been released. However, you can override the default container used by the pipeline by creating a custom config file and passing it as a command-line argument via `-c custom.config`.
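
A sketch of such an override follows (editorial illustration; the container tag is hypothetical and should be replaced with the version you actually need):

```nextflow
// custom.config — apply with `-c custom.config`
process {
    withName: PANGOLIN {
        container = 'quay.io/biocontainers/pangolin:3.1.17--pyhdfd78af_1'   // hypothetical tag
    }
}
```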
4 changes: 2 additions & 2 deletions lib/NfcoreSchema.groovy
@@ -206,7 +206,7 @@ class NfcoreSchema {
}
def type = '[' + group_params.get(param).type + ']'
def description = group_params.get(param).description
def defaultValue = group_params.get(param).default ? " [default: " + group_params.get(param).default.toString() + "]" : ''
def defaultValue = group_params.get(param).default != null ? " [default: " + group_params.get(param).default.toString() + "]" : ''
def description_default = description + colors.dim + defaultValue + colors.reset
// Wrap long description texts
// Loosely based on https://dzone.com/articles/groovy-plain-text-word-wrap
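
The change above matters because Groovy truthiness treats `false`, `0` and `''` as falsy: parameters with such default values previously had their defaults silently omitted from the help text, whereas `!= null` only hides genuinely missing defaults. A standalone Groovy illustration (not from the commit):

```groovy
def shownOld   = { d -> d ? 'shown' : 'hidden' }          // old check
def shownFixed = { d -> d != null ? 'shown' : 'hidden' }  // new check

assert shownOld(false)   == 'hidden'  // a real default of `false` was dropped
assert shownFixed(false) == 'shown'   // now printed as [default: false]
assert shownFixed(null)  == 'hidden'  // only a missing default stays hidden
```
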
@@ -362,7 +362,7 @@ class NfcoreSchema {
}
}
for (ex in causingExceptions) {
printExceptions(ex, params_json, log)
printExceptions(ex, params_json, log, enums)
}
}

30 changes: 9 additions & 21 deletions lib/NfcoreTemplate.groovy
@@ -19,27 +19,16 @@ class NfcoreTemplate {
}

//
// Check params.hostnames
// Warn if a -profile or Nextflow config has not been provided to run the pipeline
//
public static void hostName(workflow, params, log) {
Map colors = logColours(params.monochrome_logs)
if (params.hostnames) {
try {
def hostname = "hostname".execute().text.trim()
params.hostnames.each { prof, hnames ->
hnames.each { hname ->
if (hostname.contains(hname) && !workflow.profile.contains(prof)) {
log.info "=${colors.yellow}====================================================${colors.reset}=\n" +
"${colors.yellow}WARN: You are running with `-profile $workflow.profile`\n" +
" but your machine hostname is ${colors.white}'$hostname'${colors.reset}.\n" +
" ${colors.yellow_bold}Please use `-profile $prof${colors.reset}`\n" +
"=${colors.yellow}====================================================${colors.reset}="
}
}
}
} catch (Exception e) {
log.warn "[$workflow.manifest.name] Could not determine 'hostname' - skipping check. Reason: ${e.message}."
}
public static void checkConfigProvided(workflow, log) {
if (workflow.profile == 'standard' && workflow.configFiles.size() <= 1) {
log.warn "[$workflow.manifest.name] You are attempting to run the pipeline without any custom configuration!\n\n" +
"This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" +
" (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" +
" (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" +
" (3) Using your own local custom config e.g. `-c /path/to/your/custom.config`\n\n" +
"Please refer to the quick start section and usage docs for the pipeline.\n "
}
}

@@ -168,7 +157,6 @@
log.info "-${colors.purple}[$workflow.manifest.name]${colors.red} Pipeline completed successfully, but with errored process(es) ${colors.reset}-"
}
} else {
hostName(workflow, params, log)
log.info "-${colors.purple}[$workflow.manifest.name]${colors.red} Pipeline completed with errors${colors.reset}-"
}
}
7 changes: 0 additions & 7 deletions lib/Utils.groovy
@@ -37,11 +37,4 @@ class Utils {
"==================================================================================="
}
}

//
// Join module args with appropriate spacing
//
public static String joinModuleArgs(args_list) {
return ' ' + args_list.join(' ')
}
}
10 changes: 8 additions & 2 deletions lib/WorkflowMain.groovy
@@ -61,6 +61,9 @@ class WorkflowMain {
// Print parameter summary log to screen
log.info paramsSummaryLog(workflow, params, log)

// Check that a -profile or Nextflow config has been provided to run the pipeline
NfcoreTemplate.checkConfigProvided(workflow, log)

// Check that conda channels are set-up correctly
if (params.enable_conda) {
Utils.checkCondaChannels(log)
@@ -69,8 +72,11 @@
// Check AWS batch settings
NfcoreTemplate.awsBatch(workflow, params)

// Check the hostnames against configured profiles
NfcoreTemplate.hostName(workflow, params, log)
// Check input has been provided
if (!params.input) {
log.error "Please provide an input samplesheet to the pipeline e.g. '--input samplesheet.csv'"
System.exit(1)
}
}

//