Use nf-validation plugin for parameter and samplesheet validation #112

Merged: 8 commits, Jul 24, 2023
1 change: 1 addition & 0 deletions .nf-core.yml
@@ -2,4 +2,5 @@ repository_type: pipeline
 lint:
   files_unchanged:
     - .github/ISSUE_TEMPLATE/bug_report.yml
+    - pyproject.toml
   multiqc_config: false
17 changes: 9 additions & 8 deletions CHANGELOG.md
@@ -10,17 +10,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [#80](https://github.com/nf-core/proteinfold/pull/80) - Add `accelerator` directive to GPU processes when `params.use_gpu` is true.
 - [#81](https://github.com/nf-core/proteinfold/pull/81) - Support multiline fasta for colabfold multimer predictions.
 - [#89](https://github.com/nf-core/proteinfold/pull/89) - Fix issue with excessive symlinking in the pdb_mmcif database.
-- [#90](https://github.com/nf-core/proteinfold/pull/90) - Update pipeline template to [nf-core/tools 2.8](https://github.com/nf-core/tools/releases/tag/2.8).
-- [#91](https://github.com/nf-core/proteinfold/pull/91) - Update ColabFold version to 1.5.2 and AlphaFold version to 2.3.2
-- [#92](https://github.com/nf-core/proteinfold/pull/92) - Add ESMFold workflow to the pipeline.
+- [PR #91](https://github.com/nf-core/proteinfold/pull/91) - Update ColabFold version to 1.5.2 and AlphaFold version to 2.3.2
+- [PR #92](https://github.com/nf-core/proteinfold/pull/92) - Add ESMFold workflow to the pipeline.
 - Update metro map to include ESMFold workflow.
 - Update modules to remove quay from container url.
 - [nf-core/tools#2286](https://github.com/nf-core/tools/issues/2286) - Set default container registry outside profile scope.
-- [#97](https://github.com/nf-core/proteinfold/pull/97) - Fix issue with uniref30 missing path when using the full BFD database in AlphaFold.
-- [#100](https://github.com/nf-core/proteinfold/pull/100) - Update containers for AlphaFold2 and ColabFold local modules.
-- [#105](https://github.com/nf-core/proteinfold/pull/105) - Update COLABFOLD_BATCH docker container, metro map figure and nextflow schema description.
-- [#106](https://github.com/nf-core/proteinfold/pull/106) - Add `singularity.registry = 'quay.io'` and bump NF version to 23.04.0
-- [#108](https://github.com/nf-core/proteinfold/pull/108) - Fix gunzip error when providing too many files when downloading PDBMMCIF database.
+- [PR #97](https://github.com/nf-core/proteinfold/pull/97) - Fix issue with uniref30 missing path when using the full BFD database in AlphaFold.
+- [PR #100](https://github.com/nf-core/proteinfold/pull/100) - Update containers for AlphaFold2 and ColabFold local modules.
+- [PR #105](https://github.com/nf-core/proteinfold/pull/105) - Update COLABFOLD_BATCH docker container, metro map figure and nextflow schema description.
+- [PR #106](https://github.com/nf-core/proteinfold/pull/106) - Add `singularity.registry = 'quay.io'` and bump NF version to 23.04.0
+- [PR #108](https://github.com/nf-core/proteinfold/pull/108) - Fix gunzip error when providing too many files when downloading PDBMMCIF database.
+- [PR #111](https://github.com/nf-core/proteinfold/pull/111) - Update pipeline template to [nf-core/tools 2.9](https://github.com/nf-core/tools/releases/tag/2.9).
+- [PR #112](https://github.com/nf-core/rnaseq/pull/112) - Use `nf-validation` plugin for parameter and samplesheet validation

 ## 1.0.0 - White Silver Reebok

26 changes: 8 additions & 18 deletions assets/schema_input.json
@@ -7,28 +7,18 @@
     "items": {
         "type": "object",
         "properties": {
-            "sample": {
+            "sequence": {
                 "type": "string",
                 "pattern": "^\\S+$",
-                "errorMessage": "Sample name must be provided and cannot contain spaces"
+                "errorMessage": "Sequence name must be provided and cannot contain spaces",
+                "meta": ["id"]
             },
-            "fastq_1": {
+            "fasta": {
                 "type": "string",
-                "pattern": "^\\S+\\.f(ast)?q\\.gz$",
-                "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
-            },
-            "fastq_2": {
-                "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'",
-                "anyOf": [
-                    {
-                        "type": "string",
-                        "pattern": "^\\S+\\.f(ast)?q\\.gz$"
-                    },
-                    {
-                        "type": "string",
-                        "maxLength": 0
-                    }
-                ]
+                "format": "file-path",
+                "exists": true,
+                "pattern": "^\\S+\\.fa(sta)?$",
+                "errorMessage": "Fasta file must be provided, cannot contain spaces and must have extension '.fa' or '.fasta'"
             }
         },
         "required": ["sequence", "fasta"]
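To see which samplesheet rows the new `sequence` and `fasta` patterns would accept, here is a minimal Python sketch. It only mirrors the two regular expressions and error messages from `assets/schema_input.json` above. The helper name `check_row` is hypothetical, and the sketch does not reproduce the `exists` or `format` checks that the nf-validation plugin performs.

```python
import re

# Patterns copied from assets/schema_input.json in this diff.
# The JSON-escaped "\\S" is written as "\S" in a Python raw string.
SEQUENCE_PATTERN = re.compile(r"^\S+$")
FASTA_PATTERN = re.compile(r"^\S+\.fa(sta)?$")


def check_row(row):
    """Return the error messages for one samplesheet row (hypothetical helper).

    Only the regex patterns are checked; file existence is NOT verified here.
    """
    errors = []
    if not SEQUENCE_PATTERN.match(row.get("sequence", "")):
        errors.append("Sequence name must be provided and cannot contain spaces")
    if not FASTA_PATTERN.match(row.get("fasta", "")):
        errors.append(
            "Fasta file must be provided, cannot contain spaces "
            "and must have extension '.fa' or '.fasta'"
        )
    return errors


if __name__ == "__main__":
    print(check_row({"sequence": "T1024", "fasta": "T1024.fasta"}))
    print(check_row({"sequence": "my seq", "fasta": "reads.fq.gz"}))
```

Under these patterns a row with a FastQ path (the old template default) is rejected, while `.fa` and `.fasta` paths without whitespace pass.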
8 changes: 0 additions & 8 deletions conf/modules.config
@@ -21,14 +21,6 @@ process {
         saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
     ]

-    withName: 'SAMPLESHEET_CHECK' {
-        publishDir = [
-            path: { "${params.outdir}/pipeline_info" },
-            mode: params.publish_dir_mode,
-            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
-        ]
-    }
-
     withName: 'CUSTOM_DUMPSOFTWAREVERSIONS' {
         publishDir = [
             path: { "${params.outdir}/pipeline_info" },
3 changes: 1 addition & 2 deletions lib/WorkflowAlphafold2.groovy
@@ -10,8 +10,7 @@ class WorkflowAlphafold2 {
     //
     // Check and validate parameters
     //
-    public static void initialise(params, log) {
-    }
+    public static void initialise(params, log) { }

     //
     // Get workflow summary for MultiQC
3 changes: 1 addition & 2 deletions lib/WorkflowColabfold.groovy
@@ -10,8 +10,7 @@ class WorkflowColabfold {
     //
     // Check and validate parameters
     //
-    public static void initialise(params, log) {
-    }
+    public static void initialise(params, log) { }

     //
     // Get workflow summary for MultiQC
3 changes: 1 addition & 2 deletions lib/WorkflowEsmfold.groovy
@@ -10,8 +10,7 @@ class WorkflowEsmfold {
     //
     // Check and validate parameters
     //
-    public static void initialise(params, log) {
-    }
+    public static void initialise(params, log) { }

     //
     // Get workflow summary for MultiQC
5 changes: 0 additions & 5 deletions lib/WorkflowMain.groovy
@@ -44,11 +44,6 @@ class WorkflowMain {

         // Check AWS batch settings
         NfcoreTemplate.awsBatch(workflow, params)
-
-        // Check input has been provided
-        if (!params.input) {
-            Nextflow.error("Please provide an input samplesheet to the pipeline e.g. '--input samplesheet.csv'")
-        }
     }
     //
     // Get attribute from genome config file e.g. fasta
2 changes: 1 addition & 1 deletion modules/local/multifasta_to_singlefasta.nf
@@ -29,7 +29,7 @@ process MULTIFASTA_TO_SINGLEFASTA {

     stub:
     """
-    touch input.csv
+    touch ${meta.id}.fasta

     cat <<-END_VERSIONS > versions.yml
     "${task.process}":
31 changes: 0 additions & 31 deletions modules/local/samplesheet_check.nf

This file was deleted.

2 changes: 1 addition & 1 deletion nextflow.config
@@ -114,7 +114,7 @@ params {
     // Schema validation default options
     validationFailUnrecognisedParams = false
     validationLenientMode = false
-    validationSchemaIgnoreParams = 'genomes'
+    validationSchemaIgnoreParams = ''
     validationShowHiddenParams = false
     validate_params = true

15 changes: 15 additions & 0 deletions nextflow_schema.json
@@ -18,6 +18,7 @@
                     "exists": true,
                     "mimetype": "text/csv",
                     "pattern": "^\\S+\\.csv$",
+                    "schema": "assets/schema_input.json",
                     "description": "Path to comma-separated file containing information about the samples in the experiment.",
                     "help_text": "You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/proteinfold/usage#samplesheet-input).",
                     "fa_icon": "fas fa-file-csv"
@@ -68,6 +69,8 @@
                 },
                 "alphafold2_db": {
                     "type": "string",
+                    "format": "path",
+                    "exists": true,
                     "description": "Specifies the DB and PARAMS path used by 'AlphaFold2' mode",
                     "fa_icon": "fas fa-database"
                 },
@@ -101,6 +104,8 @@
             "properties": {
                 "colabfold_db": {
                     "type": "string",
+                    "format": "path",
+                    "exists": true,
                     "description": "Specifies the PARAMS and DB path used by 'colabfold' mode",
                     "fa_icon": "fas fa-folder-open"
                 },
@@ -170,6 +175,8 @@
             "properties": {
                 "esmfold_db": {
                     "type": "string",
+                    "format": "path",
+                    "exists": true,
                     "description": "Specifies the PARAMS path used by 'esmfold' mode",
                     "fa_icon": "fas fa-folder-open"
                 },
@@ -570,18 +577,26 @@
                 "multiqc_config": {
                     "type": "string",
                     "format": "file-path",
+                    "exists": true,
+                    "mimetype": "text/plain",
                     "description": "Custom config file to supply to MultiQC.",
                     "fa_icon": "fas fa-cog",
                     "hidden": true
                 },
                 "multiqc_logo": {
                     "type": "string",
+                    "format": "file-path",
+                    "exists": true,
+                    "mimetype": "text/plain",
                     "description": "Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file",
                     "fa_icon": "fas fa-image",
                     "hidden": true
                 },
                 "multiqc_methods_description": {
                     "type": "string",
+                    "format": "file-path",
+                    "exists": true,
+                    "mimetype": "text/plain",
                     "description": "Custom MultiQC yaml file containing HTML including a methods description.",
                     "fa_icon": "fas fa-cog"
                 },
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,4 +1,4 @@
-# Config file for Python. Mostly used to configure linting of bin/check_samplesheet.py with Black.
+# Config file for Python. Mostly used to configure linting of bin/*.py with Black.
 # Should be kept the same as nf-core/tools to avoid fighting with template synchronisation.
 [tool.black]
 line-length = 120
37 changes: 0 additions & 37 deletions subworkflows/local/input_check.nf

This file was deleted.

51 changes: 17 additions & 34 deletions workflows/alphafold2.nf
@@ -4,7 +4,7 @@
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */

-include { paramsSummaryLog; paramsSummaryMap } from 'plugin/nf-validation'
+include { paramsSummaryLog; paramsSummaryMap; fromSamplesheet } from 'plugin/nf-validation'

 def logo = NfcoreTemplate.logo(workflow, params.monochrome_logs)
 def citation = '\n' + WorkflowMain.citation(workflow) + '\n'
@@ -16,26 +16,16 @@ log.info logo + paramsSummaryLog(workflow) + citation
 // Validate input parameters
 WorkflowAlphafold2.initialise(params, log)

-// Check input path parameters to see if they exist
-def checkPathParamList = [
-    params.input,
-    params.alphafold2_db
-]
-for (param in checkPathParamList) { if (param) { file(param, checkIfExists: true) } }
-
-// Check mandatory parameters
-if (params.input) { ch_input = file(params.input) } else { exit 1, 'Input file not specified!' }
-
 /*
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     CONFIG FILES
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */

 ch_multiqc_config = Channel.fromPath("$projectDir/assets/multiqc_config.yml", checkIfExists: true)
-ch_multiqc_custom_config = params.multiqc_config ? Channel.fromPath( params.multiqc_config, checkIfExists: true ) : Channel.empty()
-ch_multiqc_logo = params.multiqc_logo ? Channel.fromPath( params.multiqc_logo, checkIfExists: true ) : Channel.empty()
-ch_multiqc_custom_methods_description = params.multiqc_methods_description ? file(params.multiqc_methods_description, checkIfExists: true) : file("$projectDir/assets/methods_description_template.yml", checkIfExists: true)
+ch_multiqc_custom_config = params.multiqc_config ? Channel.fromPath( params.multiqc_config ) : Channel.empty()
+ch_multiqc_logo = params.multiqc_logo ? Channel.fromPath( params.multiqc_logo ) : Channel.empty()
+ch_multiqc_custom_methods_description = params.multiqc_methods_description ? file(params.multiqc_methods_description) : file("$projectDir/assets/methods_description_template.yml", checkIfExists: true)

 /*
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -46,7 +36,6 @@ ch_multiqc_custom_methods_description = params.multiqc_methods_description ? fil
 //
 // SUBWORKFLOW: Consisting of a mix of local and nf-core/modules
 //
-include { INPUT_CHECK } from '../subworkflows/local/input_check'
 include { PREPARE_ALPHAFOLD2_DBS } from '../subworkflows/local/prepare_alphafold2_dbs'

 //
@@ -82,28 +71,22 @@ workflow ALPHAFOLD2 {
     ch_versions = Channel.empty()

     //
-    // SUBWORKFLOW: Read in samplesheet, validate and stage input files
+    // Create input channel from input file provided through params.input
     //
-    if (params.alphafold2_model_preset != 'multimer') {
-        INPUT_CHECK (
-            ch_input
-        )
-        .fastas
-        .map {
-            meta, fasta ->
-            [ meta, fasta.splitFasta(file:true) ]
-        }
-        .transpose()
+    Channel
+        .fromSamplesheet("input")
         .set { ch_fasta }
-    } else {
-        INPUT_CHECK (
-            ch_input
-        )
-        .fastas
-        .set { ch_fasta }
-    }
-    ch_versions = ch_versions.mix(INPUT_CHECK.out.versions)
+
+    if (params.alphafold2_model_preset != 'multimer') {
+        ch_fasta
+            .map {
+                meta, fasta ->
+                [ meta, fasta.splitFasta(file:true) ]
+            }
+            .transpose()
+            .set { ch_fasta }
+    }

     //
     // SUBWORKFLOW: Download databases and params for Alphafold2
     //
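The non-multimer branch above splits each FASTA into single-record pieces and then flattens `[meta, [piece1, piece2, ...]]` into one `[meta, piece]` item per record. The Python sketch below illustrates that data flow only. It works on in-memory strings, whereas `splitFasta(file:true)` in the pipeline produces files, and the names `split_fasta` and `split_and_transpose` are hypothetical.

```python
def split_fasta(text):
    """Split multi-FASTA text into a list of single-record strings."""
    records, current = [], []
    for line in text.splitlines():
        # A new header closes the record collected so far.
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def split_and_transpose(pairs):
    """Mimic `.map { splitFasta } .transpose()`: emit one (meta, record) per FASTA record."""
    return [(meta, record) for meta, text in pairs for record in split_fasta(text)]


if __name__ == "__main__":
    channel = [({"id": "T1"}, ">a\nAAA\n>b\nBBB\nCCC\n")]
    for meta, record in split_and_transpose(channel):
        print(meta, repr(record))
```

Each emitted item keeps the same `meta` map as its parent samplesheet row, which is what lets downstream steps trace a single-sequence record back to its original entry.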