Subworkflow Infrastructure #662

Merged on Oct 8, 2021 (45 commits)
Changes from 40 commits
fe39480
feat(subworkflows): Add align_bowtie2 subworkflow
edmundmiller Aug 9, 2021
daa23b5
test(align_bowtie2): Add initial list of changes to test
edmundmiller Aug 9, 2021
8232c66
test(align_bowtie2): Add initial test
edmundmiller Aug 10, 2021
a3637b2
refactor: Use tags to run subworkflows ci
edmundmiller Aug 10, 2021
ab570e8
refactor: Use individual directories for subworkflows
edmundmiller Aug 10, 2021
26d8764
docs(align_bowtie2): Add initial meta.yml
edmundmiller Aug 10, 2021
7842e71
fix(align_bowtie2): Fix module include paths
edmundmiller Aug 10, 2021
735f746
test(bam_sort_samtools): Add initial test
edmundmiller Aug 10, 2021
1b683bf
ci(bam_sort_samtools): Add modules that trigger the tag
edmundmiller Aug 10, 2021
48085a5
test(bam_stats_samtools): Add initial test
edmundmiller Aug 10, 2021
64a061a
ci(bam_stats_samtools): Add keys to pick up changes
edmundmiller Aug 10, 2021
87c4ce9
docs(bam_samtools): Add initial meta.yml
edmundmiller Aug 10, 2021
6bf69db
test(align_bowtie2): Fix path to subworkflow
edmundmiller Sep 9, 2021
02b3a01
test(align_bowtie2): Update entry point
edmundmiller Sep 9, 2021
74c22a7
fix(bam_sort_samtools): Update include paths
edmundmiller Sep 9, 2021
bff3917
test(bam_sort_samtools): Fix path
edmundmiller Sep 9, 2021
523ad10
style: Clean up addParams
edmundmiller Sep 9, 2021
a0da34c
test(samtools_sort): Add suffix for test
edmundmiller Sep 10, 2021
6f9acc4
test(align_bowtie2): Add samtools_options for suffix
edmundmiller Sep 10, 2021
2081525
test(bam_stats_samtools): Update path
edmundmiller Sep 10, 2021
7dfcff2
test(bam_stats_samtools): Use stats input
edmundmiller Sep 10, 2021
31c36b8
ci(linting): Skip module linting of subworkflows
edmundmiller Sep 10, 2021
23df41d
ci(linting): Clean up startsWith statement
edmundmiller Sep 10, 2021
dfa14f7
test(bam_stats_samtools): Use single end test data for single end test
edmundmiller Sep 10, 2021
3570238
test(bam_stats_samtools): Add expected files
edmundmiller Sep 10, 2021
8165c5e
test(align_bowtie2): Add paired-end test
edmundmiller Sep 10, 2021
c2699af
test(align_bowtie2): Sort order of output
edmundmiller Sep 10, 2021
d1fc847
test(align_bowtie2): Update hashes
edmundmiller Sep 10, 2021
22af942
docs(align_bowtie2): Fix typo
edmundmiller Sep 10, 2021
275294a
test(align_bowtie2): Update samtools output names
edmundmiller Sep 10, 2021
d044ac7
test(align_bowtie2): Remove md5sums for bam/bai
edmundmiller Sep 10, 2021
74a2e77
feat(subworkflows): Add nextflow.configs
edmundmiller Sep 10, 2021
867c550
docs(subworkflows): Include modules instead of tools
edmundmiller Sep 10, 2021
b5f9432
fix: Update to versions
edmundmiller Oct 7, 2021
53fc924
chore(align_bowtie2): Remove duplicate tag
edmundmiller Oct 7, 2021
d02097a
style: Format yamls
edmundmiller Oct 7, 2021
e7c9c63
test(subworkflows): Only check versions for modules
edmundmiller Oct 7, 2021
9e84fb6
chore: Update subworkflows to match rnaseq dev
edmundmiller Oct 8, 2021
1b4ee12
fix(subworkflows): Update paths
edmundmiller Oct 8, 2021
dd43643
fix(bam_sort_samtools): Fix sort parameters for testing
edmundmiller Oct 8, 2021
08809e3
Apply suggestions from code review
edmundmiller Oct 8, 2021
c7e5b01
docs: Update TODOs with a message
edmundmiller Oct 8, 2021
009e0b3
ci: Try using a matrix for strategy
edmundmiller Oct 8, 2021
74ffbd3
ci: Try passing an array
edmundmiller Oct 8, 2021
05898f5
Revert "ci: Try passing an array"
edmundmiller Oct 8, 2021
2 changes: 2 additions & 0 deletions .github/workflows/nf-core-linting.yml
@@ -71,6 +71,8 @@ jobs:
       - name: Lint ${{ matrix.tags }}
         run: nf-core modules lint ${{ matrix.tags }}
+        # HACK
+        if: startsWith( matrix.tags, 'subworkflow' ) != true

       - uses: actions/cache@v2
         with:
47 changes: 47 additions & 0 deletions subworkflows/nf-core/align_bowtie2/main.nf
@@ -0,0 +1,47 @@
//
// Alignment with Bowtie2
//

params.align_options = [:]
params.samtools_sort_options = [:]
params.samtools_index_options = [:]
params.samtools_stats_options = [:]

include { BOWTIE2_ALIGN } from '../../../modules/bowtie2/align/main' addParams( options: params.align_options )
include { BAM_SORT_SAMTOOLS } from '../bam_sort_samtools/main' addParams( sort_options: params.samtools_sort_options, index_options: params.samtools_index_options, stats_options: params.samtools_stats_options )

workflow ALIGN_BOWTIE2 {
    take:
    reads // channel: [ val(meta), [ reads ] ]
    index // channel: /path/to/bowtie2/index/

    main:

    ch_versions = Channel.empty()

    //
    // Map reads with Bowtie2
    //
    BOWTIE2_ALIGN ( reads, index )
    ch_versions = ch_versions.mix(BOWTIE2_ALIGN.out.versions.first())

    //
    // Sort, index BAM file and run samtools stats, flagstat and idxstats
    //
    BAM_SORT_SAMTOOLS ( BOWTIE2_ALIGN.out.bam )
    ch_versions = ch_versions.mix(BAM_SORT_SAMTOOLS.out.versions)

    emit:
    bam_orig = BOWTIE2_ALIGN.out.bam // channel: [ val(meta), bam ]
    log_out = BOWTIE2_ALIGN.out.log // channel: [ val(meta), log ]
    fastq = BOWTIE2_ALIGN.out.fastq // channel: [ val(meta), fastq ]

    bam = BAM_SORT_SAMTOOLS.out.bam // channel: [ val(meta), [ bam ] ]
    bai = BAM_SORT_SAMTOOLS.out.bai // channel: [ val(meta), [ bai ] ]
    csi = BAM_SORT_SAMTOOLS.out.csi // channel: [ val(meta), [ csi ] ]
    stats = BAM_SORT_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ]
    flagstat = BAM_SORT_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ]
    idxstats = BAM_SORT_SAMTOOLS.out.idxstats // channel: [ val(meta), [ idxstats ] ]

    versions = ch_versions // channel: [ versions.yml ]
}
50 changes: 50 additions & 0 deletions subworkflows/nf-core/align_bowtie2/meta.yml
@@ -0,0 +1,50 @@
name: align_bowtie2
description: Align reads to a reference genome using bowtie2 then sort with samtools
keywords:
  - align
  - fasta
  - genome
  - reference
modules:
  - bowtie2/align
  - samtools/sort
  - samtools/index
  - samtools/stats
  - samtools/idxstats
  - samtools/flagstat
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - reads:
      type: file
      description: |
        List of input FastQ files of size 1 and 2 for single-end and paired-end data,
        respectively.
  - index:
      type: file
      description: Bowtie2 genome index files
      pattern: '*.bt2'
# TODO
output:
  - bam:
      type: file
      description: Output BAM file containing read alignments
      pattern: '*.{bam}'
  - versions:
      type: file
      description: File containing software versions
      pattern: 'versions.yml'
  - fastq:
      type: file
      description: Unaligned FastQ files
      pattern: '*.fastq.gz'
  - log:
      type: file
      description: Alignment log
      pattern: '*.log'
  # TODO Add samtools outputs
authors:
  - '@drpatelh'

Review comments on the `- bam:` output entry:
Member: There are many more outputs than this?
Member: Ah, this is why you put the TODO statement above?
Contributor Author: Exactly! We hadn't come up with a standard for the meta.yml for the subworkflows yet, and I'm trying to not have this get caught in limbo.
Member: Fair enough 👍🏽
2 changes: 2 additions & 0 deletions subworkflows/nf-core/align_bowtie2/nextflow.config
@@ -0,0 +1,2 @@
params.align_options = [:]
params.samtools_options = [:]
53 changes: 53 additions & 0 deletions subworkflows/nf-core/bam_sort_samtools/main.nf
@@ -0,0 +1,53 @@
//
// Sort, index BAM file and run samtools stats, flagstat and idxstats
//

params.sort_options = [:]
params.index_options = [:]
params.stats_options = [:]

include { SAMTOOLS_SORT } from '../../../modules/samtools/sort/main' addParams( options: params.sort_options )
include { SAMTOOLS_INDEX } from '../../../modules/samtools/index/main' addParams( options: params.index_options )
include { BAM_STATS_SAMTOOLS } from '../bam_stats_samtools/main' addParams( options: params.stats_options )

workflow BAM_SORT_SAMTOOLS {
    take:
    ch_bam // channel: [ val(meta), [ bam ] ]

    main:

    ch_versions = Channel.empty()

    SAMTOOLS_SORT ( ch_bam )
    ch_versions = ch_versions.mix(SAMTOOLS_SORT.out.versions.first())

    SAMTOOLS_INDEX ( SAMTOOLS_SORT.out.bam )
    ch_versions = ch_versions.mix(SAMTOOLS_INDEX.out.versions.first())

    SAMTOOLS_SORT.out.bam
        .join(SAMTOOLS_INDEX.out.bai, by: [0], remainder: true)
        .join(SAMTOOLS_INDEX.out.csi, by: [0], remainder: true)
        .map {
            meta, bam, bai, csi ->
                if (bai) {
                    [ meta, bam, bai ]
                } else {
                    [ meta, bam, csi ]
                }
        }
        .set { ch_bam_bai }

    BAM_STATS_SAMTOOLS ( ch_bam_bai )
    ch_versions = ch_versions.mix(BAM_STATS_SAMTOOLS.out.versions)

    emit:
    bam = SAMTOOLS_SORT.out.bam // channel: [ val(meta), [ bam ] ]
    bai = SAMTOOLS_INDEX.out.bai // channel: [ val(meta), [ bai ] ]
    csi = SAMTOOLS_INDEX.out.csi // channel: [ val(meta), [ csi ] ]

    stats = BAM_STATS_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ]
    flagstat = BAM_STATS_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ]
    idxstats = BAM_STATS_SAMTOOLS.out.idxstats // channel: [ val(meta), [ idxstats ] ]

    versions = ch_versions // channel: [ versions.yml ]
}
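
Note on the `join` chain in BAM_SORT_SAMTOOLS: the two joins with `remainder: true` appear intended to cope with SAMTOOLS_INDEX emitting either a `.bai` or a `.csi` index, so that whichever one exists ends up next to the BAM. The following is a minimal standalone sketch of that pattern, not the subworkflow itself. The three channels are hypothetical stand-ins for the module outputs, and the file names are plain strings chosen for illustration.

```nextflow
nextflow.enable.dsl = 2

workflow {
    // Hypothetical stand-ins for SAMTOOLS_SORT.out.bam and
    // SAMTOOLS_INDEX.out.{bai,csi}; real channels would carry file paths.
    ch_bam = Channel.of( [ [ id:'test' ], 'test.sorted.bam' ] )
    ch_bai = Channel.of( [ [ id:'test' ], 'test.sorted.bam.bai' ] )
    ch_csi = Channel.empty()   // assumption: no CSI index was produced

    ch_bam
        // remainder: true keeps unmatched entries and pads the missing side with null
        .join(ch_bai, by: [0], remainder: true)
        .join(ch_csi, by: [0], remainder: true)
        // keep whichever index is present
        .map { meta, bam, bai, csi -> bai ? [ meta, bam, bai ] : [ meta, bam, csi ] }
        .view()
}
```

Swapping the contents of `ch_bai` and `ch_csi` in the sketch exercises the other branch of the `map` closure.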
41 changes: 41 additions & 0 deletions subworkflows/nf-core/bam_sort_samtools/meta.yml
@@ -0,0 +1,41 @@
name: bam_sort_samtools
description: Sort SAM/BAM/CRAM file
keywords:
  - sort
  - bam
  - sam
  - cram
modules:
  - samtools/sort
  - samtools/index
  - samtools/stats
  - samtools/idxstats
  - samtools/flagstat
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - bam:
      type: file
      description: BAM/CRAM/SAM file
      pattern: '*.{bam,cram,sam}'
# TODO
output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - bam:
      type: file
      description: Sorted BAM/CRAM/SAM file
      pattern: '*.{bam,cram,sam}'
  - versions:
      type: file
      description: File containing software versions
      pattern: 'versions.yml'
authors:
  - '@drpatelh'
  - '@ewels'
1 change: 1 addition & 0 deletions subworkflows/nf-core/bam_sort_samtools/nextflow.config
@@ -0,0 +1 @@
params.options = [:]
33 changes: 33 additions & 0 deletions subworkflows/nf-core/bam_stats_samtools/main.nf
@@ -0,0 +1,33 @@
//
// Run SAMtools stats, flagstat and idxstats
//

params.options = [:]

include { SAMTOOLS_STATS } from '../../../modules/samtools/stats/main' addParams( options: params.options )
include { SAMTOOLS_IDXSTATS } from '../../../modules/samtools/idxstats/main' addParams( options: params.options )
include { SAMTOOLS_FLAGSTAT } from '../../../modules/samtools/flagstat/main' addParams( options: params.options )

workflow BAM_STATS_SAMTOOLS {
    take:
    ch_bam_bai // channel: [ val(meta), [ bam ], [bai/csi] ]

    main:
    ch_versions = Channel.empty()

    SAMTOOLS_STATS ( ch_bam_bai )
    ch_versions = ch_versions.mix(SAMTOOLS_STATS.out.versions.first())

    SAMTOOLS_FLAGSTAT ( ch_bam_bai )
    ch_versions = ch_versions.mix(SAMTOOLS_FLAGSTAT.out.versions.first())

    SAMTOOLS_IDXSTATS ( ch_bam_bai )
    ch_versions = ch_versions.mix(SAMTOOLS_IDXSTATS.out.versions.first())

    emit:
    stats = SAMTOOLS_STATS.out.stats // channel: [ val(meta), [ stats ] ]
    flagstat = SAMTOOLS_FLAGSTAT.out.flagstat // channel: [ val(meta), [ flagstat ] ]
    idxstats = SAMTOOLS_IDXSTATS.out.idxstats // channel: [ val(meta), [ idxstats ] ]

    versions = ch_versions // channel: [ versions.yml ]
}
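
Note on the `ch_versions` pattern used in these subworkflows: each module call is followed by `.mix(<MODULE>.out.versions.first())`, which presumably keeps one `versions.yml` per tool rather than one per sample. A minimal standalone sketch of the idea follows. The `ch_tool_versions` channel and its string values are hypothetical placeholders for a module's `versions` output.

```nextflow
nextflow.enable.dsl = 2

workflow {
    // Hypothetical stand-in for a module's `versions` output:
    // one identical entry per processed sample.
    ch_tool_versions = Channel.of( 'versions.yml', 'versions.yml', 'versions.yml' )

    ch_versions = Channel.empty()

    // first() keeps a single entry, so three samples contribute one item
    // to the collected versions channel instead of three.
    ch_versions = ch_versions.mix( ch_tool_versions.first() )

    ch_versions.view()
}
```

A nested subworkflow's `versions` output already holds one entry per tool, which would explain why BAM_SORT_SAMTOOLS mixes in `BAM_STATS_SAMTOOLS.out.versions` without calling `first()`.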
43 changes: 43 additions & 0 deletions subworkflows/nf-core/bam_stats_samtools/meta.yml
@@ -0,0 +1,43 @@
name: bam_stats_samtools
description: Produces comprehensive statistics from SAM/BAM/CRAM file
keywords:
  - statistics
  - counts
  - bam
  - sam
  - cram
modules:
  - samtools/stats
  - samtools/idxstats
  - samtools/flagstat
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - bam:
      type: file
      description: BAM/CRAM/SAM file
      pattern: '*.{bam,cram,sam}'
  - bai:
      type: file
      description: Index for BAM/CRAM/SAM file
      pattern: '*.{bai,crai,sai}'
# TODO
output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - stats:
      type: file
      description: File containing samtools stats output
      pattern: '*.{stats}'
  - versions:
      type: file
      description: File containing software versions
      pattern: 'versions.yml'
authors:
  - '@drpatelh'
1 change: 1 addition & 0 deletions subworkflows/nf-core/bam_stats_samtools/nextflow.config
@@ -0,0 +1 @@
params.options = [:]
12 changes: 12 additions & 0 deletions tests/config/pytest_modules.yml
@@ -1074,3 +1074,15 @@ yara/index:
 yara/mapper:
   - modules/yara/mapper/**
   - tests/modules/yara/mapper/**
+
+subworkflows/align_bowtie2:
+  - subworkflows/nf-core/align_bowtie2/**
+  - tests/subworkflows/nf-core/align_bowtie2/**
+
+subworkflows/bam_stats_samtools:
+  - subworkflows/nf-core/bam_stats_samtools/**
+  - tests/subworkflows/nf-core/bam_stats_samtools/**
+
+subworkflows/bam_sort_samtools:
+  - subworkflows/nf-core/bam_sort_samtools/**
+  - tests/subworkflows/nf-core/bam_sort_samtools/**
2 changes: 1 addition & 1 deletion tests/modules/samtools/sort/main.nf
@@ -2,7 +2,7 @@

 nextflow.enable.dsl = 2

-include { SAMTOOLS_SORT } from '../../../../modules/samtools/sort/main.nf' addParams( options: [:] )
+include { SAMTOOLS_SORT } from '../../../../modules/samtools/sort/main.nf' addParams( options: ['suffix': '.sorted'] )

 workflow test_samtools_sort {
     input = [ [ id:'test', single_end:false ], // meta map
4 changes: 2 additions & 2 deletions tests/modules/samtools/sort/test.yml
@@ -4,5 +4,5 @@
     - samtools
     - samtools/sort
   files:
-    - path: output/samtools/test.bam
-      md5sum: bdc2d9e3f579f84df1e242207b627f89
+    - path: output/samtools/test.sorted.bam
+      md5sum: bbb2db225f140e69a4ac577f74ccc90f
27 changes: 27 additions & 0 deletions tests/subworkflows/nf-core/align_bowtie2/main.nf
@@ -0,0 +1,27 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { BOWTIE2_BUILD } from '../../../../modules/bowtie2/build/main.nf' addParams( options: [:] )
include { ALIGN_BOWTIE2 } from '../../../../subworkflows/nf-core/align_bowtie2/main.nf' addParams( 'samtools_sort_options': ['suffix': '.sorted'] )

workflow test_align_bowtie2_single_end {
    input = [ [ id:'test', single_end:true ], // meta map
              [ file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true) ]
            ]
    fasta = file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)

    BOWTIE2_BUILD ( fasta )
    ALIGN_BOWTIE2 ( input, BOWTIE2_BUILD.out.index )
}

workflow test_align_bowtie2_paired_end {
    input = [ [ id:'test', single_end:false ], // meta map
              [ file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true),
                file(params.test_data['sarscov2']['illumina']['test_2_fastq_gz'], checkIfExists: true) ]
            ]
    fasta = file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)

    BOWTIE2_BUILD ( fasta )
    ALIGN_BOWTIE2 ( input, BOWTIE2_BUILD.out.index )
}