Align parabricks subworkflow #6876

Open
wants to merge 64 commits into
base: master
Changes from 45 commits
df512a6
add parabricks
famosab Oct 28, 2024
296188a
remove config tag
famosab Oct 28, 2024
661bdd4
fix typo
famosab Oct 28, 2024
b7abc0a
fix typo
famosab Oct 28, 2024
83bd443
fix typo
famosab Oct 28, 2024
dc00c7b
update paths
famosab Oct 28, 2024
ec71494
update paths
famosab Oct 28, 2024
202d305
remove ch
famosab Oct 28, 2024
68d2e46
change gpu access
famosab Oct 29, 2024
b3e43e8
change fasta
famosab Oct 29, 2024
99cd5c1
update container
famosab Oct 29, 2024
d34c6df
low memory
famosab Oct 29, 2024
41c206c
indey
famosab Oct 29, 2024
72c6ea5
index bwamem
famosab Oct 29, 2024
55901ff
index bwamem
famosab Oct 29, 2024
b662fa5
index bwa
famosab Oct 29, 2024
2e6a27d
add index file
famosab Oct 29, 2024
2e8dfa9
add index file
famosab Oct 29, 2024
6e7b6f5
add index file
famosab Oct 29, 2024
7eced10
add index file
famosab Oct 29, 2024
5cbee34
stage in
famosab Oct 29, 2024
7367d15
stage in
famosab Oct 29, 2024
e676ee9
workdir
famosab Oct 29, 2024
280feec
revert workdir
famosab Oct 29, 2024
472a3a9
revert workdir
famosab Oct 29, 2024
ad8cd22
add bwa index
famosab Oct 29, 2024
5e6202a
add bwa index link
famosab Oct 29, 2024
e0227b6
add bwa index link
famosab Oct 29, 2024
3ce2d86
add bwa index link
famosab Oct 29, 2024
87daaf9
rm stage
famosab Oct 29, 2024
cd9faa6
please work now
famosab Oct 29, 2024
332eea6
remove fq2bam from this PR
famosab Oct 30, 2024
7dceaa3
Merge branch 'master' into parabricks-sbwf
famosab Oct 30, 2024
f5c8cc4
update tests
famosab Oct 30, 2024
a27baac
change inputs in test and to fq2bam
famosab Oct 30, 2024
f9088af
add low memory
famosab Oct 30, 2024
fcd7bd8
adjust applybqsr input
famosab Oct 30, 2024
38bbe78
adjust io to be consistent
famosab Oct 30, 2024
52be7aa
Merge branch 'master' into parabricks-sbwf
famosab Oct 30, 2024
c179f67
Merge branch 'master' into parabricks-sbwf
famosab Nov 15, 2024
0450b3c
Merge branch 'master' into parabricks-sbwf
famosab Nov 18, 2024
84ff84f
wip
famosab Nov 18, 2024
a39b3db
Merge branch 'parabricks-sbwf' of github.com:famosab/modules into par…
famosab Nov 18, 2024
1084460
try applybqsr
famosab Nov 18, 2024
87576ae
Merge branch 'master' into parabricks-sbwf
famosab Nov 18, 2024
cad4876
minor updates
sateeshperi Nov 18, 2024
7bb0222
update snap
famosab Nov 18, 2024
9748d56
update snap
famosab Nov 18, 2024
eb13562
update snap - problem is the naming in applybqsr
famosab Nov 19, 2024
2a4c49f
add tag gpu
famosab Dec 2, 2024
64963ad
Merge branch 'master' into parabricks-sbwf
famosab Dec 2, 2024
8ce25d8
Merge branch 'master' into parabricks-sbwf
famosab Dec 16, 2024
82a0754
update meta
famosab Dec 16, 2024
d760937
Merge branch 'parabricks-sbwf' of github.com:famosab/modules into par…
famosab Dec 16, 2024
cd5ec00
update config
famosab Dec 16, 2024
83df033
Merge branch 'master' into parabricks-sbwf
famosab Dec 16, 2024
f4ca194
Merge branch 'master' into parabricks-sbwf
famosab Dec 17, 2024
5c967c1
Apply suggestions from code review
famosab Dec 18, 2024
c357cc9
Merge branch 'master' into parabricks-sbwf
famosab Dec 18, 2024
27309d1
Merge branch 'master' into parabricks-sbwf
famosab Dec 18, 2024
91712d7
Merge branch 'master' into parabricks-sbwf
famosab Dec 18, 2024
788040e
Merge branch 'master' into parabricks-sbwf
famosab Dec 20, 2024
7ce98df
Merge branch 'master' into parabricks-sbwf
famosab Jan 7, 2025
cd4314d
Merge branch 'master' into parabricks-sbwf
famosab Jan 9, 2025
48 changes: 48 additions & 0 deletions subworkflows/nf-core/fastq_align_parabricks/main.nf
@@ -0,0 +1,48 @@
//
// Alignment and BQSR with NVIDIA Clara Parabricks
//

include { PARABRICKS_FQ2BAM    } from '../../../modules/nf-core/parabricks/fq2bam/main'
include { PARABRICKS_APPLYBQSR } from '../../../modules/nf-core/parabricks/applybqsr/main'

workflow FASTQ_ALIGN_PARABRICKS {

    take:
    ch_reads         // channel: [mandatory] meta, reads
    ch_fasta         // channel: [mandatory] meta, fasta
    ch_index         // channel: [mandatory] meta, index
    ch_interval_file // channel: [optional for parabricks] meta, intervals_bed_combined
    ch_known_sites   // channel: [optional for parabricks] known_sites_indels
famosab marked this conversation as resolved.

    main:
    ch_versions          = Channel.empty()
    ch_bam               = Channel.empty()
    ch_bai               = Channel.empty()
    ch_bqsr_table        = Channel.empty()
    ch_qc_metrics        = Channel.empty()
    ch_duplicate_metrics = Channel.empty()

    PARABRICKS_FQ2BAM(ch_reads, ch_fasta, ch_index, ch_interval_file, ch_known_sites)

    // Collecting FQ2BAM outputs
    ch_qc_metrics        = ch_qc_metrics.mix(PARABRICKS_FQ2BAM.out.qc_metrics)
    ch_duplicate_metrics = ch_duplicate_metrics.mix(PARABRICKS_FQ2BAM.out.duplicate_metrics)
Member

Where do you pass this downstream?

Contributor Author

I planned to emit it at the end; it's not needed for applybqsr, though.
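
For illustration, here is a minimal sketch of how the collected metrics channels could be exposed through an `emit:` block. The wrapper workflow `COLLECT_METRICS`, its two inputs, and the placeholder file names are hypothetical stand-ins for `PARABRICKS_FQ2BAM.out.qc_metrics` and `PARABRICKS_FQ2BAM.out.duplicate_metrics`; this is not the code in this PR.

```nextflow
// Hypothetical stand-in workflow: the inputs play the role of the
// FQ2BAM metrics outputs so the sketch can run without a GPU.
workflow COLLECT_METRICS {

    take:
    ch_fq2bam_qc   // stand-in for PARABRICKS_FQ2BAM.out.qc_metrics
    ch_fq2bam_dups // stand-in for PARABRICKS_FQ2BAM.out.duplicate_metrics

    main:
    // Same collection pattern as in the subworkflow: start empty, then mix
    ch_qc_metrics        = Channel.empty().mix(ch_fq2bam_qc)
    ch_duplicate_metrics = Channel.empty().mix(ch_fq2bam_dups)

    emit:
    qc_metrics        = ch_qc_metrics        // channel: [ [meta], qc_metrics ]
    duplicate_metrics = ch_duplicate_metrics // channel: [ [meta], duplicate_metrics ]
}

workflow {
    // Placeholder tuples; the file names are assumptions for illustration only
    COLLECT_METRICS(
        Channel.of([[id: 'test'], 'test.qc_metrics']),
        Channel.of([[id: 'test'], 'test.duplicate_metrics.txt'])
    )
    COLLECT_METRICS.out.qc_metrics.view()
    COLLECT_METRICS.out.duplicate_metrics.view()
}
```

In the subworkflow itself this would likely amount to adding `qc_metrics = ch_qc_metrics` and `duplicate_metrics = ch_duplicate_metrics` to the existing `emit:` block, so that callers can consume them.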


    // Apply BQSR
    PARABRICKS_APPLYBQSR(
        PARABRICKS_FQ2BAM.out.bam,
        PARABRICKS_FQ2BAM.out.bai,
        PARABRICKS_FQ2BAM.out.bqsr_table,
        ch_interval_file,
        ch_fasta
    )

    ch_versions = ch_versions.mix(PARABRICKS_FQ2BAM.out.versions)
    ch_versions = ch_versions.mix(PARABRICKS_APPLYBQSR.out.versions)

    emit:
    bam      = PARABRICKS_APPLYBQSR.out.bam // channel: [ [meta], bam ]
    bai      = PARABRICKS_APPLYBQSR.out.bai // channel: [ [meta], bai ]
    versions = ch_versions                  // channel: [ versions.yml ]

}
51 changes: 51 additions & 0 deletions subworkflows/nf-core/fastq_align_parabricks/meta.yml
@@ -0,0 +1,51 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json
name: "fastq_align_parabricks"
## TODO nf-core: Add a description of the subworkflow and list keywords
description: Sort SAM/BAM/CRAM file
keywords:
  - sort
  - bam
  - sam
  - cram
## TODO nf-core: Add a list of the modules and/or subworkflows used in the subworkflow
components:
  - parabricks/fq2bam
  - parabricks/applybqsr
## TODO nf-core: List all of the channels used as input with a description and their structure
input:
  - ch_bam:
      type: file
      description: |
        The input channel containing the BAM/CRAM/SAM files
        Structure: [ val(meta), path(bam) ]
      pattern: "*.{bam,cram,sam}"
## TODO nf-core: List all of the channels used as output with a descriptions and their structure
output:
  - bam:
      type: file
      description: |
        Channel containing BAM files
        Structure: [ val(meta), path(bam) ]
      pattern: "*.bam"
  - bai:
      type: file
      description: |
        Channel containing indexed BAM (BAI) files
        Structure: [ val(meta), path(bai) ]
      pattern: "*.bai"
  - csi:
      type: file
      description: |
        Channel containing CSI files
        Structure: [ val(meta), path(csi) ]
      pattern: "*.csi"
  - versions:
      type: file
      description: |
        File containing software versions
        Structure: [ path(versions.yml) ]
      pattern: "versions.yml"
authors:
  - "@famosab"
maintainers:
  - "@famosab"
13 changes: 13 additions & 0 deletions subworkflows/nf-core/fastq_align_parabricks/tests/lowmem.config
@@ -0,0 +1,13 @@
process {

    withName: 'PARABRICKS_FQ2BAM' {
        ext.args = '--low-memory'
    }

    withName: 'PARABRICKS_APPLYBQSR' {
        stageInMode = 'copy'
    }
    // Ref: https://forums.developer.nvidia.com/t/problem-with-gpu/256825/6
    // Parabricks’s fq2bam requires 24GB of memory.
    // Using --low-memory for testing
}
105 changes: 105 additions & 0 deletions subworkflows/nf-core/fastq_align_parabricks/tests/main.nf.test
@@ -0,0 +1,105 @@
nextflow_workflow {

    name "Test Subworkflow FASTQ_ALIGN_PARABRICKS"
    script "../main.nf"
    workflow "FASTQ_ALIGN_PARABRICKS"

    tag "subworkflows"
    tag "subworkflows_nfcore"
    tag "subworkflows/fastq_align_parabricks"
    tag "parabricks"
    tag "parabricks/fq2bam"
    tag "parabricks/applybqsr"
    tag "bwa"
    tag "bwa/index"

    config "./lowmem.config"

    setup {
        run("BWA_INDEX") {
            script "../../../../modules/nf-core/bwa/index/main.nf"
            process {
                """
                input[0] = Channel.of([
                    [ id:'test' ], // meta map
                    file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)
                ])
                """
            }
        }
    }

    test("sarscov2 - fastq.gz - single end") {

        when {
            workflow {
                """
                input[0] = Channel.of([
                    [ id:'test', single_end:true ],
                    [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true)]
                ])
                input[1] = Channel.value([
                    [id: 'reference'],
                    file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)
                ])
                input[2] = BWA_INDEX.out.index
                input[3] = Channel.value([
                    [id: 'intervals'],
                    file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/picard/baits.interval_list', checkIfExists: true)
                ])
                input[4] = file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true)
                """
            }
        }

        then {
            assertAll(
                { assert workflow.success },
                { assert snapshot(
                    bam(workflow.out.bam[0][1]).getReadsMD5(),
                    file(workflow.out.bai[0][1]).name,
                    workflow.out.versions
                    ).match()
                }
            )
        }
    }

    test("sarscov2 - fastq.gz - paired end") {

        when {
            workflow {
                """
                input[0] = Channel.of([[ id:'test', single_end:false ],
                    [
                        file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true),
                        file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true)
                    ]
                ])
                input[1] = Channel.value([
                    [id: 'reference'],
                    file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)
                ])
                input[2] = BWA_INDEX.out.index
                input[3] = Channel.value([
                    [id: 'intervals'],
                    file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/picard/baits.interval_list', checkIfExists: true)
                ])
                input[4] = file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true)
                """
            }
        }

        then {
            assertAll(
                { assert workflow.success },
                { assert snapshot(
                    bam(workflow.out.bam[0][1]).getReadsMD5(),
                    file(workflow.out.bai[0][1]).name,
                    workflow.out.versions
                    ).match()
                }
            )
        }
    }
}
137 changes: 137 additions & 0 deletions subworkflows/nf-core/fastq_align_parabricks/tests/main.nf.test.snap
@@ -0,0 +1,137 @@
{
"fastq_align_parabricks_single_end": {
"content": [
{
"0": [

],
"1": [

],
"2": [

],
"bai": [

],
"bam": [

],
"versions": [

]
}
],
"meta": {
"nf-test": "0.9.1",
"nextflow": "24.04.4"
},
"timestamp": "2024-10-28T16:07:36.691396"
},
"sarscov2 - fastq.gz - single end": {
"content": [
{
"0": [
[
{
"id": "test",
"single_end": true
},
"test.bam:md5,a989684840fb07e1cf8c74cc2208b283"
]
],
"1": [
[
{
"id": "test",
"single_end": true
},
"test.bam.bai:md5,b42d497c4b8bfda390bc49777fafee75"
]
],
"2": [
"versions.yml:md5,4d671c4d60b6a0279cfca507525daa77"
],
"bai": [
[
{
"id": "test",
"single_end": true
},
"test.bam.bai:md5,b42d497c4b8bfda390bc49777fafee75"
]
],
"bam": [
[
{
"id": "test",
"single_end": true
},
"test.bam:md5,a989684840fb07e1cf8c74cc2208b283"
]
],
"versions": [
"versions.yml:md5,4d671c4d60b6a0279cfca507525daa77"
]
}
],
"meta": {
"nf-test": "0.9.2",
"nextflow": "24.10.0"
},
"timestamp": "2024-11-18T11:42:27.869160615"
},
"sarscov2 - fastq.gz - paired end": {
"content": [
{
"0": [
[
{
"id": "test",
"single_end": false
},
"test.bam:md5,729ae53d6dcb627b478d5e3aa454dd90"
]
],
"1": [
[
{
"id": "test",
"single_end": false
},
"test.bam.bai:md5,ad5084ca0975b685e0f36322ca2fa137"
]
],
"2": [
"versions.yml:md5,4d671c4d60b6a0279cfca507525daa77"
],
"bai": [
[
{
"id": "test",
"single_end": false
},
"test.bam.bai:md5,ad5084ca0975b685e0f36322ca2fa137"
]
],
"bam": [
[
{
"id": "test",
"single_end": false
},
"test.bam:md5,729ae53d6dcb627b478d5e3aa454dd90"
]
],
"versions": [
"versions.yml:md5,4d671c4d60b6a0279cfca507525daa77"
]
}
],
"meta": {
"nf-test": "0.9.2",
"nextflow": "24.10.0"
},
"timestamp": "2024-11-18T11:43:11.246974608"
}
}