Sarek bcftools normalization #1682
base: dev
Changes from 19 commits
@@ -16,6 +16,30 @@

process {

    withName: 'GERMLINE_VCFS_NORM'{
        ext.args = { [
            '--multiallelics -both', // split multiallelic sites (both SNPs and indels) into biallelic records
            '--rm-dup all'           // if a record is present multiple times, output only the first instance
        ].join(' ') }
        ext.when = { params.concatenate_vcfs }
        publishDir = [
            mode: params.publish_dir_mode,
            path: { "${params.outdir}/variant_calling/concat/${meta.id}/" }
        ]
    }

    withName: 'VCFS_NORM'{
        ext.args = { [
            '--multiallelics -both', // split multiallelic sites (both SNPs and indels) into biallelic records
            '--rm-dup all'           // if a record is present multiple times, output only the first instance
        ].join(' ') }
        ext.when = { params.normalized_vcfs }
        publishDir = [
            mode: params.publish_dir_mode,
            path: { "${params.outdir}/variant_calling/normalized/${meta.id}/" }
        ]
    }

    withName: 'GERMLINE_VCFS_CONCAT'{
        ext.args = { "-a" }
        ext.when = { params.concatenate_vcfs }

@@ -34,11 +58,25 @@ process {
        ]
    }

    withName: 'VCFS__SORT'{
        ext.prefix = { "${meta.id}.norm" }
        ext.when = { params.normalized_vcfs }
        publishDir = [
            mode: params.publish_dir_mode,
            path: { "${params.outdir}/variant_calling/normalized/${meta.id}/" }
        ]
    }

    withName: 'TABIX_EXT_VCF' {
        ext.prefix = { "${input.baseName}" }
        ext.when = { params.concatenate_vcfs }
    }

    withName: 'TABIX_VCF' {
        ext.prefix = { "${input.baseName}" }
        ext.when = { params.normalized_vcfs }
    }

These ones can be combined.

Looking at the module, I think you can actually output the tbi in the same process as the vcf, so there is no need to spin up an extra process for it.

I've checked the bcftools_norm module and it needs both vcf and tbi as inputs, so I guess we can't exclude it. Or do you mean to output the tbi from the variant callers directly? What could be excluded, though, is the tabix step at the end after sorting, right? The tbi is output by the bcftools norm process and passed on to bcftools sort, so we should end up with a sorted vcf and tbi at the end.

    withName: 'TABIX_GERMLINE_VCFS_CONCAT_SORT'{
        ext.prefix = { "${meta.id}.germline" }
        ext.when = { params.concatenate_vcfs }

@@ -47,4 +85,13 @@ process {
            path: { "${params.outdir}/variant_calling/concat/${meta.id}/" }
        ]
    }

    withName: 'TABIX_VCFS_INDEX'{
        ext.prefix = { "${meta.id}.norm" }
        ext.when = { params.normalized_vcfs }
        publishDir = [
            mode: params.publish_dir_mode,
            path: { "${params.outdir}/variant_calling/norm/${meta.id}/" }
        ]
    }
}
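
The review suggestion that the `TABIX_EXT_VCF` and `TABIX_VCF` blocks "can be combined" could be sketched roughly as follows. Nextflow process selectors accept a regular expression, so one `withName` block can match both processes. The merged `ext.when` condition is an assumption made for illustration: it differs from the two separate conditions in the diff and would need checking against how the subworkflow wires these processes.

```groovy
process {

    // One selector for both tabix processes (regex alternation).
    withName: 'TABIX_EXT_VCF|TABIX_VCF' {
        ext.prefix = { "${input.baseName}" }
        // Assumption: it is acceptable for either process to be enabled
        // whenever one of the two options is set. In the diff each process
        // is gated by exactly one of these parameters.
        ext.when   = { params.concatenate_vcfs || params.normalized_vcfs }
    }
}
```

If the two conditions have to stay distinct, keeping the two original blocks is the simpler choice.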
Can you run
This could be condensed into a single configuration. Why are you publishing the normalised vcfs into two different subdirectories?
The Tabix below is not published in the same way, so we would end up with the tbi in a different directory.
I explained it below: there are two different processes, one for normalisation and concatenation of germline vcfs and one for normalisation of all vcfs.
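
As a rough sketch of the "single configuration" idea raised in this thread: in the diff, `VCFS__SORT` and `TABIX_VCFS_INDEX` already share the same `ext.prefix` and `ext.when` and differ only in the publish path (`normalized/` versus `norm/`). Assuming both outputs are meant to land in the same place, a single regex selector would keep the vcf and its tbi in one directory. The process names are copied from the diff, and the choice of `normalized/` as the shared directory is an assumption.

```groovy
process {

    // Sorted, normalised vcf and its index configured together,
    // so both are published to the same directory.
    withName: 'VCFS__SORT|TABIX_VCFS_INDEX' {
        ext.prefix = { "${meta.id}.norm" }
        ext.when   = { params.normalized_vcfs }
        publishDir = [
            mode: params.publish_dir_mode,
            // Assumption: 'normalized' is the intended directory name.
            path: { "${params.outdir}/variant_calling/normalized/${meta.id}/" }
        ]
    }
}
```

The germline concatenation processes would keep their own blocks, since they are gated by `params.concatenate_vcfs` and publish under `concat/`.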