-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Kraken to Krona conversion to blastoff.wdl #544
base: master
Are you sure you want to change the base?
Conversation
@@ -414,3 +414,13 @@ workflows: | |||
primaryDescriptorPath: /pipes/WDL/workflows/create_enterics_qc_viz_general.wdl | |||
testParameterFiles: | |||
- /empty.json | |||
- name: blastoff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you insert these added blocks above, so they're in alphabetical order with the other task names?
@@ -0,0 +1,390 @@ | |||
version 1.0 | |||
|
|||
task trim_rmdup_subsamp { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for exposing assembly.py trim_rmdup_subsamp
in a WDL task. Since the functionality isn't metablast-specific, could you move the task to tasks_read_utils.wdl
? (and then change the location it's imported from in the workflows using it)
assembly.py trim_rmdup_subsamp \ | ||
"~{inBam}" \ | ||
"~{clipDb}" \ | ||
"$(pwd)/outbam.bam" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you expose the output bam among outputs of this task, so this task can be used independently for applications expecting bam-format?
|
||
#samtools [OutBam -> FASTA] | ||
#-f 4 (f = include only) (4 = unmapped reads) https://broadinstitute.github.io/picard/explain-flags.html | ||
samtools fasta "$(pwd)/outbam.bam" > "~{bam_basename}.fasta" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe wrap the fasta output in a WDL conditional so it is only produced if a task input (emit_fasta
, say) is set to true
? (the task output would then need to have an optional File?
type, since it may or may not exist)
This fasta has the potential to be quite large. Consider compressing it (with xz
) here, and then decompress it in the task consuming it (iff it is compressed).
version 1.0 | ||
|
||
task trim_rmdup_subsamp { | ||
meta { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we may want to mark this task as volatile: true
, but it looks like the samtools-based downsampling in viral-core uses a constant seed, so it should be deterministic and we don't need to mark it as volatile.
echo "--END STAGE 2" | ||
echo "---START STAGE 3" | ||
|
||
#Stage 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot is going on in this stage. Can inline comments be added to walk through what's happening?
(and maybe also split up the long lines across multiple lines)
} | ||
parameter_meta { | ||
krona_taxonomy_tab: { | ||
description: "Krona taxonomy database containing a single file: 'taxonomy.tab' (exact name), or possibly just a compressed 'taxonomy.tab" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would likely help users if the description says something about how to create or where to obtain a taxonomy.tab
file (ex. "[...] as created by ktUpdateTaxonomy")
@@ -0,0 +1,56 @@ | |||
version 1.0 | |||
|
|||
import "../tasks/tasks_megablast.wdl" as tools |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe change the input alias to something more descriptive like metablast
rather than tools
(& change usage below to match)
@@ -0,0 +1,30 @@ | |||
version 1.0 | |||
|
|||
import "../tasks/tasks_megablast.wdl" as tools |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above about renaming the import alias to as megablast
allowNestedInputs: true | ||
} | ||
input { | ||
File inFasta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be consistent, underscores rather than camel-case?
Kraken to Krona Conversion: Utilizes the krona command to convert Kraken output into Krona HTML visualizations, which allows for dynamic visualization of blastoff Kraken outputs.