-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changed count-collection tasks to use NIO for BAMs in CNV WDLs. #5015
Conversation
This brings read-count + allelic-count collection down to ~34 cents / TCGA WGS BAM from ~57 cents / BAM. @LeeTL1220 TAG team said they would like to wait for this change to be in FireCloud before starting their tests. When should we make it (along with the necessary changes to get tests passing here---and should that be a fork)? |
@jnktsj see this PR to get an idea of how minimal the changes would be to switch the FC Featured versions of the ModelSegments WDLs to use NIO. For various reasons, I cannot easily make the switch to the WDLs in the repo here (as you can see, this causes tests to fail). So if you would like to go ahead and start testing, I would suggest that you simply clone the Featured WDLs and make the changes yourself, if that's something you're comfortable with. |
@samuelklee Do we want to do this before we have the NIO changes (supported in cromwell 34) new in WDL? Otherwise, we have to fork. |
I'm fine with adding this whenever is most convenient, definitely no rush. |
@samuelklee If you want to retain call caching once you have NIO you can use:
For all of the parameters that were Files and are now Strings. Also you should no longer need to input the bam_idx at all (if it's sitting next to the bam in the bucket). |
@samuelklee What's the status of this? Now that the default version of cromwell in gatk is v36 should this be updated? |
I think we are also waiting for FC to update, so that NIO can be call cached. Not sure what the status is on that. |
Closing for now, should address in #6502. |
Going ahead and making this change so TAG team can start trying out the ModelSegments pipeline.
It's possible that we could use NIO for other files (reference, interval lists), but this will not have as much impact as using it for BAMs (and in some cases, decreases performance, see comments in #4806).
Closes #4806.