Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ScatterIntervals produces interval_list instead of intervals #5392

Merged
merged 1 commit into from
Nov 6, 2018

Conversation

lbergelson
Copy link
Member

@ldgauthier

@samuelklee
Copy link
Contributor

@lbergelson Don't forget to change the WDLs, e.g., the CNN WDL task still globs *.intervals output.

Is there any difference between the functionality of this tool and that provided by Picard's IntervalListTools, @davidbenjamin @lucidtronix? I only ask now because @yfarjoun recently added some functionality to IntervalListTools that will allow us to scatter by interval count (rather than genomic length), which will replace a hacky bit of bash code that currently is responsible for doing this scatter in the gCNV WDL when I get around to it. So it might be worth revisiting.

@davidbenjamin
Copy link
Contributor

@samuelklee From ~10 minutes of reading code it looks like the functionality is the same. It's been a while but I suspect we created the GATK tool to get rid of a Picard jar dependency.

@ldgauthier
Copy link
Contributor

@lbergelson you beat me because I was stuck trying to actually run a Picard tool in the integration test. (For future reference, that needs a workaround because the test running adds the ERROR level logging to all command lines and Barclay can't parse that for Picard tools for some reason.)

The big reason I was using this instead of IntervalListTools is because the Picard version creates a terrible output file structure that I was having trouble capturing with a simple glob in WDL. I agree that the functionality here is largely redundant, but it was helping me get my workflow working faster at the moment.

public void testOneIntervalAlternateExtension() {
final int scatterCount = 5;
final File outputDir = createTempDir("output");
final String extension = "-scattered.with.a.wierd.extension";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously Picard won't read this in, but would the GATK engine? I'm a little worried about giving users the (BB) gun to shoot themselves in the foot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured we'd want a way to let users specify the old pattern if they want.

* This matches what picard expects an interval list to be named.
* Added a new --extension argument to allow changing the extension.
* Fixes #5390
@lbergelson lbergelson force-pushed the lb_split_intervals_extension branch from 6d4847f to 1ac0038 Compare November 5, 2018 20:35
@codecov-io
Copy link

codecov-io commented Nov 5, 2018

Codecov Report

Merging #5392 into master will increase coverage by 0.003%.
The diff coverage is 100%.

@@               Coverage Diff               @@
##              master     #5392       +/-   ##
===============================================
+ Coverage     86.981%   86.983%   +0.003%     
- Complexity     30420     30421        +1     
===============================================
  Files           1850      1850               
  Lines         140932    140944       +12     
  Branches       15505     15505               
===============================================
+ Hits          122584    122598       +14     
+ Misses         12709     12707        -2     
  Partials        5639      5639
Impacted Files Coverage Δ Complexity Δ
...ender/utils/nio/SeekableByteChannelPrefetcher.java 78.313% <ø> (ø) 27 <0> (ø) ⬇️
...itute/hellbender/tools/walkers/SplitIntervals.java 88.889% <100%> (+0.654%) 6 <1> (ø) ⬇️
...r/tools/walkers/SplitIntervalsIntegrationTest.java 100% <100%> (ø) 12 <4> (+1) ⬆️
...typecaller/PairHMMLikelihoodCalculationEngine.java 87.662% <100%> (ø) 38 <1> (ø) ⬇️
...e/hellbender/engine/spark/SparkContextFactory.java 73.973% <0%> (+2.74%) 11% <0%> (ø) ⬇️

Copy link
Contributor

@ldgauthier ldgauthier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@lbergelson lbergelson merged commit 2b48dcf into master Nov 6, 2018
@lbergelson lbergelson deleted the lb_split_intervals_extension branch November 6, 2018 16:08
EdwardDixon pushed a commit to EdwardDixon/gatk that referenced this pull request Nov 9, 2018
…stitute#5392)

* This matches what picard expects an interval list to be named.
* Added a new --extension argument to allow changing the extension .
* Fixes broadinstitute#5390
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants