Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adaptive pruning option for local assembly #5473

Merged
merged 6 commits into from
Dec 5, 2018
Merged

Conversation

davidbenjamin
Copy link
Contributor

Closes #4867.

@takutosato Here it is. I'm not quite ready to make it the M2 default, but it looks really good.

@meganshand I have tested it on every mixture in your workspace and results look very similar to the previous hand-tuned pruning results. I'm hoping it's good enough to become best practices for mitochondria and would appreciate if you gave it a shot. You have the right to review if you wish but there's no pressure to do so.

@ldgauthier HaplotypeCaller might also benefit from this. In particular, I wonder about #3697. I'll test it out.

@codecov-io
Copy link

codecov-io commented Dec 2, 2018

Codecov Report

Merging #5473 into master will increase coverage by 0.088%.
The diff coverage is 94.631%.

@@              Coverage Diff               @@
##              master    #5473       +/-   ##
==============================================
+ Coverage     86.982%   87.07%   +0.088%     
- Complexity     31186    31244       +58     
==============================================
  Files           1914     1922        +8     
  Lines         144117   144210       +93     
  Branches       15933    15916       -17     
==============================================
+ Hits          125356   125564      +208     
+ Misses         13006    12872      -134     
- Partials        5755     5774       +19
Impacted Files Coverage Δ Complexity Δ
...ools/walkers/haplotypecaller/graphs/BaseGraph.java 82.49% <ø> (-0.135%) 93 <0> (-1)
...lotypecaller/readthreading/ReadThreadingGraph.java 88.608% <ø> (ø) 144 <0> (ø) ⬇️
...walkers/haplotypecaller/HaplotypeCallerEngine.java 78.125% <0%> (ø) 74 <0> (ø) ⬇️
...der/tools/walkers/mutect/M2ArgumentCollection.java 88% <100%> (+0.5%) 9 <1> (+1) ⬆️
...kers/haplotypecaller/AssemblyBasedCallerUtils.java 77.869% <100%> (+1.125%) 35 <0> (ø) ⬇️
.../readthreading/ReadThreadingAssemblerUnitTest.java 98.712% <100%> (-0.005%) 38 <0> (ø)
...der/tools/walkers/haplotypecaller/graphs/Path.java 95.161% <100%> (+0.424%) 26 <2> (+2) ⬆️
...ecaller/AssemblyBasedCallerArgumentCollection.java 100% <100%> (ø) 3 <1> (+2) ⬆️
...hellbender/tools/walkers/mutect/Mutect2Engine.java 90.116% <100%> (+0.235%) 65 <2> (+2) ⬆️
...pecaller/readthreading/ReadThreadingAssembler.java 68.539% <100%> (+0.041%) 51 <2> (-1) ⬇️
... and 39 more

@meganshand
Copy link
Contributor

@davidbenjamin Thank you for doing this! Can you share the results you got on the mixtures? I'd be happy to try out this branch on our technical replicates next week.

@ldgauthier
Copy link
Contributor

@davidbenjamin Fantastic! Have you talked to Sarah yet? Should I pass along a jar?

@davidbenjamin
Copy link
Contributor Author

@ldgauthier I gave her one about ten days ago. It looks fine so far on her RNA data.

@davidbenjamin
Copy link
Contributor Author

@meganshand I ran the "Full Pipeline" workflows in a clone of your FC workspace: https://portal.firecloud.org/#workspaces/broad-firecloud-dsde/copy-of-megans-m2-mito-validations. I did not run any of the things that generate graphs because they were harder for me to understand. To compare the new results to your previous ones, I took all variants that were either PASS or had only the contamination filter applied, extracted just the locus and alleles columns, then manually inspected the diff. For the 5% and 50% spike-ins there were usually no differences at all, while for the 1% spike-in the difference was usually 2-5 variants that straddled the LOD threshold.

@ldgauthier
Copy link
Contributor

ldgauthier commented Dec 3, 2018 via email

@davidbenjamin
Copy link
Contributor Author

@takutosato Based on all of our validations I added a commit to make this the default for M2. Because M2 shares a nested argument collection with HaplotypeCaller, this was pretty awkward. Louis told me this was the best among bad options.

Copy link
Contributor

@takutosato takutosato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very cool method. Just a couple very minor comments. Looks good otherwise.


import java.util.*;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some of these imports are not used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

return FastMath.max(leftLogOdds, rightLogOdds);
}

// is the chain
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either remove or add a doc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


import static org.testng.Assert.*;

public class AdaptiveChainPrunerUnitTest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either remove or add tests here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted -- the tests are all in ChainPRunerUnitTest.

@davidbenjamin
Copy link
Contributor Author

back to @takutosato

@davidbenjamin
Copy link
Contributor Author

Oh wait, approving review, got it.

@davidbenjamin davidbenjamin merged commit 079d34a into master Dec 5, 2018
@davidbenjamin davidbenjamin deleted the db_pruning branch December 5, 2018 23:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants