Adaptive pruning option for local assembly #5473
Codecov Report
@@             Coverage Diff               @@
##             master    #5473      +/-    ##
===============================================
+ Coverage    86.982%   87.07%   +0.088%
- Complexity    31186    31244       +58
===============================================
  Files          1914     1922        +8
  Lines        144117   144210       +93
  Branches      15933    15916       -17
===============================================
+ Hits         125356   125564      +208
+ Misses        13006    12872      -134
- Partials       5755     5774       +19
@davidbenjamin Thank you for doing this! Can you share the results you got on the mixtures? I'd be happy to try out this branch on our technical replicates next week.
@davidbenjamin Fantastic! Have you talked to Sarah yet? Should I pass along a jar?
@ldgauthier I gave her one about ten days ago. It looks fine so far on her RNA data.
@meganshand I ran the "Full Pipeline" workflows in a clone of your FC workspace: https://portal.firecloud.org/#workspaces/broad-firecloud-dsde/copy-of-megans-m2-mito-validations. I did not run any of the things that generate graphs because they were harder for me to understand. To compare the new results to your previous ones, I took all variants that were either PASS or had only the contamination filter applied, extracted just the locus and alleles columns, then manually inspected the diff. For the 5% and 50% spike-ins there were usually no differences at all, while for the 1% spike-in the difference was usually 2-5 variants that straddled the LOD threshold.
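For anyone who wants to repeat that comparison without eyeballing a diff, here is a rough sketch of the same steps in Java. It is not the procedure that was actually run (that was done by hand). The class name `VcfCallDiff`, its method names, and the filter string `contamination` are assumptions made for illustration, and it reads plain tab-separated VCF lines rather than going through htsjdk.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical helper, not part of GATK.
final class VcfCallDiff {
    // Assumed name of the contamination filter; check the FILTER lines in your VCF header.
    static final String CONTAMINATION = "contamination";

    /**
     * Keeps records whose FILTER column is PASS or exactly the contamination
     * filter, and reduces each one to "CHROM:POS REF>ALT".
     */
    static Set<String> keptCalls(List<String> vcfLines) {
        return vcfLines.stream()
                .filter(line -> !line.isEmpty() && !line.startsWith("#")) // skip header lines
                .map(line -> line.split("\t"))
                .filter(f -> f.length >= 7)                               // need columns up to FILTER
                .filter(f -> f[6].equals("PASS") || f[6].equals(CONTAMINATION))
                .map(f -> f[0] + ":" + f[1] + " " + f[3] + ">" + f[4])
                .collect(Collectors.toCollection(TreeSet::new));
    }

    /** Returns the calls present in exactly one of the two sets. */
    static Set<String> symmetricDifference(Set<String> a, Set<String> b) {
        Set<String> onlyA = new TreeSet<>(a);
        onlyA.removeAll(b);
        Set<String> onlyB = new TreeSet<>(b);
        onlyB.removeAll(a);
        onlyA.addAll(onlyB);
        return onlyA;
    }
}
```

Calling `symmetricDifference(keptCalls(oldLines), keptCalls(newLines))` on the lines of the previous and new VCFs lists the calls that appear in only one of the two runs, which is where variants straddling the LOD threshold would show up.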
Great, thanks!
(Email reply to the comment above, which David Benjamin posted on Mon, Dec 3, 2018 at 10:15 AM.)
--
Laura Doyle Gauthier, Ph.D.
@takutosato Based on all of our validations, I added a commit to make this the default for M2. Because M2 shares a nested argument collection with HaplotypeCaller, this was pretty awkward. Louis told me this was the best among bad options.
This is a very cool method. Just a couple very minor comments. Looks good otherwise.
import java.util.*;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
Some of these imports are unused.
fixed
return FastMath.max(leftLogOdds, rightLogOdds);
}

// is the chain
either remove or add a doc
fixed
import static org.testng.Assert.*;

public class AdaptiveChainPrunerUnitTest {
Either remove or add tests here
deleted -- the tests are all in ChainPrunerUnitTest.
back to @takutosato
Oh wait, approving review, got it.
Closes #4867.
@takutosato Here it is. I'm not quite ready to make it the M2 default, but it looks really good.
@meganshand I have tested it on every mixture in your workspace and the results look very similar to the previous hand-tuned pruning results. I'm hoping it's good enough to become best practices for mitochondria and would appreciate it if you gave it a shot. You have the right to review if you wish, but there's no pressure to do so.
@ldgauthier HaplotypeCaller might also benefit from this. In particular, I wonder about #3697. I'll test it out.