Abolish unfilled likelihoods and revamp VariantAnnotator #6172
Some questions about preserving old test behavior.
.../java/org/broadinstitute/hellbender/cmdline/argumentcollections/DbsnpArgumentCollection.java
src/main/java/org/broadinstitute/hellbender/tools/walkers/annotator/VariantAnnotator.java
src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/VariantRecalibrator.java
.../resources/org/broadinstitute/hellbender/tools/walkers/annotator/VariantAnnotator/indels.vcf
...a/org/broadinstitute/hellbender/tools/walkers/annotator/VariantAnnotatorIntegrationTest.java
final ArgumentsBuilder args = new ArgumentsBuilder()
        .addVCF(inputVCF)
        .addOutput(outputVCF)
        .addArgument(DbsnpArgumentCollection.DBSNP_LONG_NAME, dbsnp_138_b37_20_21_vcf);
Are we supposed to match alleles with dbSNP now? There certainly was a time when we didn't, i.e. a different ALT at the same position would still get annotated.
That's unchanged. The dbSNP annotation still cares only about position.
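To make the position-only behavior concrete, here is a minimal sketch of a lookup that ignores the ALT allele. It is not the GATK implementation; the class `PositionOnlyDbsnpLookup` and its methods are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of a position-only known-site lookup (not GATK code). */
final class PositionOnlyDbsnpLookup {
    // Key is "contig:position"; ALT alleles are deliberately not part of the key.
    private final Map<String, String> idsByPosition = new HashMap<>();

    void addKnownSite(String contig, int position, String rsId) {
        idsByPosition.put(key(contig, position), rsId);
    }

    /** Returns the known ID at this position, whatever ALT allele the query carries. */
    Optional<String> findId(String contig, int position, String altAllele) {
        // altAllele is ignored on purpose: only the position is compared.
        return Optional.ofNullable(idsByPosition.get(key(contig, position)));
    }

    private static String key(String contig, int position) {
        return contig + ":" + position;
    }
}
```

Under this assumption, a call with a different ALT at a known position still receives the ID, which is the behavior described in the question above.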
0cef5f2 to 5ffdb83
Back to @ldgauthier
LGTM!
Here's the rundown of this PR:
We had this awkward implementation of AlleleLikelihoods called UnfilledReadLikelihoods, which didn't contain likelihoods but did have pileups. This was necessary because the annotation engine expects likelihoods but VariantAnnotator only has the reads. Several annotations had fallback code to annotate a likelihoods object that had no likelihoods. @vruano This is the class that you disliked so much in your recent code review of #5783.
A few issues with this state of things:
- It was an implementation of AlleleLikelihoods, which forced that class to have methods like hasLikelihoods().
- VariantAnnotator only applied the few annotations that had custom pileup-based fallback code.
So the first step was the option that @lbergelson and @jamesemery liked most: create a regular likelihoods object in VariantAnnotator by hard-assigning each read to the allele it best matches. This is exactly what all the custom fallback modes were doing in effect, but now it's implemented in one place instead of six or so. This lets us delete UnfilledLikelihoods and also lets VariantAnnotator apply any annotation.
@ldgauthier Since the most non-trivial aspect is the new integration test I'm inclined to assign you the review, but a case could be made for someone on the engine team.
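The hard-assignment idea can be sketched roughly as follows. This is an illustration only, not the code in this PR: the class `HardAssignmentSketch`, the base-by-base `matchCount` score, and the fixed penalty `NON_BEST_LOG10_LIKELIHOOD` are all assumptions chosen to keep the example small.

```java
import java.util.Arrays;

/** Hypothetical sketch: build a likelihood matrix by hard-assigning reads to alleles. */
final class HardAssignmentSketch {
    // Assumed log10 likelihood given to every allele a read was NOT assigned to.
    static final double NON_BEST_LOG10_LIKELIHOOD = -10.0;

    /** Assumed match score: number of offsets where read and allele bases agree. */
    static int matchCount(String read, String allele) {
        int n = Math.min(read.length(), allele.length());
        int matches = 0;
        for (int i = 0; i < n; i++) {
            if (read.charAt(i) == allele.charAt(i)) {
                matches++;
            }
        }
        return matches;
    }

    /** Index of the best-matching allele; ties go to the earlier allele. */
    static int bestAllele(String read, String[] alleles) {
        int best = 0;
        for (int a = 1; a < alleles.length; a++) {
            if (matchCount(read, alleles[a]) > matchCount(read, alleles[best])) {
                best = a;
            }
        }
        return best;
    }

    /** Returns a matrix indexed [allele][read]: 0.0 for the assigned allele, the penalty elsewhere. */
    static double[][] hardAssign(String[] reads, String[] alleles) {
        if (alleles.length == 0) {
            throw new IllegalArgumentException("at least one allele is required");
        }
        double[][] log10Likelihoods = new double[alleles.length][reads.length];
        for (double[] row : log10Likelihoods) {
            Arrays.fill(row, NON_BEST_LOG10_LIKELIHOOD);
        }
        for (int r = 0; r < reads.length; r++) {
            log10Likelihoods[bestAllele(reads[r], alleles)][r] = 0.0;
        }
        return log10Likelihoods;
    }
}
```

The point of the sketch is that once every read has a definite best allele, downstream annotations can consume an ordinary likelihoods matrix and need no pileup-specific fallback path.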
This change completely broke the VariantAnnotator tests, which were based on exact matches. Exact matching had been an issue before and has always been a bit of a nuisance, but now overhauling the tests became completely unavoidable. So, I rewrote all the tests and wrote a rigorous test based on concordance with annotations from Mutect2.
If I were reviewing I would start with the new code in VariantAnnotator that constructs the likelihoods object from the reads and verify that it is just a more polished version of the fallback code that several annotations used to have. Then I would look at the new VariantAnnotator integration tests. Some of the tolerances are fairly liberal, but it's worth noting that many of the old exact-match "truth" annotations were completely bogus. This is better than what we had before by a long shot, but it's still use-at-your-own-risk.
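For readers unfamiliar with tolerance-based concordance checks, one possible shape is sketched below. It is an assumption about the general technique, not the actual integration test in this PR; `AnnotationConcordanceSketch` and `concordance` are hypothetical names, and the absolute-difference criterion is just one choice.

```java
/** Hypothetical sketch of a tolerance-based concordance check between two annotation sources. */
final class AnnotationConcordanceSketch {
    /** Fraction of paired values whose absolute difference is within the tolerance. */
    static double concordance(double[] expected, double[] actual, double tolerance) {
        if (expected.length != actual.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        if (expected.length == 0) {
            return 1.0; // nothing to disagree about
        }
        int agree = 0;
        for (int i = 0; i < expected.length; i++) {
            if (Math.abs(expected[i] - actual[i]) <= tolerance) {
                agree++;
            }
        }
        return (double) agree / expected.length;
    }
}
```

A test built this way would assert that the concordance fraction exceeds some threshold instead of requiring every value to match exactly, which is why the choice of tolerance matters.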