-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#251 Address gvcf no-calls missing QUALapprox and other features #7146
Conversation
// create VET output | ||
if (!variant.isReferenceBlock()) { | ||
// write to VET if NOT reference block and NOT a no call (set GQ to 0 for variant) | ||
if (!variant.isReferenceBlock() && !variant.getGenotype(0).isNoCall()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why getGenotype(0)
? what is 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 is the index/position of what getGenotype returns for variant
and in this case it gives back the GT for variant
which then is used to check if its a no-call. aka is ./. = no-call for those that are no calls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if a variant is a no call, it should have only one genotype where isNoCall is true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so is that to say, it's safe to only look at the first item in the list of GTs? a sample could have more than one GT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the code is a little hard to track down, but I do think that for isNoCall to be set to true, this must hold
public static boolean wouldBeNoCallAllele(final byte[] bases) { return bases.length == 1 && bases[0] == htsjdk.variant.vcf.VCFConstants.NO_CALL_ALLELE; }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
variant.getGenotype()
returns a list of lists, one list for each sample. if we are certain that there will always be exactly one sample per gvcf being processed here, then this is safe. I just don't know if this is always going to be true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sushma added a constant and some comments to make this assumption super clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This tools runs on a single-sample gVCF, I think it's totally reasonable to assume there is a single genotype (since there is a single sample and a sample can only have one genotype)
src/main/java/org/broadinstitute/hellbender/tools/variantdb/nextgen/PetTsvCreator.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
...ain/java/org/broadinstitute/hellbender/tools/variantdb/nextgen/CreateVariantIngestFiles.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sample index helps a lot thanks!
// write to VET if NOT reference block and NOT a no call | ||
// getGenotypes() returns list of lists for all samples at variant | ||
// assuming one sample per gvcf, getGenotype(0) retrieves GT for sample at index 0 | ||
final int SAMPLE_INDEX_NOCALL_CHECK = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just style... I think it's fine to not have this pulled out into a variable... but if you do
- all caps typically indicates a static constant (so I would move this up and do that)
- you do the same getGenotype(0) in PetTsvCreator, so it should refer to the same constant
- at that point you might as well pull this into a static method like isNoCall(VariantContext v) can call that where you need it.
BUT -- I would say just doing the call to getGenotype(0) with the above comment is totally appropriate, just right now it's sort of halfway between the two options
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i liked option 3 so i gave it a go, think that it works and i did it correctly since testing all checks out but if i could get eyes on if i organized that method/code physically in the right place within the structure of the code that would be amazing.
@@ -20,6 +20,8 @@ | |||
import java.util.Set; | |||
import java.util.stream.Collectors; | |||
|
|||
import static org.broadinstitute.hellbender.tools.variantdb.nextgen.CreateVariantIngestFiles.isNoCall; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm a little surprised you needed this import. I think it's common when calling static methods on other classes to actually use the class name:
CreateVariantIngestFiles.isNoCall(variant)
so maybe just import the class as a regular import (instead of a static import of the method)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
* write variants with GQ no call to PET as GQ 0 and omit from VET * remove unused imports * update integration test with no_calls vcf file and ingest files * created constant for sample index and modified comments * create static method for isNoCall * remove static import of method
https://github.com/broadinstitute/dsp-spec-ops/issues/251