properly merge AD values when no PLs #7836

Merged
merged 8 commits into from
Jul 14, 2022

Conversation

RoriCremer
Contributor

No description provided.

@RoriCremer RoriCremer requested a review from kcibul May 6, 2022 13:33
Contributor

@kcibul kcibul left a comment

How about a unit test as well? You can likely add one or modify an existing one.

} else if (GenotypeGVCFsEngine.excludeFromAnnotations(g)) {
genotypeBuilder.alleles(Collections.nCopies(ploidy, Allele.NO_CALL));
}
if (g.hasAD()) {
Contributor

I would move this up instead so the code is less branched and duplicative. You've also lost the else-if case when we have ADs. Something like:

if (g.hasPL() || g.hasAD()) {
    // do common stuff (like perSampleIndexesOfRelevantAlleles)
    if (g.hasPL()) {
        // do PL-only stuff
    }
    if (g.hasAD()) {
        // do AD stuff
    }
} else if (GenotypeGVCFsEngine.excludeFromAnnotations(g)) {
    // as is
}
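
The branch structure suggested above can be sketched as a small self-contained Java program. This is only an illustration under stated assumptions: `GenotypeStub`, `MergeBranchSketch`, and `stepsFor` are hypothetical names invented for this sketch, not GATK or htsjdk classes, and the "steps" are plain strings standing in for the real work. It shows the shared index-map computation running once, followed by the PL and AD steps as applicable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a genotype; NOT the htsjdk Genotype class.
final class GenotypeStub {
    private final int[] pl;          // null means "no PLs"
    private final int[] ad;          // null means "no ADs"
    private final boolean excluded;  // stands in for excludeFromAnnotations(g)

    GenotypeStub(int[] pl, int[] ad, boolean excluded) {
        this.pl = pl;
        this.ad = ad;
        this.excluded = excluded;
    }

    boolean hasPL() { return pl != null; }
    boolean hasAD() { return ad != null; }
    boolean isExcluded() { return excluded; }
}

final class MergeBranchSketch {
    // Records, in order, which steps would run for one genotype.
    static List<String> stepsFor(GenotypeStub g) {
        List<String> steps = new ArrayList<>();
        if (g.hasPL() || g.hasAD()) {
            steps.add("indexMap");              // shared work, done once
            if (g.hasPL()) { steps.add("PL"); } // PL-only work
            if (g.hasAD()) { steps.add("AD"); } // AD work
        } else if (g.isExcluded()) {
            steps.add("noCall");                // unchanged fallback branch
        }
        return steps;
    }

    public static void main(String[] args) {
        // AD-only genotype, the case this PR targets.
        System.out.println(stepsFor(new GenotypeStub(null, new int[]{60, 9, 0}, false))); // [indexMap, AD]
    }
}
```

With this shape, a genotype that has both PLs and ADs still computes the shared index map only once, and an AD-only genotype no longer falls through to the no-call branch.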

} else if (GenotypeGVCFsEngine.excludeFromAnnotations(g)) {
genotypeBuilder.alleles(Collections.nCopies(ploidy, Allele.NO_CALL));
}
if (g.hasAD()) {
int[] perSampleIndexesOfRelevantAlleles = AlleleSubsettingUtils.getIndexesOfRelevantAllelesForGVCF(remappedAlleles, targetAlleles, vc.getStart(), g, false);
Collaborator

I have no idea how expensive this operation is. Might it be worth popping it up a scoping level, right under line 540, inside an if (g.hasPL() || g.hasAD()) {... so that it never gets called twice (lines 543 and 554)?

@codecov

codecov bot commented May 6, 2022

Codecov Report

Merging #7836 (00c232b) into master (bd640ea) will increase coverage by 0.409%.
The diff coverage is 97.222%.

@@               Coverage Diff               @@
##              master     #7836       +/-   ##
===============================================
+ Coverage     86.666%   87.075%   +0.409%     
- Complexity     36781     37000      +219     
===============================================
  Files           2214      2221        +7     
  Lines         173551    173955      +404     
  Branches       18737     18795       +58     
===============================================
+ Hits          150409    151471     +1062     
+ Misses         16565     15847      -718     
- Partials        6577      6637       +60     
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...lkers/ReferenceConfidenceVariantContextMerger.java | 95.146% <90.000%> (+0.048%) | ⬆️ |
| ...ferenceConfidenceVariantContextMergerUnitTest.java | 98.030% <100.000%> (+0.277%) | ⬆️ |
| ...ools/walkers/contamination/GetPileupSummaries.java | 79.032% <0.000%> (-6.968%) | ⬇️ |
| ...tools/walkers/haplotypecaller/HaplotypeCaller.java | 85.714% <0.000%> (-6.286%) | ⬇️ |
| ...on/MultisampleMultidimensionalKernelSegmenter.java | 89.375% <0.000%> (-4.375%) | ⬇️ |
| ...stitute/hellbender/tools/walkers/vqsr/Tranche.java | 62.222% <0.000%> (-3.210%) | ⬇️ |
| ...lbender/utils/read/SAMRecordToGATKReadAdapter.java | 91.003% <0.000%> (-1.854%) | ⬇️ |
| ...e/hellbender/tools/sv/cluster/SVClusterEngine.java | 93.269% <0.000%> (-1.002%) | ⬇️ |
| ...tools/walkers/sv/JointGermlineCNVSegmentation.java | 86.047% <0.000%> (-0.752%) | ⬇️ |
| ...lkers/vqsr/VariantRecalibratorIntegrationTest.java | 98.315% <0.000%> (-0.481%) | ⬇️ |
| ... and 106 more | | |

@RoriCremer RoriCremer force-pushed the rc-correctly-merge-AD-values branch from 90d68a8 to 2152ad8 Compare May 10, 2022 17:30
@RoriCremer RoriCremer marked this pull request as ready for review May 10, 2022 17:32
@RoriCremer RoriCremer force-pushed the rc-correctly-merge-AD-values branch from 2152ad8 to b777040 Compare May 10, 2022 20:48
Contributor

@kcibul kcibul left a comment

I'm not sure I understand the test infrastructure correctly, but it seems right and I'm counting on a GATK reviewer to look at that more deeply

@@ -538,21 +538,26 @@ private GenotypesContext mergeRefConfidenceGenotypes(final VariantContext vc,
final int ploidy = g.getPloidy();
final GenotypeBuilder genotypeBuilder = new GenotypeBuilder(g);
if (!doSomaticMerge) {
if (g.hasPL()) {
// lazy initialization of the genotype index map by ploidy.
if (g.hasPL() | g.hasAD()) {
Contributor

Although technically the bitwise or operator | does what you want here, convention is to use conditional-or || unless you really want the non-conditional behavior.
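
The difference between the two operators is easy to see in a minimal sketch. The class `OrOperatorSketch` and its `probe` helper are hypothetical names invented for illustration; they count how many operands actually get evaluated. On boolean operands, `|` always evaluates both sides, while `||` skips the right-hand side once the left is true.

```java
final class OrOperatorSketch {
    private static int calls = 0;

    // Returns its argument and counts the call, so evaluation becomes observable.
    private static boolean probe(boolean value) {
        calls++;
        return value;
    }

    // Non-short-circuit OR: both operands are always evaluated.
    static int callsWithBitwiseOr() {
        calls = 0;
        boolean result = probe(true) | probe(false);
        return result ? calls : -1;
    }

    // Conditional OR: the right operand is skipped when the left is true.
    static int callsWithConditionalOr() {
        calls = 0;
        boolean result = probe(true) || probe(false);
        return result ? calls : -1;
    }

    public static void main(String[] args) {
        System.out.println(callsWithBitwiseOr());     // 2
        System.out.println(callsWithConditionalOr()); // 1
    }
}
```

For side-effect-free checks such as `g.hasPL()` and `g.hasAD()` the result is the same either way, which is why the point here is one of convention rather than correctness.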

tests.add(new Object[]{"test12",Arrays.asList(vcA_C_G_noPLs, vcA_C_G_ALT_noPLs), loc, false, true,
new VariantContextBuilder(VCbase).alleles(A_C_G).genotypes(
new GenotypeBuilder("A_C_G.test2").AD(new int[]{60,9,0}).alleles(noCalls).make(),
new GenotypeBuilder("A_C_G.test").AD(new int[]{60,9,0}).alleles(noCalls).make()).make()});
Contributor

The variant context has Arrays.asList(Aref, C, G, Allele.NON_REF_ALLELE), so the ADs should be of length 4. Do you mean LAD? If this is how you want it for GVS, then I'd like to make that more obvious somehow.

Contributor Author

@RoriCremer RoriCremer Jun 28, 2022

In the test (on line 51) we hardcode filtering out Allele.NON_REF_ALLELE during the merge
( @param removeNonRefSymbolicAllele if true, remove the <NON_REF> allele from the merged VC ).

So we don't expect a corresponding AD value for it (local or otherwise, though this shouldn't be for local AD).
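
To make the expected AD length concrete, here is a minimal sketch of remapping one sample's AD values onto a merged allele list that no longer contains `<NON_REF>`. It is not GATK's implementation: `AdRemapSketch` and `remapAD` are hypothetical names, alleles are plain strings, and the rule that an allele missing from the sample borrows the depth recorded for `<NON_REF>` (or 0 if the sample has no `<NON_REF>`) is an assumption of this sketch. The allele lists and depths are illustrative values.

```java
import java.util.Arrays;
import java.util.List;

final class AdRemapSketch {
    static final String NON_REF = "<NON_REF>";

    // Builds one AD entry per target allele. Assumption of this sketch: a target
    // allele absent from the sample takes the sample's <NON_REF> depth, or 0 if
    // the sample carries no <NON_REF> allele.
    static int[] remapAD(List<String> sampleAlleles, int[] sampleAD, List<String> targetAlleles) {
        int nonRefIndex = sampleAlleles.indexOf(NON_REF);
        int[] merged = new int[targetAlleles.size()];
        for (int i = 0; i < merged.length; i++) {
            int source = sampleAlleles.indexOf(targetAlleles.get(i));
            if (source < 0) {
                source = nonRefIndex; // fall back to <NON_REF>, may still be -1
            }
            merged[i] = source < 0 ? 0 : sampleAD[source];
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("A", "C", NON_REF);
        List<String> target = Arrays.asList("A", "C", "G"); // <NON_REF> already removed
        System.out.println(Arrays.toString(remapAD(sample, new int[]{60, 9, 0}, target))); // [60, 9, 0]
    }
}
```

Because the target list has three alleles once `<NON_REF>` is dropped, the resulting AD array has length 3 rather than 4.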

@RoriCremer
Contributor Author

Merging with blessing from Jonn.

@RoriCremer RoriCremer merged commit 804d1e2 into master Jul 14, 2022
@RoriCremer RoriCremer deleted the rc-correctly-merge-AD-values branch July 14, 2022 17:07
orlicohen pushed a commit that referenced this pull request Jul 18, 2022
* properly merge AD values when no PLs

* dont check for AD twice

* AD test

* conditional or

* update unit tests

* no reason to uniquify sample names

* got a lil test happy

* better comments
@koncheto-broad koncheto-broad restored the rc-correctly-merge-AD-values branch October 13, 2022 16:40
@RoriCremer RoriCremer deleted the rc-correctly-merge-AD-values branch December 13, 2023 20:06
4 participants