HaplotypeCaller GGA mode crashes with certain spanning deletions #5337

Closed
gmagoon opened this issue Oct 22, 2018 · 1 comment

gmagoon commented Oct 22, 2018

Using HaplotypeCaller with GENOTYPE_GIVEN_ALLELES ("GGA") mode, I came across a couple of cases that crashed, and I traced them to spanning deletions (of the type considered in #4963).

The first case involved the following spanning deletion in the --alleles input:

22	16137300	rs567136176	TAG	T
22	16137302	rs573978809	G	C

and it crashed with:

java.lang.IllegalStateException: Allele in genotype TAG* not in the variant context [G*, *, C]
	at htsjdk.variant.variantcontext.VariantContext.validateGenotypes(VariantContext.java:1360)
	at htsjdk.variant.variantcontext.VariantContext.validate(VariantContext.java:1298)
	at htsjdk.variant.variantcontext.VariantContext.<init>(VariantContext.java:401)
	at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:494)
	at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:488)
	at org.broadinstitute.hellbender.utils.variant.GATKVariantContextUtils.simpleMerge(GATKVariantContextUtils.java:864)
	at org.broadinstitute.hellbender.utils.variant.GATKVariantContextUtils.simpleMerge(GATKVariantContextUtils.java:646)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.AssemblyBasedCallerUtils.makeMergedVariantContext(AssemblyBasedCallerUtils.java:221)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerGenotypingEngine.assignGenotypeLikelihoods(HaplotypeCallerGenotypingEngine.java:150)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(HaplotypeCallerEngine.java:599)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.apply(HaplotypeCaller.java:236)
	at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:291)
	at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:267)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
	at org.broadinstitute.hellbender.Main.main(Main.java:289)

The second case included the following --alleles input:

22	16464044	rs571268158	CCAGGTCT	C
22	16464051	rs569099729	T	C

and it crashed similarly, with:

java.lang.IllegalStateException: Allele in genotype CCAGGTCT* not in the variant context [T*, *, C]
	at htsjdk.variant.variantcontext.VariantContext.validateGenotypes(VariantContext.java:1360)
	at htsjdk.variant.variantcontext.VariantContext.validate(VariantContext.java:1298)
	at htsjdk.variant.variantcontext.VariantContext.<init>(VariantContext.java:401)
	at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:494)
	at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:488)
	at org.broadinstitute.hellbender.utils.variant.GATKVariantContextUtils.simpleMerge(GATKVariantContextUtils.java:864)
	at org.broadinstitute.hellbender.utils.variant.GATKVariantContextUtils.simpleMerge(GATKVariantContextUtils.java:646)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.AssemblyBasedCallerUtils.makeMergedVariantContext(AssemblyBasedCallerUtils.java:221)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerGenotypingEngine.assignGenotypeLikelihoods(HaplotypeCallerGenotypingEngine.java:150)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(HaplotypeCallerEngine.java:599)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.apply(HaplotypeCaller.java:236)
	at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:291)
	at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:267)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
	at org.broadinstitute.hellbender.Main.main(Main.java:289)

Based on the discussion surrounding #4963 and the test VCF, I gather that cases like these are intended to work without crashing.

I was trying to figure out what was unique in these problematic cases, compared to the spanning deletion in the aforementioned test VCF. I noticed that the problematic cases both have the SNP at the very last base of the spanning deletion. I'm just speculating here, but maybe the issue is related to some sort of "off-by-one" bug?
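To make the "last base" observation concrete, here is a minimal sketch that checks the arithmetic for the two record pairs above. It is not GATK code: the helper names (`deletion_span`, `snp_offset_in_deletion`) are hypothetical, and it assumes standard VCF conventions (1-based POS, with REF covering POS through POS + len(REF) - 1).

```python
def deletion_span(pos, ref):
    """Return the 1-based inclusive (start, end) reference span of a VCF record.

    Assumes the usual VCF convention: REF starts at POS and covers len(REF) bases.
    """
    return pos, pos + len(ref) - 1


def snp_offset_in_deletion(del_pos, del_ref, snp_pos):
    """Return the 0-based offset of snp_pos within the deletion's REF, or None if outside."""
    start, end = deletion_span(del_pos, del_ref)
    if not (start <= snp_pos <= end):
        return None
    return snp_pos - start


# (deletion POS, deletion REF, SNP POS, SNP REF), copied from the --alleles records above
cases = [
    (16137300, "TAG", 16137302, "G"),
    (16464044, "CCAGGTCT", 16464051, "T"),
]

for del_pos, del_ref, snp_pos, snp_ref in cases:
    start, end = deletion_span(del_pos, del_ref)
    offset = snp_offset_in_deletion(del_pos, del_ref, snp_pos)
    at_last_base = offset == len(del_ref) - 1
    # Sanity check: the SNP's REF base should match the deletion's REF at that offset
    ref_matches = offset is not None and del_ref[offset] == snp_ref
    print(f"deletion {start}-{end}, SNP at {snp_pos}: "
          f"at_last_base={at_last_base}, ref_matches={ref_matches}")
```

In both cases the SNP position equals the computed end of the deletion, and the SNP's REF base matches the last base of the deletion's REF, which is consistent with the pattern described above. This only confirms the coordinates; it says nothing about where in GATK the problem actually lies.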

This is based on testing with version 4.0.9.0.
I also tried version 4.0.5.1, which didn't crash but instead displayed warnings of the type discussed in #4963:
00:02:10.995 WARN HaplotypeCallerEngine - Multiple valid VCF records detected in the alleles input file at site 22:16137302-16137302, only considering the first record
00:03:08.220 WARN HaplotypeCallerEngine - Multiple valid VCF records detected in the alleles input file at site 22:16464051-16464051, only considering the first record

gmagoon commented Oct 22, 2018

Sorry, this is a duplicate of #5336.
I posted twice due to confusion related to recent GitHub issues.

@gmagoon gmagoon closed this as completed Oct 22, 2018