Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GenotypeLikelihoodCalculators ArrayIndexOutOfBoundsException with HaplotypeCallerSpark #4661

Closed
chapmanb opened this issue Apr 16, 2018 · 3 comments

Comments

@chapmanb
Copy link

I'm running in intermittent issues when running HaplotypeCallerSpark with GATK 4.0.3.0 and was hoping to generate ideas to debug further. The underlying error is an index error when calculating likelihoods:

java.lang.ArrayIndexOutOfBoundsException: 4
	at org.broadinstitute.hellbender.tools.walkers.genotyper.GenotypeLikelihoodCalculators.calculateGenotypeCountUsingTables(GenotypeLikelihoodCalculators.java:388)

I've been unable to generate a reproducible test case. Re-running on the same machine (Amazon m4.4xlarge instances with 16 cores and 64Gb of memory) works. I've seen the error on two different datasets but it happens infrequently as I've also run hundreds using the same setup without any exceptions.

The only other thing I spot when looking through the traceback is block issues about the RDDs but I'm not sure if these are a symptom of the failure or a cause:

18/04/15 03:55:19 WARN BlockManager: Putting block rdd_18_12 failed due to an exception
18/04/15 03:55:19 WARN BlockManager: Block rdd_18_12 could not be removed as it was not found on disk or in memory

Here's the full traceback of the failure:

[2018-04-15T03:55Z] ip-10-0-0-57: 18/04/15 03:55:19 WARN BlockManager: Putting block rdd_18_12 failed due to an exception
[2018-04-15T03:55Z] ip-10-0-0-57: 18/04/15 03:55:19 WARN BlockManager: Block rdd_18_12 could not be removed as it was not found on disk or in memory
[2018-04-15T03:55Z] ip-10-0-0-57: 18/04/15 03:55:19 ERROR Executor: Exception in task 12.0 in stage 7.0 (TID 828)
[2018-04-15T03:55Z] ip-10-0-0-57: java.lang.ArrayIndexOutOfBoundsException: 4
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.broadinstitute.hellbender.tools.walkers.genotyper.GenotypeLikelihoodCalculators.calculateGenotypeCountUsingTables(GenotypeLikelihoodCalculators.java:388)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.broadinstitute.hellbender.tools.walkers.genotyper.GenotypeLikelihoodCalculators.getInstance(GenotypeLikelihoodCalculators.java:263)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.broadinstitute.hellbender.utils.variant.GATKVariantContextUtils.makeGenotypeCall(GATKVariantContextUtils.java:282)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.broadinstitute.hellbender.tools.walkers.genotyper.AlleleSubsettingUtils.subsetAlleles(AlleleSubsettingUtils.java:84)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.broadinstitute.hellbender.tools.walkers.genotyper.GenotypingEngine.calculateGenotypes(GenotypingEngine.java:296)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.broadinstitute.hellbender.tools.walkers.genotyper.GenotypingEngine.calculateGenotypes(GenotypingEngine.java:210)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerGenotypingEngine.assignGenotypeLikelihoods(HaplotypeCallerGenotypingEngine.java:150)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(HaplotypeCallerEngine.java:565)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.broadinstitute.hellbender.tools.HaplotypeCallerSpark.lambda$regionToVariants$1(HaplotypeCallerSpark.java:271)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1812)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:169)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:215)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1038)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1029)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:969)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1029)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:760)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.scheduler.Task.run(Task.scala:108)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[2018-04-15T03:55Z] ip-10-0-0-57: 	at java.lang.Thread.run(Thread.java:745)

and in case I am missing anything in how I'm calling HaplotypeCallerSpark, here is the full command line we're using:

gatk-launch --java-options '-Xms1000m -Xmx46965m -Djava.io.tmpdir=/mnt/work/cwl/bcbio_validation_workflows/giab-joint/bunny_work/main-giab-joint-2018-04-14-195952.723/root/variantcall/2/variantcall_batch_region/3/bcbiotx/tmpno7wyh' HaplotypeCallerSpark --reference /mnt/work/cwl/bcbio_validation_workflows/giab-joint/biodata/collections/hg38/ucsc/hg38.2bit --annotation MappingQualityRankSumTest --annotation MappingQualityZero --annotation QualByDepth --annotation ReadPosRankSumTest --annotation RMSMappingQuality --annotation BaseQualityRankSumTest --annotation FisherStrand --annotation MappingQuality --annotation DepthPerAlleleBySample --annotation Coverage -I /mnt/work/cwl/bcbio_validation_workflows/giab-joint/bunny_work/main-giab-joint-2018-04-14-195952.723/root/alignment/1/process_alignment/align/NA24385/NA24385-sort.bam -L /mnt/work/cwl/bcbio_validation_workflows/giab-joint/bunny_work/main-giab-joint-2018-04-14-195952.723/root/variantcall/2/variantcall_batch_region/3/gatk-haplotype/chr15/NA24385-chr15_0_101977614-block-regions.bed --interval-set-rule INTERSECTION --spark-master local[16] --conf spark.local.dir=/mnt/work/cwl/bcbio_validation_workflows/giab-joint/bunny_work/main-giab-joint-2018-04-14-195952.723/root/variantcall/2/variantcall_batch_region/3/bcbiotx/tmpno7wyh --annotation ClippingRankSumTest --annotation DepthPerSampleHC --emit-ref-confidence GVCF -GQB 10 -GQB 20 -GQB 30 -GQB 40 -GQB 60 -GQB 80 --output /mnt/work/cwl/bcbio_validation_workflows/giab-joint/bunny_work/main-giab-joint-2018-04-14-195952.723/root/variantcall/2/variantcall_batch_region/3/bcbiotx/tmpno7wyh/NA24385-chr15_0_101977614-block.vcf

Thanks so much for any clues about how to debug further or avoid the issue.

@tomwhite
Copy link
Contributor

I've just seen this error too when running genome_reads-pipeline_hdfs.sh from https://github.com/broadinstitute/gatk/tree/master/scripts/spark_eval.

It's odd that we get an ArrayIndexOutOfBoundsException even though there's a ensureCapacity call in the previous block...

@tomwhite
Copy link
Contributor

It could be a synchronization issue. E.g. if thread A is asking for an alleleCount of 3 and and thread B an alleleCount of 4, then thread A could grow the array to 3 after thread B grows it to 4 (meaning the array is grown to size 4 but then set back to size 3), but before thread B reads position 4. This read will then fail.

BTW I was seeing quite a lot of task failures, around 10%.

@tomwhite
Copy link
Contributor

Also here's the stacktrace I got:

java.lang.ArrayIndexOutOfBoundsException: 3
	at org.broadinstitute.hellbender.tools.walkers.genotyper.GenotypeLikelihoodCalculators.calculateGenotypeCountUsingTables(GenotypeLikelihoodCalculators.java:388)
	at org.broadinstitute.hellbender.tools.walkers.genotyper.GenotypeLikelihoodCalculators.getInstance(GenotypeLikelihoodCalculators.java:263)
	at org.broadinstitute.hellbender.utils.variant.GATKVariantContextUtils.makeGenotypeCall(GATKVariantContextUtils.java:283)
	at org.broadinstitute.hellbender.tools.walkers.genotyper.AlleleSubsettingUtils.subsetAlleles(AlleleSubsettingUtils.java:92)
	at org.broadinstitute.hellbender.tools.walkers.genotyper.GenotypingEngine.calculateGenotypes(GenotypingEngine.java:296)
	at org.broadinstitute.hellbender.tools.walkers.genotyper.GenotypingEngine.calculateGenotypes(GenotypingEngine.java:210)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerGenotypingEngine.assignGenotypeLikelihoods(HaplotypeCallerGenotypingEngine.java:162)
	at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(HaplotypeCallerEngine.java:596)
	at org.broadinstitute.hellbender.tools.HaplotypeCallerSpark.lambda$regionToVariants$1(HaplotypeCallerSpark.java:282)
	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
	at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1812)
	at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
	at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
	at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:169)
	at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
	at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
	at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:215)
	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1038)
	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1029)
	at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:969)
	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1029)
	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:760)
	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:108)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants