GenotypeLikelihoodCalculators ArrayIndexOutOfBoundsException with HaplotypeCallerSpark #4661
Comments
I've just seen this error too when running It's odd that we get an
It could be a synchronization issue. For example, if thread A asks for an alleleCount of 3 and thread B for an alleleCount of 4, thread A could resize the array to 3 after thread B has grown it to 4 (so the array grows to size 4 and is then set back to size 3) but before thread B reads position 4. That read would then fail. BTW, I was seeing quite a lot of task failures, around 10%.
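To make the suspected race concrete, here is a minimal sketch of a grow-only table where the size check, the resize, and the read all happen under one lock, so a request for a smaller alleleCount can never shrink the table out from under a concurrent reader. The class `GrowOnlyTable`, its methods, and the dummy row contents are hypothetical and chosen for illustration; this is not GATK's actual code, and it assumes the race described above is the real cause.

```java
import java.util.Arrays;

/** Hypothetical lazily grown table; names are illustrative, not taken from GATK. */
final class GrowOnlyTable {
    // Row i stands in for data precomputed for allele count i
    // (here just a dummy array of length i + 1).
    private int[][] rows = new int[0][];

    /** Returns the row for alleleCount, growing the table first if needed. */
    synchronized int[] rowFor(final int alleleCount) {
        if (alleleCount < 0) {
            throw new IllegalArgumentException("alleleCount must be >= 0");
        }
        if (alleleCount >= rows.length) {
            // Only ever grow: copy the existing rows and fill in the new ones.
            final int oldLength = rows.length;
            final int[][] grown = Arrays.copyOf(rows, alleleCount + 1);
            for (int i = oldLength; i < grown.length; i++) {
                grown[i] = new int[i + 1];
            }
            rows = grown;
        }
        // Still inside the lock, so no other thread can have replaced rows.
        return rows[alleleCount];
    }

    /** Current number of rows in the table. */
    synchronized int size() {
        return rows.length;
    }
}
```

In this sketch, asking for alleleCount 4 and then 3 leaves the table at 5 rows. The unsafe pattern would be to check the size, resize to exactly the requested size, and read the row as three unsynchronized steps.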
Also here's the stacktrace I got:
I'm running into intermittent issues when running HaplotypeCallerSpark with GATK 4.0.3.0 and was hoping to generate ideas for debugging further. The underlying error is an index error when calculating likelihoods:
I've been unable to generate a reproducible test case. Re-running on the same machine (Amazon m4.4xlarge instances with 16 cores and 64 GB of memory) succeeds. I've seen the error on two different datasets, but it happens infrequently: I've also completed hundreds of runs using the same setup without any exceptions.
The only other thing I spotted in the traceback is block issues with the RDDs, but I'm not sure whether these are a symptom of the failure or a cause:
Here's the full traceback of the failure:
and in case I am missing anything in how I'm calling HaplotypeCallerSpark, here is the full command line we're using:
Thanks so much for any clues about how to debug further or avoid the issue.