Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FilterMutectCalls: errorRate must be good probability but got NaN #5821

Closed
igordot opened this issue Mar 20, 2019 · 24 comments
Closed

FilterMutectCalls: errorRate must be good probability but got NaN #5821

igordot opened this issue Mar 20, 2019 · 24 comments

Comments

@igordot
Copy link

igordot commented Mar 20, 2019

I ran GATK 4.1.0.0 Mutect2 on a small (~1Mb) targeted panel. I am using a normal control that is not the same individual (basically to exclude technical artifacts), so I do expect to see more variants than with a proper matched normal. I was getting around 100-300 variants per sample with GATK 4.0.6.0. I am still roughly in the same range for some samples GATK 4.1.0.0, but I am getting 0 for some.

The problem seems to be at the FilterMutectCalls stage where I am seeing the following error:

[March 19, 2019 10:43:17 PM EDT] org.broadinstitute.hellbender.tools.walkers.mutect.FilterMutectCalls done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=8851030016
java.lang.IllegalArgumentException: errorRate must be good probability but got NaN
    at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:730)
    at org.broadinstitute.hellbender.utils.QualityUtils.errorProbToQual(QualityUtils.java:227)
    at org.broadinstitute.hellbender.utils.QualityUtils.errorProbToQual(QualityUtils.java:211)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2FilteringEngine.applyContaminationFilter(Mutect2FilteringEngine.java:79)
    at org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2FilteringEngine.calculateFilters(Mutect2FilteringEngine.java:518)
    at org.broadinstitute.hellbender.tools.walkers.mutect.FilterMutectCalls.firstPassApply(FilterMutectCalls.java:130)
    at org.broadinstitute.hellbender.engine.TwoPassVariantWalker.lambda$traverseVariants$0(TwoPassVariantWalker.java:76)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
    at org.broadinstitute.hellbender.engine.TwoPassVariantWalker.traverseVariants(TwoPassVariantWalker.java:74)
    at org.broadinstitute.hellbender.engine.TwoPassVariantWalker.traverse(TwoPassVariantWalker.java:27)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
    at org.broadinstitute.hellbender.Main.main(Main.java:291)

I see there was a previous similar issue (#5553), but that one was apparently resolved. Curiously, I used several versions of GATK 4 and it never came up for me before.

@davidbenjamin
Copy link
Contributor

@igordot I believe this is fixed in master and will be in the 4.1.1.0 release this Thursday.

@igordot
Copy link
Author

igordot commented Mar 29, 2019

I tried 4.1.1.0. Although that error is fixed, now I am getting a new one:

java.lang.IllegalArgumentException: log10 p: Values must be non-infinite and non-NAN
	at org.broadinstitute.hellbender.utils.MathUtils.log10SumLog10(MathUtils.java:1023)
	at org.broadinstitute.hellbender.utils.MathUtils.log10SumLog10(MathUtils.java:995)
	at org.broadinstitute.hellbender.utils.MathUtils.log10SumLog10(MathUtils.java:999)
	at org.broadinstitute.hellbender.utils.MathUtils.normalizeLog10(MathUtils.java:1098)
	at org.broadinstitute.hellbender.utils.MathUtils.normalizeFromLog10ToLinearSpace(MathUtils.java:1074)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.posteriorProbabilityOfError(Mutect2FilteringEngine.java:91)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.posteriorProbabilityOfError(Mutect2FilteringEngine.java:76)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ContaminationFilter.calculateErrorProbability(ContaminationFilter.java:60)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2VariantFilter.errorProbability(Mutect2VariantFilter.java:15)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.lambda$new$1(ErrorProbabilities.java:19)
	at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)
	at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.<init>(ErrorProbabilities.java:19)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.accumulateData(Mutect2FilteringEngine.java:136)
	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.nthPassApply(FilterMutectCalls.java:140)
	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverse$0(MultiplePassVariantWalker.java:31)
	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverseVariants$1(MultiplePassVariantWalker.java:68)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.Iterator.forEachRemaining(Iterator.java:116)
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverseVariants(MultiplePassVariantWalker.java:66)
	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:31)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:984)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
	at org.broadinstitute.hellbender.Main.main(Main.java:291)

@davidbenjamin
Copy link
Contributor

@igordot Could you provide your command line? Also, could you check whether the error persists when you use a panel of normals, or nothing at all, instead of the unmatched normal?

@davidbenjamin
Copy link
Contributor

Also, is this hg38?

@igordot
Copy link
Author

igordot commented Mar 29, 2019

This is the command:

/path/gatk-4.1.1.0/gatk --java-options "-Xms8G -Xms8G" FilterMutectCalls \
--verbosity WARNING \
--reference /path/ref/hg38/genome.fa \
--unique-alt-read-count 5 \
--variant /path/SERACARE-NS18-16-HAPMAP-NS18-17.unfiltered.vcf \
--contamination-table /path/SERACARE-NS18-16-HAPMAP-NS18-17.contamination.txt \
--output /path/SERACARE-NS18-16-HAPMAP-NS18-17.original.vcf.gz  

And yes, this is hg38. Do you think it's related to that?

@davidbenjamin
Copy link
Contributor

A few recent bugs, which are all entirely my fault, came about because the liftover of gnomAD to hg38 (there is no official hg38 gnomAD yet) exposed some new edge cases, such as AF=. and AF=0, that caused errors. I suspect that you are seeing one that we hadn't found yet. Unfortunately, we do not have nearly validation on hg38.

Here's what I will do: 1) correct our hg38 gnomAD to fix liftover artifacts and put this new resource in the GATK bucket. 2) Create a Firecloud workspace with a few hg38 samples in order to reproduce the error and to make sure future changes don't create new problems 3) try to fix the error because even if 1) works it's sloppy to rely on the fact that gnomAD won't have these edge cases. I hope 1) succeeds because it will be available immediately without waiting for the next release.

@igordot
Copy link
Author

igordot commented Mar 31, 2019

Thank you for looking into it. I am curious if that is really the problem. If the reference files were causing problems, shouldn't that impact all samples? I am seeing this error with some, but not most of the samples. Even using a different matched control sample with the same tumor sample will cause or fix the error.

@davidbenjamin
Copy link
Contributor

What we saw recently wasn't the reference itself, but rather our AF-only gnomAD resource lifted-over to the hg38 reference. The error only came up for sites that reached genotyping, which depends on the specific tumor sample as well as the lack of evidence in the normal. That's why it only appeared in some tumor-normal combinations.

That being said, it might be something else. If you are able to share the unfiltered vcf file and the vcf.stats file it would be the most direct way to debug.

@davidbenjamin
Copy link
Contributor

@igordot I have not yet succeeded in reproducing the error with the few hg38 samples I have tested (2) and nothing obvious showed up in various grep regexes of our hg38 gnomAD (1). I am starting to think that we actually have solved all the hg38 issues and this is unrelated to my initial guess.

If you can share your unfiltered vcf input it would be very helpful, but if that's not possible could you post the contents of your contamination-table input? I have a hunch that the small size of the panel is causing CalculateContamination to give an unreliable result. For targeted panels we recommend running Mutect2 without CalculateContamination. If you're running from Terra/Firecloud or from the wdl, this means leaving the optional variants_for_contamination input empty.

@davidbenjamin
Copy link
Contributor

By the way, I believe the same issue has come up recently on the GATK forum: https://gatkforums.broadinstitute.org/gatk/discussion/23685/issues-on-filtermutectcalls-log10-probability-must-be-0-or-less#latest.

@igordot If your contamination table shows a contamination of 1 or greater don't bother sending the vcf -- contamination would definitely be the issue in that case.

@igordot
Copy link
Author

igordot commented Apr 2, 2019

You are right. The error seems to be happening when contamination is 1 or NaN. That is probably due to a non-matched normal. The same panel with a true matched normal gives much more reasonable results (<0.01), so I don't know if the panel size is entirely at fault here.

Should FilterMutectCalls just ignore contamination if the error is sufficiently high?

@davidbenjamin
Copy link
Contributor

davidbenjamin commented Apr 3, 2019

@igordot Thanks very much for following up on this. Just to clarify, are you saying that the 1/NaN contamination occurs when you run GetPileupSummaries on both the tumor and a non-matched normal and then run CalculateContamination using both of these outputs? If so, that will definitely cause problems. Running CalculateContamination on just the tumor output from GetPileupSummaries should work much better. We went to great lengths to make CalculateContamination work in tumor-only mode, although I would still be wary if your target territory is less than a few megabases.

Also, I would not recommend using a non-matched normal anywhere in Mutect2. Unless your panel has unique technical artifacts that don't resemble those in an exome I would recommend you run tumor-only mode but use use of the publicly-available panels of normals in the GATK bucket. A worse alternative but in my opinion still better than a non-matched normal would be to run Mutect2 in tumor-only mode separately on the tumor samples and the unmatched normal, then use the unmatched normal vcf as a blacklist for the tumor calls with SelectVariants.

@igordot
Copy link
Author

igordot commented Apr 3, 2019

Thank you for the suggestions.

I realize using a non-matched normal is not ideal. I was using it to at least filter out any technical artifacts. It seems to work well for that.

Where do I find the panel of normals that you mentioned?

@davidbenjamin
Copy link
Contributor

@igordot A few panel can be found in the GATK best practices bucket, for example: gs://gatk-best-practices/somatic-b37/Mutect2-exome-panel.vcf

@igordot
Copy link
Author

igordot commented Apr 10, 2019

How do I access that? I thought that the GATK resources were located here: https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0/

Is there a reason this is not in the GATK resource bundle?

@davidbenjamin
Copy link
Contributor

They're public, so just install Google Cloud gsutil and copy with gsutil cp gs://gatk-best-practices/somatic-b37/Mutect2-exome-panel.vcf <local path to copy to>. Or, if you're running on the cloud, you don't even need to download anything, just run Mutect2 with the cloud paths eg

gatk Mutect2 -R ref.fasta -I tumor.bam -pon gs://gatk-best-practices/somatic-b37/Mutect2-exome-panel.vcf -O calls.vcf

If you install gsutil this works when running locally as well, but for speed I would recommend downloading the pon.

Is there a reason this is not in the GATK resource bundle?

Not that I can think of.

@igordot
Copy link
Author

igordot commented Apr 11, 2019

Is there also a hg38 version?

@davidbenjamin
Copy link
Contributor

There's gs://gatk-best-practices/somatic-hg38/1000g_pon.hg38.vcf.gz for hg38.

@MikeWLloyd
Copy link

I think this issue still persists:
The Genome Analysis Toolkit (GATK) v4.1.3.0

2019-10-29T18:18:04.000640760Z java.lang.IllegalArgumentException: log10 p: Values must be non-infinite and non-NAN
2019-10-29T18:18:04.001018863Z 	at org.broadinstitute.hellbender.utils.NaturalLogUtils.logSumExp(NaturalLogUtils.java:84)
2019-10-29T18:18:04.001194549Z 	at org.broadinstitute.hellbender.utils.NaturalLogUtils.normalizeLog(NaturalLogUtils.java:51)
2019-10-29T18:18:04.001367357Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.clusterProbabilities(SomaticClusteringModel.java:203)
2019-10-29T18:18:04.001518160Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.probabilityOfSequencingError(SomaticClusteringModel.java:96)
2019-10-29T18:18:04.001673083Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.TumorEvidenceFilter.calculateErrorProbability(TumorEvidenceFilter.java:27)
2019-10-29T18:18:04.001846904Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2VariantFilter.errorProbability(Mutect2VariantFilter.java:15)
2019-10-29T18:18:04.002024760Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.lambda$new$1(ErrorProbabilities.java:19)
2019-10-29T18:18:04.002140012Z 	at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)
2019-10-29T18:18:04.002232542Z 	at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
2019-10-29T18:18:04.002242727Z 	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
2019-10-29T18:18:04.002292461Z 	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
2019-10-29T18:18:04.002301667Z 	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
2019-10-29T18:18:04.002307019Z 	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
2019-10-29T18:18:04.002311722Z 	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2019-10-29T18:18:04.002316449Z 	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
2019-10-29T18:18:04.002321526Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.<init>(ErrorProbabilities.java:19)
2019-10-29T18:18:04.002358113Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.accumulateData(Mutect2FilteringEngine.java:141)
2019-10-29T18:18:04.002377342Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.nthPassApply(FilterMutectCalls.java:146)
2019-10-29T18:18:04.002383406Z 	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverse$0(MultiplePassVariantWalker.java:40)
2019-10-29T18:18:04.002431769Z 	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverseVariants$1(MultiplePassVariantWalker.java:77)
2019-10-29T18:18:04.002441351Z 	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
2019-10-29T18:18:04.002446409Z 	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
2019-10-29T18:18:04.002493533Z 	at java.util.Iterator.forEachRemaining(Iterator.java:116)
2019-10-29T18:18:04.002503311Z 	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
2019-10-29T18:18:04.002508016Z 	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
2019-10-29T18:18:04.002512520Z 	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
2019-10-29T18:18:04.002574562Z 	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
2019-10-29T18:18:04.002625341Z 	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
2019-10-29T18:18:04.002635077Z 	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2019-10-29T18:18:04.002683298Z 	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
2019-10-29T18:18:04.002692496Z 	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverseVariants(MultiplePassVariantWalker.java:75)
2019-10-29T18:18:04.002697751Z 	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:40)
2019-10-29T18:18:04.002731707Z 	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
2019-10-29T18:18:04.002740306Z 	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
2019-10-29T18:18:04.002745164Z 	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
2019-10-29T18:18:04.002777218Z 	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
2019-10-29T18:18:04.002785268Z 	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
2019-10-29T18:18:04.002855927Z 	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
2019-10-29T18:18:04.002867030Z 	at org.broadinstitute.hellbender.Main.main(Main.java:291)

I am using ExAC lifted to hg38 as a germline resource in mutect2 with only a tumor sample, and getting the above error in filtermutectcalls. I recently updated to v4.1.3.0 to have the latest changes to mutect2. I was not having this issue with v4.0.5.1.

Here is extracted information from the VCF which caused the issue.

DP=1;ECNT=2;FS=0.000;MBQ=0,20;MFRL=0,91;MMQ=60,46;MPOS=6;MQ=46.00;POPAF=5.08;TLOD=4.20	GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB	0|1:0,1:0.667:1:0,1:0,0:0|1:11155815_C_T:11155815:0,0,1,0
DP=1;ECNT=2;FS=0.000;MBQ=0,20;MFRL=0,91;MMQ=60,46;MPOS=16;MQ=46.00;POPAF=5.08;TLOD=4.20	GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB	0|1:0,1:0.667:1:0,1:0,0:0|1:11155815_C_T:11155815:0,0,1,0
DP=1;ECNT=2;FS=0.000;MBQ=0,34;MFRL=0,272;MMQ=60,30;MPOS=25;MQ=30.00;POPAF=4.13;TLOD=4.20	GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB	0|1:0,1:0.667:1:0,1:0,0:0|1:11350899_C_T:11350899:0,0,1,0
DP=1;ECNT=2;FS=0.000;MBQ=0,32;MFRL=0,272;MMQ=60,30;MPOS=15;MQ=30.00;POPAF=4.23;TLOD=4.20	GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB	0|1:0,1:0.667:1:0,1:0,0:0|1:11350899_C_T:11350899:0,0,1,0

@davidbenjamin
Copy link
Contributor

@MikeWLloyd can you try version 4.1.4.0?

@MikeWLloyd
Copy link

Yes, I will try. I didn't realize 4.1.4.0 was available.

@davidbenjamin
Copy link
Contributor

No worries; fingers crossed.

@MikeWLloyd
Copy link

MikeWLloyd commented Oct 30, 2019

I loaded the docker repo GATK v4.1.4.0 and had the same (or similar) error result.

2019-10-30T13:35:51.791637449Z java.lang.IllegalArgumentException: log10 p: Values must be non-infinite and non-NAN
2019-10-30T13:35:51.792001654Z 	at org.broadinstitute.hellbender.utils.NaturalLogUtils.logSumExp(NaturalLogUtils.java:84)
2019-10-30T13:35:51.792175325Z 	at org.broadinstitute.hellbender.utils.NaturalLogUtils.normalizeLog(NaturalLogUtils.java:51)
2019-10-30T13:35:51.792358868Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.clusterProbabilities(SomaticClusteringModel.java:203)
2019-10-30T13:35:51.792559803Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.clustering.SomaticClusteringModel.probabilityOfSequencingError(SomaticClusteringModel.java:96)
2019-10-30T13:35:51.792736667Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.TumorEvidenceFilter.calculateErrorProbability(TumorEvidenceFilter.java:27)
2019-10-30T13:35:51.792905235Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2VariantFilter.errorProbability(Mutect2VariantFilter.java:15)
2019-10-30T13:35:51.793072365Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.lambda$new$1(ErrorProbabilities.java:19)
2019-10-30T13:35:51.793261944Z 	at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)
2019-10-30T13:35:51.793456807Z 	at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
2019-10-30T13:35:51.793619935Z 	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
2019-10-30T13:35:51.793810301Z 	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
2019-10-30T13:35:51.794006885Z 	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
2019-10-30T13:35:51.794191116Z 	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
2019-10-30T13:35:51.794367593Z 	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2019-10-30T13:35:51.794548129Z 	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
2019-10-30T13:35:51.794722501Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.<init>(ErrorProbabilities.java:19)
2019-10-30T13:35:51.794896154Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.accumulateData(Mutect2FilteringEngine.java:141)
2019-10-30T13:35:51.795082090Z 	at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.nthPassApply(FilterMutectCalls.java:146)
2019-10-30T13:35:51.795253632Z 	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverse$0(MultiplePassVariantWalker.java:40)
2019-10-30T13:35:51.795448274Z 	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverseVariants$1(MultiplePassVariantWalker.java:77)
2019-10-30T13:35:51.795607447Z 	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
2019-10-30T13:35:51.795775473Z 	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
2019-10-30T13:35:51.795944490Z 	at java.util.Iterator.forEachRemaining(Iterator.java:116)
2019-10-30T13:35:51.796108757Z 	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
2019-10-30T13:35:51.796277399Z 	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
2019-10-30T13:35:51.796441683Z 	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
2019-10-30T13:35:51.796940319Z 	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
2019-10-30T13:35:51.797119562Z 	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
2019-10-30T13:35:51.797275911Z 	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2019-10-30T13:35:51.797439525Z 	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
2019-10-30T13:35:51.797567816Z 	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverseVariants(MultiplePassVariantWalker.java:75)
2019-10-30T13:35:51.797740910Z 	at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:40)
2019-10-30T13:35:51.797896360Z 	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
2019-10-30T13:35:51.798060735Z 	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
2019-10-30T13:35:51.798212290Z 	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
2019-10-30T13:35:51.798375318Z 	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
2019-10-30T13:35:51.798498807Z 	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
2019-10-30T13:35:51.798643830Z 	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
2019-10-30T13:35:51.798803653Z 	at org.broadinstitute.hellbender.Main.main(Main.java:292)
2019-10-30T13:35:51.902177042Z Using GATK jar /gatk/gatk-package-4.1.4.0-local.jar

Contents of *.vcf.stats

statistic	value
callable	0.0

Below is output from: grep -v '#' *.vcf | cut -f 8,9,10,11 | head -n 100. The VCF that failed in this case only has 3 calls. I am running Mutect2 in parallel by chromosome defined by interval files.

DP=1;ECNT=1;FS=0.000;MBQ=0,36;MFRL=0,143;MMQ=60,36;MPOS=24;MQ=36.00;POPAF=5.08;TLOD=3.78	GT:AD:AF:DP:F1R2:F2R1:SB	0/1:0,1:0.667:1:0,0:0,1:0,0,0,1
DP=4;ECNT=2;FS=0.000;MBQ=0,34;MFRL=0,275;MMQ=60,24;MPOS=7;MQ=24.78;POPAF=5.08;TLOD=17.30	GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB	0|1:0,4:0.833:4:0,1:0,3:0|1:11155815_C_T:11155815:0,0,2,2
DP=4;ECNT=2;FS=0.000;MBQ=0,33;MFRL=0,275;MMQ=60,24;MPOS=17;MQ=24.78;POPAF=5.08;TLOD=17.30	GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB	0|1:0,4:0.833:4:0,1:0,3:0|1:11155815_C_T:11155815:0,0,2,2

@davidbenjamin
Copy link
Contributor

davidbenjamin commented Oct 30, 2019

@MikeWLloyd It seems like you are running Mutect2 in parallel, which is totally fine, and then running FilterMutectCalls in parallel as well, which is not how the tool works. You need to run MergeMutectStats on the .vcf.stats files and run MergeVcfs on the scattered .vcfs, and then run FilterMutectCalls with the merged files as inputs. This is implemented in the mutect2 WDL: https://github.com/broadinstitute/gatk/blob/master/scripts/mutect2_wdl/mutect2.wdl and the featured workspace in Terra: https://app.terra.bio/#workspaces/help-gatk/Somatic-SNVs-Indels-GATK4.

The error seems to occur because the .stats file for the failing interval shows no callable depth. That is, every locus in the interval had a depth less than 10. Once you merge your files I would hope that somewhere in the genome there is a site with greater depth. (If not, you can adjust the threshold with the --callable-depth argument)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants