Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FilterMutectCalls in 4.2.0.0 has an error with AS_UNIQ_ALT_READ_COUNT annotation format #7298

Open
GATKSupportTeam opened this issue Jun 8, 2021 · 3 comments
Assignees

Comments

@GATKSupportTeam
Copy link
Collaborator

GATKSupportTeam commented Jun 8, 2021

It looks like this is a bug with 4.2.0.0 because the same Mutect2 output has no issues with FilterMutectCalls 4.1.6.0.

This request was created from a contribution made by Qihan Long on June 04, 2021 03:21 UTC.

Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/360078236332-The-output-of-Mutect2-cannot-be-accepted-by-FilterMutectCalls-in-GATK-4-2-0-0

--

If you are seeing an error, please provide(REQUIRED) :

a) GATK version used: 4.2.0.0

b) Exact command used: 

gatk FilterMutectCalls \

-R /public1/data/resources/ref_genome/GRCh38/GRCh38.d1.vd1.fa \

-V somatic_mutation/Mutect2/test.vcf.gz \

-O somatic_mutation/FilterMutectCalls/test.vcf.gz

c) Entire error log:

I used the "--enable-all-annotations" option within Mutect2 to get a vcf file with abundant information. However, the following FilterMutectCalls step seemed to be intolerant of some information within previous step's vcf file record.

The intolerated record within vcf listed below: 

chr1 6197724 . C CT,CTT,CTTT . . AC=1,1,1;AF=0.167,0.167,0.167;AN=6;AS_BaseQRankSum=-6.431;AS_MQ=60.00,60.00,60.00;AS_MQRankSum=0.000;AS_ReadPosRankSum=5.751;AS_SB_TABLE=42,880|3,164|3,32|0,14;AS_UNIQ_ALT_READ_COUNT=167|35|14;BQHIST=5,1,0,0,1,11,2,0,0,0,14,2,0,0,1,15,1,0,0,0,16,1,0,0,0,17,0,2,0,0,18,2,0,0,1,19,6,0,1,0,20,25,0,2,2,21,13,0,1,2,22,20,0,3,3,23,2,1,2,1,24,6,0,2,0,25,21,1,4,7,26,33,0,5,6,27,18,0,0,7,28,29,0,0,4,29,26,2,4,8,30,161,4,5,51,31,263,2,3,51,32,129,2,3,22,33,41,0,0,0,34,15,0,0,0,35,20,0,0,0,36,19,0,0,0,37,12,0,0,0,38,1,0,0,0,39,9,0,0,0,41,18,0,0,0,44,26,0,0,0;BaseQRankSum=-6.431;ClippingRankSum=-7.714;DP=1323;ECNT=1;FS=0.000;LikelihoodRankSum=-7.886;MBQ=31,30,26,30;MFRL=6590,6585,4819,6586;MMQ=60,60,60,60;MPOS=16,15,7;MQ=59.98;MQ0=0;MQRankSum=0.000;NALOD=0.569,1.49,1.49;NCC=0;NCount=0;NLOD=27.80,30.51,30.97;OCM=0;POPAF=6.00,6.00,6.00;REF_BASES=GAACTTGCTTCTTTTTTTTGC;RPA=8,9,10,11;RU=T;ReadPosRankSum=5.751;SOR=1.152;STR;Samples=TCGA-NJ-A55R-01A-11R-A262-07;TLOD=284.47,51.82,3.50 GT:AD:AF:DP:F1R2:F2R1:SB 0/1/2/3:819,166,35,14:0.161,0.034,0.014:1034:365,76,17,4:440,87,17,8:16,803,6,209 0/0:103,1,0,0:0.017,8.250e-03,8.221e-03:104:50,1,0,0:52,0,0,0:26,77,0,1

The error log that FilterMutectCalls emited was listed below:

Using GATK jar /home/lqh/software/GATK-4.2.0.0/gatk-package-4.2.0.0-local.jar

Running:

java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/lqh/software/GATK-4.2.0.0/gatk-package-4.2.0.0-local.jar FilterMutectCalls -R /public1/data/resources/ref_genome/GRCh38/GRCh38.d1.vd1.fa -V somatic_mutation/Mutect2/test.vcf.gz -O somatic_mutation/FilterMutectCalls/test.vcf.gz

11:03:39.517 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/lqh/software/GATK-4.2.0.0/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so

Jun 04, 2021 11:03:49 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine

INFO: Failed to detect whether we are running on Google Compute Engine.

11:03:49.968 INFO FilterMutectCalls - ------------------------------------------------------------

11:03:49.969 INFO FilterMutectCalls - The Genome Analysis Toolkit (GATK) v4.2.0.0

11:03:49.969 INFO FilterMutectCalls - For support and documentation go to https://software.broadinstitute.org/gatk/

11:03:49.969 INFO FilterMutectCalls - Executing as lqh@master on Linux v5.6.14-1.el7.elrepo.x86_64 amd64

11:03:49.969 INFO FilterMutectCalls - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_152-b16

11:03:49.969 INFO FilterMutectCalls - Start Date/Time: June 4, 2021 11:03:39 AM CST

11:03:49.969 INFO FilterMutectCalls - ------------------------------------------------------------

11:03:49.969 INFO FilterMutectCalls - ------------------------------------------------------------

11:03:49.970 INFO FilterMutectCalls - HTSJDK Version: 2.24.0

11:03:49.970 INFO FilterMutectCalls - Picard Version: 2.25.0

11:03:49.970 INFO FilterMutectCalls - Built for Spark Version: 2.4.5

11:03:49.970 INFO FilterMutectCalls - HTSJDK Defaults.COMPRESSION_LEVEL : 2

11:03:49.970 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false

11:03:49.970 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true

11:03:49.970 INFO FilterMutectCalls - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false

11:03:49.970 INFO FilterMutectCalls - Deflater: IntelDeflater

11:03:49.971 INFO FilterMutectCalls - Inflater: IntelInflater

11:03:49.971 INFO FilterMutectCalls - GCS max retries/reopens: 20

11:03:49.971 INFO FilterMutectCalls - Requester pays: disabled

11:03:49.971 INFO FilterMutectCalls - Initializing engine

11:03:50.504 INFO FeatureManager - Using codec VCFCodec to read file file:///home/lqh/somatic_mutation/Mutect2/test.vcf.gz

11:03:50.696 INFO FilterMutectCalls - Done initializing engine

11:03:50.840 INFO ProgressMeter - Starting traversal

11:03:50.840 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute

11:03:50.841 INFO FilterMutectCalls - Starting pass 0 through the variants

11:03:51.014 INFO FilterMutectCalls - Shutting down engine

[June 4, 2021 11:03:51 AM CST] org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls done. Elapsed time: 0.19 minutes.

Runtime.totalMemory()=625999872

java.lang.NumberFormatException: For input string: "167|35|14"

at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)

at java.lang.Integer.parseInt(Integer.java:580)

at java.lang.Integer.valueOf(Integer.java:766)

at htsjdk.variant.variantcontext.CommonInfo.lambda$getAttributeAsIntList$1(CommonInfo.java:288)

at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)

at java.util.Collections$2.tryAdvance(Collections.java:4717)

at java.util.Collections$2.forEachRemaining(Collections.java:4725)

at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)

at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)

at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)

at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)

at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)

at htsjdk.variant.variantcontext.CommonInfo.getAttributeAsList(CommonInfo.java:274)

at htsjdk.variant.variantcontext.CommonInfo.getAttributeAsIntList(CommonInfo.java:282)

at htsjdk.variant.variantcontext.VariantContext.getAttributeAsIntList(VariantContext.java:827)

at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.DuplicatedAltReadFilter.areAllelesArtifacts(DuplicatedAltReadFilter.java:26)

at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.HardAlleleFilter.calculateErrorProbabilityForAlleles(HardAlleleFilter.java:16)

at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2AlleleFilter.errorProbabilities(Mutect2AlleleFilter.java:86)

at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.lambda$new$0(ErrorProbabilities.java:27)

at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)

at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)

at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)

at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)

at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)

at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)

at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)

at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)

at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.ErrorProbabilities.(ErrorProbabilities.java:25)

at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.Mutect2FilteringEngine.accumulateData(Mutect2FilteringEngine.java:138)

at org.broadinstitute.hellbender.tools.walkers.mutect.filtering.FilterMutectCalls.nthPassApply(FilterMutectCalls.java:154)

at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverse$0(MultiplePassVariantWalker.java:40)

at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.lambda$traverseVariants$1(MultiplePassVariantWalker.java:77)

at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)

at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)

at java.util.Iterator.forEachRemaining(Iterator.java:116)

at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)

at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)

at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)

at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)

at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)

at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)

at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)

at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverseVariants(MultiplePassVariantWalker.java:75)

at org.broadinstitute.hellbender.engine.MultiplePassVariantWalker.traverse(MultiplePassVariantWalker.java:40)

at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1058)

at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)

at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)

at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)

at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)

at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)

at org.broadinstitute.hellbender.Main.main(Main.java:289)

What's worth noted is that I tested FilterMutectCalls within GATK-4.1.6.0, it surly can accept the vcf record above and output normal records as below: 

chr1 6197724 . C CT,CTT,CTTT . multiallelic AC=1,1,1;AF=0.167,0.167,0.167;AN=6;AS_BaseQRankSum=-6.431;AS_MQ=60.00,60.00,60.00;AS_MQRankSum=0.000;AS_ReadPosRankSum=5.751;AS_SB_TABLE=42,880|3,164|3,32|0,14;AS_UNIQ_ALT_READ_COUNT=167|35|14;BQHIST=5,1,0,0,1,11,2,0,0,0,14,2,0,0,1,15,1,0,0,0,16,1,0,0,0,17,0,2,0,0,18,2,0,0,1,19,6,0,1,0,20,25,0,2,2,21,13,0,1,2,22,20,0,3,3,23,2,1,2,1,24,6,0,2,0,25,21,1,4,7,26,33,0,5,6,27,18,0,0,7,28,29,0,0,4,29,26,2,4,8,30,161,4,5,51,31,263,2,3,51,32,129,2,3,22,33,41,0,0,0,34,15,0,0,0,35,20,0,0,0,36,19,0,0,0,37,12,0,0,0,38,1,0,0,0,39,9,0,0,0,41,18,0,0,0,44,26,0,0,0;BaseQRankSum=-6.431;CONTQ=93;ClippingRankSum=-7.714;DP=1323;ECNT=1;FS=0.000;GERMQ=93;LikelihoodRankSum=-7.886;MBQ=31,30,26,30;MFRL=6590,6585,4819,6586;MMQ=60,60,60,60;MPOS=16,15,7;MQ=59.98;MQ0=0;MQRankSum=0.000;NALOD=0.569,1.49,1.49;NCC=0;NCount=0;NLOD=27.80,30.51,30.97;OCM=0;POPAF=6.00,6.00,6.00;REF_BASES=GAACTTGCTTCTTTTTTTTGC;RPA=8,9,10,11;RU=T;ReadPosRankSum=5.751;SEQQ=93;SOR=1.152;STR;STRANDQ=93;STRQ=21;Samples=TCGA-NJ-A55R-01A-11R-A262-07;TLOD=284.47,51.82,3.50 GT:AD:AF:DP:F1R2:F2R1:SB 0/1/2/3:819,166,35,14:0.161,0.034,0.014:1034:365,76,17,4:440,87,17,8:16,803,6,209 0/0:103,1,0,0:0.017,8.250e-03,8.221e-03:104:50,1,0,0:52,0,0,0:26,77,0,1

Much appreciated for your precious time to inspect my report, thanks!

(created from Zendesk ticket #163431)
gz#163431

gz#276553

(related to Zendesk ticket #276553)

@gbrandt6
Copy link
Contributor

The user found another issue with Mutect2 in GATK 4.2.0.0: the AS_MQ annotation contains a comma and cannot be parsed by other tools. I have recommended that the user work around this by leaving out this annotation but we should continue to look at a fix for these issues in GATK 4.2.0.0

@fleharty
Copy link
Contributor

@davidbenjamin Could you take a look into this, sounds like an easy fix.

@gbrandt6
Copy link
Contributor

There is another issue with how this annotation interacts with germline calling, specifically an issue coming up in GenomicsDBImport when users add all annotations to their GVCF from HaplotypeCaller. Forum post: https://gatk.broadinstitute.org/hc/en-us/community/posts/4659308628891-GenomicsDBImport-error-from-fields-in-AS-UNIQ-ALT-READ-COUNT

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants