Truvari bench finds no calls in vcf file. #137
Comments
Hello, Those warnings are just warnings, not errors. Truvari will throw errors if there is a problem with the VCF such as corrupted qual fields. Instead, the warnings are telling us that the
It's possible I'm forgetting something, so please check these things, and if you're still having issues I will need to see the VCF. Perhaps not the entire file, but just your
Have a great day,
Hello,
Sorry, of course I meant warnings, not errors.
Yes, there is an
I hope that the
Yes, Mason only creates entries with
The bed file is created by
Here is the start of my
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##INFO=<ID=SVLEN,Number=.,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=TARGETPOS,Number=1,Type=String,Description="Target position for duplications.">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=chr21,length=46709983>
##ALT=<ID=DEL,Description="Deletion">
##ALT=<ID=INS,Description="Insertion of novel sequence">
##ALT=<ID=DUP,Description="Duplication">
##ALT=<ID=INV,Description="Inversion">
##reference=hg38_chr21.fa
##source=mason_variator
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT simulated
chr21 5030334 sim_trans_0 N A[chr21:5031186[ 1 PASS SVTYPE=BND GT ./.
chr21 5030335 sim_trans_0 N [chr21:5047468[A 1 PASS SVTYPE=BND GT ./.
chr21 5031185 sim_trans_0 N C[chr21:5047469[ 1 PASS SVTYPE=BND GT ./.
chr21 5031186 sim_trans_0 N [chr21:5030334[A 1 PASS SVTYPE=BND GT ./.
chr21 5047468 sim_trans_0 N T[chr21:5030335[ 1 PASS SVTYPE=BND GT ./.
chr21 5047469 sim_trans_0 N [chr21:5031185[C 1 PASS SVTYPE=BND GT ./.
chr21 5060293 sim_dup_0 N <DUP> 1 PASS END=5073031;SVLEN=12738;SVTYPE=DUP;TARGETPOS=chr21:5087033 GT ./.
chr21 5105328 sim_small_indel_0 N CTTTCAC 1 PASS . GT ./.
chr21 5126727 sim_small_indel_1 N C 1 PASS . GT ./.
chr21 5127415 sim_sv_indel_0 N <DEL> 1 PASS SVLEN=-18647;SVTYPE=DEL;END=5146062 GT ./.
chr21 5152875 sim_small_indel_2 N ACT 1 PASS . GT ./.
chr21 5217965 sim_sv_indel_1 N <INS> 1 PASS SVLEN=5919;SVTYPE=INS;END=5217965 GT ./.
chr21 5223979 sim_sv_indel_2 N <DEL> 1 PASS SVLEN=-9486;SVTYPE=DEL;END=5233465 GT ./.
chr21 5252802 sim_trans_1 N T[chr21:5271385[ 1 PASS SVTYPE=BND GT ./.
chr21 5252803 sim_trans_1 N [chr21:5288484[T 1 PASS SVTYPE=BND GT ./.
chr21 5256763 sim_small_indel_3 N C 1 PASS . GT ./.
chr21 5271384 sim_trans_1 N A[chr21:5288485[ 1 PASS SVTYPE=BND GT ./.
chr21 5271385 sim_trans_1 N [chr21:5252802[A 1 PASS SVTYPE=BND GT ./.
chr21 5288484 sim_trans_1 N T[chr21:5252803[ 1 PASS SVTYPE=BND GT ./.
chr21 5288485 sim_trans_1 N [chr21:5271384[T 1 PASS SVTYPE=BND GT ./.
chr21 5315180 sim_small_indel_4 N AGACAGAGAGGCTTGGA 1 PASS . GT ./.
chr21 5316839 sim_inv_0 N <INV> 1 PASS END=5335052;SVLEN=18213;SVTYPE=INV GT ./.
chr21 5337350 sim_dup_1 N <DUP> 1 PASS END=5350120;SVLEN=12770;SVTYPE=DUP;TARGETPOS=chr21:5366707 GT ./.
...
Thank you for your quick and helpful answer!
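For anyone checking a simulated VCF like the one above before benchmarking, a quick tally of what the INFO column actually declares can help. The following is only a minimal sketch: it uses a handful of records copied from the excerpt (split on whitespace), and the helper names `parse_info` and `summarize` are made up for this illustration and are not part of Truvari or Mason.

```python
from collections import Counter

# A few records copied from the VCF excerpt above (whitespace-separated here).
RECORDS = """\
chr21 5030334 sim_trans_0 N A[chr21:5031186[ 1 PASS SVTYPE=BND GT ./.
chr21 5060293 sim_dup_0 N <DUP> 1 PASS END=5073031;SVLEN=12738;SVTYPE=DUP;TARGETPOS=chr21:5087033 GT ./.
chr21 5105328 sim_small_indel_0 N CTTTCAC 1 PASS . GT ./.
chr21 5127415 sim_sv_indel_0 N <DEL> 1 PASS SVLEN=-18647;SVTYPE=DEL;END=5146062 GT ./.
chr21 5217965 sim_sv_indel_1 N <INS> 1 PASS SVLEN=5919;SVTYPE=INS;END=5217965 GT ./.
"""


def parse_info(info):
    """Turn an INFO string such as 'SVLEN=-5;SVTYPE=DEL' into a dict of strings."""
    if info == ".":
        return {}
    parsed = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        parsed[key] = value
    return parsed


def summarize(text):
    """Count records per SVTYPE; records without SVTYPE are counted separately."""
    counts = Counter()
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        info = parse_info(fields[7])  # INFO is the 8th VCF column
        counts[info.get("SVTYPE", "no SVTYPE")] += 1
    return counts


summary = summarize(RECORDS)
print(dict(summary))
```

A tally like this only shows which record types are present; whether a given comparison tool considers each type is a separate question.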
I believe the problem might be with the
The reason I'm suspicious of the bed is that the example entries you provided above worked just fine (see below).
git checkout tags/v3.0.0
python3 -m pip install .
truvari bench -b ticket137.vcf.gz -c ticket137.vcf.gz -o test3.0 -f reference/grch38/GRCh38_1kg_mainchrs.fa
cat test3.0/summary.txt
{
"TP-base": 6,
"TP-call": 6,
"FP": 0,
"FN": 0,
"precision": 1.0,
"recall": 1.0,
"f1": 1.0,
"base cnt": 6,
"call cnt": 6,
"TP-call_TP-gt": 6,
"TP-call_FP-gt": 0,
"TP-base_TP-gt": 6,
"TP-base_FP-gt": 0,
"gt_concordance": 1.0
}
It was indeed the bedfile. Thank you very much for your help. Now I have learned a lot more about truvari.
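As a sanity check on the summary above, the precision, recall, and f1 values can be recomputed from the raw counts. This sketch assumes the standard definitions (precision from TP-call and FP, recall from TP-base and FN); that these match how the summary is computed internally is an assumption, and returning 0.0 for an empty denominator is a choice made only for this example. The function name `bench_metrics` is hypothetical.

```python
def bench_metrics(tp_base, tp_call, fp, fn):
    """Recompute precision, recall and f1 from benchmark counts.

    Assumes the standard definitions; 0.0 is returned when a
    denominator is zero (a choice made for this sketch only).
    """
    call_total = tp_call + fp
    base_total = tp_base + fn
    precision = tp_call / call_total if call_total else 0.0
    recall = tp_base / base_total if base_total else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Counts taken from the summary.txt shown above.
metrics = bench_metrics(tp_base=6, tp_call=6, fp=0, fn=0)
print(metrics)
```

With the counts from the summary (6 true positives on each side, no FP or FN), all three values come out as 1.0, which agrees with the reported output.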
Thanks for reporting this. As a note: I looked into it more and I believe you've actually found an error. Apparently pyintervaltree isn't behaving as I was expecting, and therefore your bed files didn't contain the SVs. For example, say we have coordinates from BedOps
>>> from intervaltree import IntervalTree
>>> x = IntervalTree()
>>> x.addi(66234, 66235) # convert2bed coordinates of first entry
>>> x.overlaps(66234) # pysam.VariantRecord.start
True
>>> x.overlaps(66235) # pysam.VariantRecord.stop
False
Position 66235 not overlapping position 66235 is not desired behavior. I'll close this ticket after I get the change checked in.
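The behaviour in the session above follows from half-open interval semantics, where the end coordinate is excluded. The following pure-Python sketch contrasts a half-open check with a closed one using the same coordinates; the function names are invented for this illustration, and it is not meant to represent the change that was actually checked in.

```python
def overlaps_half_open(begin, end, point):
    """Point query against [begin, end): the end coordinate is excluded."""
    return begin <= point < end


def overlaps_closed(begin, end, point):
    """Point query against [begin, end]: the end coordinate is included."""
    return begin <= point <= end


begin, end = 66234, 66235  # coordinates from the example above

# Half-open: the start is inside, the stop is not.
half_open = (overlaps_half_open(begin, end, 66234),
             overlaps_half_open(begin, end, 66235))

# Closed: both start and stop are inside.
closed = (overlaps_closed(begin, end, 66234),
          overlaps_closed(begin, end, 66235))

# Storing end + 1 in a half-open structure gives the same answers
# as a closed interval for integer coordinates.
padded = (overlaps_half_open(begin, end + 1, 66234),
          overlaps_half_open(begin, end + 1, 66235))

print(half_open, closed, padded)
```

One possible workaround, shown in the last check, is to pad the stored end by one so that a half-open structure answers as if the interval were closed. Whether that is appropriate depends on which coordinate convention the input regions use.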
Version :
Truvari v3.0.0
Describe the bug :
Hello.
I am currently simulating a vcf file with the help of Mason, and a fastq file from it. My tool calculates its own vcf from the fastq file, and with truvari bench I compare the truth (the vcf from Mason) with my own.
My problem: Mason's vcf does not seem to meet truvari's requirements, as truvari does not find any calls:
However, I don't get any error message and therefore wanted to ask which header information and which column information truvari accesses.
To Reproduce :
Here is my simulation script: https://github.com/Irallia/iGenVar/blob/TEST/benchmarks/dataset_comparison_simulate_data/test/benchmark/simulation/mason_simulation.sh
I can also send you my created vcf file for running truvari bench with it.
Expected behavior :
An error message like:
Row 80 in your base vcf file was not accepted because the QUAL field was corrupted.
would be super helpful.
Example Data :
As the files are huge, please ask and I will give you access.
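To illustrate the kind of row-level message requested above, here is a small standalone pre-check that could be run on a VCF before benchmarking. It is only a sketch: the function name `check_qual` and the message wording are hypothetical, it looks only at the column count and the QUAL column, and it is not how truvari itself validates input.

```python
def check_qual(lines):
    """Return a list of messages for data rows whose QUAL is not numeric.

    Rows are numbered from 1 and include header lines, so the number
    matches the line number in the file.
    """
    problems = []
    for row, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            problems.append(
                f"Row {row}: expected at least 8 columns, found {len(fields)}"
            )
            continue
        qual = fields[5]  # QUAL is the 6th VCF column
        if qual == ".":
            continue  # '.' marks a missing value
        try:
            float(qual)
        except ValueError:
            problems.append(f"Row {row}: QUAL field {qual!r} is not numeric")
    return problems


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr21\t5127415\tsim_sv_indel_0\tN\t<DEL>\t1\tPASS\tSVTYPE=DEL",
    "chr21\t5217965\tsim_sv_indel_1\tN\t<INS>\tabc\tPASS\tSVTYPE=INS",
]
messages = check_qual(example)
print(messages)
```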