Merge STR: More than one value found for END #196

Open
bharathramh opened this issue Dec 3, 2023 · 1 comment

Comments


bharathramh commented Dec 3, 2023

I'm trying to merge around 3 GangSTR output files into one merged file. While doing so, I'm hitting this error:

More than one value found for END

It ran for about 89k lines and stopped after this line:

(bharath) []$ tail -1 test.vcf.vcf
chr10   17809113        .       CCTCCCCTCCCCTCCCCTCCCCTCC       .       .       .       END=17809163;PERIOD=5;RU=cctcc;REF=5.0;STUTTERUP=0.05;STUTTERDOWN=0.05;STUTTERP=0.9;EXPTHRESH=-1 GT:DP:Q:REPCN:REPCI:RC:ML:INS:STDERR:ENCLREADS:FLNKREADS:QEXP   0/0:49:1.0:5,5:5-5,5-5:29,20,0,0:286.292:419.097,96.3636:0.0,0.0:5,29:NULL:-1.0,-1.0,-1.0        0/0:28:0.999683:5,5:5-5,5-5:16,12,0,0:167.282:416.998,95.8552:0.0,0.0:5,16:NULL:-1.0,-1.0,-1.0   0/0:42:1.0:5,5:5-5,5-5:25,17,0,0:251.281:415.347,94.2128:0.0,0.0:5,25:NULL:-1.0,-1.0,-1.0

I couldn't figure out the error from the VCF files. The lines below are the next records in each file; I found that multiple END values are given for the same location. How do I resolve this issue?

(bharath) []$ zcat *.vcf.gz | grep -w "17813632"
chr10   17813632        .       TATA    .       .       .       END=17813701;EXPTHRESH=-1;GRID=1,5;PERIOD=2;REF=2;RU=ta;STUTTERDOWN=0.05;STUTTERP=0.9;STUTTERUP=0.05     GT:DP:Q:REPCN:REPCI:RC:ENCLREADS:FLNKREADS:ML:INS:STDERR:QEXP   0/0:27:0.970717:2,2:2-2,2-2:8,19,0,0:2,8:NULL:187.217:415.347,94.2128:0,0:-1,-1,-1
chr10   17813632        .       TATA    .       .       .       END=17813703;EXPTHRESH=-1;GRID=1,5;PERIOD=2;REF=2;RU=ta;STUTTERDOWN=0.05;STUTTERP=0.9;STUTTERUP=0.05     GT:DP:Q:REPCN:REPCI:RC:ENCLREADS:FLNKREADS:ML:INS:STDERR:QEXP   0/0:27:0.962654:2,2:2-2,2-2:8,19,0,0:2,8:NULL:187.622:415.347,94.2128:0,0:-1,-1,-1
chr10   17813632        .       TATA    .       .       .       END=17813703;EXPTHRESH=-1;GRID=1,5;PERIOD=2;REF=2;RU=ta;STUTTERDOWN=0.05;STUTTERP=0.9;STUTTERUP=0.05     GT:DP:Q:REPCN:REPCI:RC:ENCLREADS:FLNKREADS:ML:INS:STDERR:QEXP   0/0:20:0.315071:2,2:1-3,1-3:2,18,0,0:2,2:NULL:160.457:416.998,95.8552:0.466294,0.466294:-1,-1,-1
chr10   17813632        .       TATA    TATATA  .       .       END=17813701;EXPTHRESH=-1;GRID=1,6;PERIOD=2;REF=2;RU=ta;STUTTERDOWN=0.05;STUTTERP=0.9;STUTTERUP=0.05     GT:DP:Q:REPCN:REPCI:RC:ENCLREADS:FLNKREADS:ML:INS:STDERR:QEXP   0/1:24:0.33128:2,3:2-3,2-6:2,19,0,3:2,2:2,2|3,1:194.26:416.998,95.8552:0.500759,0.76488:-1,-1,-1
chr10   17813632        .       TATA    .       .       .       END=17813701;EXPTHRESH=-1;GRID=1,103;PERIOD=2;REF=2;RU=ta;STUTTERDOWN=0.05;STUTTERP=0.9;STUTTERUP=0.05   GT:DP:Q:REPCN:REPCI:RC:ENCLREADS:FLNKREADS:ML:INS:STDERR:QEXP   0/0:14:0.00196308:2,2:1-23,1-23:0,14,0,0:NULL:NULL:115.062:419.097,96.3636:7.19692,7.19692:-1,-1,-1
chr10   17813632        .       TATA    TATATA  .       .       END=17813703;EXPTHRESH=-1;GRID=1,103;PERIOD=2;REF=2;RU=ta;STUTTERDOWN=0.05;STUTTERP=0.9;STUTTERUP=0.05   GT:DP:Q:REPCN:REPCI:RC:ENCLREADS:FLNKREADS:ML:INS:STDERR:QEXP   1/1:14:0.00181527:3,3:1-24,1-24:0,14,0,0:NULL:NULL:115.131:419.097,96.3636:6.13539,6.13539:-1,-1,-1
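
The same check can be done without eyeballing `grep` output. The following is a minimal Python sketch, not part of the merge tool: it groups VCF data lines by (CHROM, POS) and reports loci that carry more than one distinct END value. The function name `end_values_by_locus` is hypothetical, and the example records are shortened versions of the lines above (INFO truncated, sample columns dropped). It assumes tab-separated VCF lines with INFO in the eighth column.

```python
from collections import defaultdict


def end_values_by_locus(lines):
    """Map (chrom, pos) -> sorted END values, keeping only loci with >1 distinct END."""
    ends = defaultdict(set)
    for line in lines:
        if line.startswith("#"):  # skip VCF header lines
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, pos, info = cols[0], int(cols[1]), cols[7]
        for field in info.split(";"):
            if field.startswith("END="):
                ends[(chrom, pos)].add(int(field[len("END="):]))
    return {locus: sorted(vals) for locus, vals in ends.items() if len(vals) > 1}


# Shortened example records (INFO truncated for illustration).
records = [
    "chr10\t17813632\t.\tTATA\t.\t.\t.\tEND=17813701;PERIOD=2;RU=ta",
    "chr10\t17813632\t.\tTATA\t.\t.\t.\tEND=17813703;PERIOD=2;RU=ta",
    "chr10\t17813632\t.\tTATA\tTATATA\t.\t.\tEND=17813701;PERIOD=2;RU=ta",
    "chr10\t17809113\t.\tCCTCCCCTCCCCTCCCCTCCCCTCC\t.\t.\t.\tEND=17809163;PERIOD=5;RU=cctcc",
]
conflicts = end_values_by_locus(records)
print(conflicts)
```

To run it over real files, the lines could be read with `gzip.open(path, "rt")` from each input VCF and passed to the same function.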
@LiterallyUniqueLogin
Contributor

On the line where merging stopped, the POS and the length of the REF allele imply that the last base pair of the REF allele is at coordinate "17809137" (17809113 + 25 - 1). But the given END info field is "17809163". I assume it's erroring out because those don't match. I would check which of the POS/REF/END fields was set incorrectly upstream.
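
A rough Python sketch of that consistency check, in case it helps to scan the input files for other mismatched records. This is not the merge tool's own code, and the helper names (`expected_end`, `info_end`, `check_record`) are made up for illustration. It assumes 1-based POS, tab-separated columns, and that END is meant to equal the coordinate of the last REF base.

```python
def expected_end(pos: int, ref: str) -> int:
    """Coordinate of the last base of REF, given a 1-based POS."""
    return pos + len(ref) - 1


def info_end(info: str):
    """Return the END value from a VCF INFO string, or None if absent."""
    for field in info.split(";"):
        key, _, value = field.partition("=")
        if key == "END":
            return int(value)
    return None


def check_record(line: str):
    """Return (pos, expected_end, info_end) for one tab-separated VCF data line."""
    cols = line.rstrip("\n").split("\t")
    pos, ref, info = int(cols[1]), cols[3], cols[7]
    return pos, expected_end(pos, ref), info_end(info)


# The record merging stopped at, with the sample columns dropped.
record = "\t".join([
    "chr10", "17809113", ".", "CCTCCCCTCCCCTCCCCTCCCCTCC", ".", ".", ".",
    "END=17809163;PERIOD=5;RU=cctcc;REF=5.0;STUTTERUP=0.05;"
    "STUTTERDOWN=0.05;STUTTERP=0.9;EXPTHRESH=-1",
])
pos, expected, given = check_record(record)
print(pos, expected, given, expected == given)
# prints: 17809113 17809137 17809163 False
```

Records where the last value is `False` are the ones to trace back to the upstream step that wrote POS, REF, or END.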
