vcfbub removes about 300 Mb of data from a 2.5 Gb genome #7

Open
ld9866 opened this issue Feb 22, 2024 · 0 comments

ld9866 commented Feb 22, 2024

Dear developer:
We used Minigraph-Cactus to build a pangenome and PanGenie to genotype individuals. After quality control with vcfbub, large stretches of variation are missing from some chromosomes: about 300 Mb of a 2.5 Gb genome in total. What causes this, and will it affect downstream analysis? We want to run a genome-wide association study on SVs, so we are puzzled that some chromosome segments have no records of any type (SNP, indel, or SV).
Best day!

Code:
vcfbub -l 0 -r 100000 --input chr2.vcf.gz > chr2.ready.vcf
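
To narrow down which records the command above dropped, it may help to compare the input and output VCFs directly. The following is only a rough diagnostic sketch, not part of vcfbub: the helper names (open_vcf, record_keys, dropped_summary) are hypothetical, it assumes tab-delimited VCF data lines, and it identifies a record by its CHROM, POS and ID columns. The summed REF length is an approximation of the affected reference span, not an exact measure of lost sequence.

```python
import gzip
from collections import defaultdict


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def record_keys(lines):
    """Map (CHROM, POS, ID) -> REF length for every data line."""
    keys = {}
    for line in lines:
        # Skip blank lines and header lines (## meta and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, vid, ref = line.rstrip("\n").split("\t")[:4]
        keys[(chrom, int(pos), vid)] = len(ref)
    return keys


def dropped_summary(before_lines, after_lines):
    """Per chromosome: (number of dropped records, summed REF length)."""
    before = record_keys(before_lines)
    after = record_keys(after_lines)
    summary = defaultdict(lambda: [0, 0])
    for key, ref_len in before.items():
        if key not in after:
            summary[key[0]][0] += 1
            summary[key[0]][1] += ref_len
    return {chrom: tuple(counts) for chrom, counts in summary.items()}


if __name__ == "__main__":
    import sys

    # Example: python dropped.py chr2.vcf.gz chr2.ready.vcf
    if len(sys.argv) == 3:
        with open_vcf(sys.argv[1]) as b, open_vcf(sys.argv[2]) as a:
            for chrom, (n, bp) in sorted(dropped_summary(b, a).items()):
                print(f"{chrom}\t{n} records dropped\t{bp} bp of REF")
```

If most of the dropped REF length comes from a handful of very long records, that would point at the -r 100000 threshold rather than at the -l 0 level filter.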

Example:
2 10510824 >18339279>18339282 GT AG 60.0 . GT 0 0 0 0 0 0 1 0 0 0 >
2 10510828 >18339282>18339285 CA TG 60.0 . GT 0 0 0 0 0 0 1 0 0 0 >
2 10510833 >18339285>18339288 CA TT 60.0 . GT 0 0 0 0 0 0 1 0 0 0 >
2 10510838 >18339288>18339291 CT TC 60.0 . GT 0 0 0 0 0 0 1 0 0 0 >
2 10510842 >18339291>18339294 GC TT 60.0 . GT 0 0 0 0 0 0 1 0 0 0 >
2 10510848 >18339294>18339297 C G 60.0 . GT 0 0 0 0 0 0 1 0 0 0 >
2 10510856 >18339297>18339300 GG CT 60.0 . GT 0 0 0 0 0 0 1 0 0 0 >
2 10510865 >18339300>18339303 A G 60.0 . GT 0 0 0 0 0 0 1 0 0 0 >
2 10510867 >18339303>18339305 TAAC T 60.0 . GT 0 0 0 0 0 0 1 0 0 0 0 >
2 10510886 >18339305>18339308 A G 60.0 . GT 0 0 0 0 0 0 1 0 0 0 >
2 10510889 >18339308>18339311 TACC ATTG 60.0 . GT 0 0 0 0 0 0 1 0 0 0 >
2 79046966 >21226444>21226446 A AACGAATCCGACTAGGAACCATGAGGTTGCAGGTTCGGTCCCTGCCCTTGCTCAGTGGGTTAACGATCCGGCGTTGCCGTGAGCTGTGGTGTAGATCACAGATGCAGCTTAGATCCTGAGTTGCTGTGGCTGTGGCATATGGTGGCAGCTGCTATCTGATTCGACCCCTAGACTGGGAACCTCCATATACCACGAGTGCAGTCCTA>
2 79047027 >21226446>21226449 A G 60.0 . GT 0 0 0 0 0 0 0 0 0 0 >
2 79047042 >21226449>21226452 CC CT,CCT 60.0 . GT 0 0 0 0 >
2 79047076 >21226452>21226455 A G 60.0 . GT 1 0 1 1 0 1 0 0 1 1 >
2 79047081 >21226455>21226457 CC C 60.0 . GT 0 0 0 0 0 0 0 0 0 0 0 >
2 79047086 >21226457>21226460 T C 60.0 . GT 0 0 0 0 0 0 0 0 0 0 >
2 79047149 >21226460>21226463 CC CAT 60.0 . GT 0 0 0 0 0 0 0 0 0 0 >
2 79047156 >21226463>21226466 CCG CA 60.0 . GT 0 0 0 0 0 0 0 0 0 0 >
2 79047160 >21226466>21226469 G A 60.0 . GT 0 0 0 0 0 0 0 0 0 0 >
2 79047166 >21226469>21226471 GG G 60.0 . GT 0 0 0 0 0 0 0 0 0 0 0 >
