
Scripts to make dosage files for PrediXcan directly from VCFs.


danielsarj/VCF-to-dosages


TOPMed-MESA-scripts

Python scripts that build dosage files directly from vcf.gz input, excluding ambiguous SNPs and INDELs. Both scripts work essentially the same way: each writes a sample .txt file (required to run PrediXcan) and one dosage txt.gz file per chromosome, including non-autosomes if they are present in the input. The only difference is the input: TOPMed_vcf2dosage_a.py takes a .vcf.gz containing a single chromosome, whereas TOPMed_vcf2dosage_b.py takes a vcf.gz containing multiple chromosomes.
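As a rough illustration of the filtering described above, the sketch below shows one way to drop INDELs and strand-ambiguous SNPs (A/T and C/G pairs) and to turn a genotype entry into a dosage. The function names are hypothetical and are not taken from the repository, and computing the dosage as the ALT allele count from the GT field is an assumption; the actual scripts may read a different field.

```python
# Hypothetical helpers; names and the GT-based dosage are assumptions,
# not the repository's actual implementation.
AMBIGUOUS_PAIRS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}
BASES = {"A", "C", "G", "T"}


def is_usable_snp(ref, alt):
    """Return True for a single-base substitution that is not strand-ambiguous."""
    if ref not in BASES or alt not in BASES or ref == alt:
        # Rejects INDELs, multi-allelic ALT fields such as "A,C", and symbolic alleles.
        return False
    return (ref, alt) not in AMBIGUOUS_PAIRS


def gt_to_dosage(sample_field):
    """Count ALT alleles in a sample entry such as '0|1' or '1/1:35'.

    Returns None when any allele is missing or is not 0/1.
    """
    gt = sample_field.split(":")[0].replace("|", "/")
    alleles = gt.split("/")
    if any(allele not in ("0", "1") for allele in alleles):
        return None
    return float(sum(int(allele) for allele in alleles))
```

A variant line would be kept only when `is_usable_snp(ref, alt)` is true, and each sample column would then be passed through `gt_to_dosage`.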

The scripts can also be customized if needed. As provided, they will rename all SNPs to the chr#:pos:ref:alt format and update chrX, chrXY, chrY, and chrM to their numeric versions.
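The renaming step can be sketched as follows. The README does not state which numeric codes the scripts assign, so the PLINK-style mapping below (X=23, Y=24, XY=25, M=26) is an assumption, as is the choice to keep the `chr` prefix in the final ID; both helper names are hypothetical.

```python
# Assumed PLINK-style codes; the repository's scripts may use different values.
NON_AUTOSOME_CODES = {"X": "23", "Y": "24", "XY": "25", "M": "26", "MT": "26"}


def numeric_chrom(chrom, codes=NON_AUTOSOME_CODES):
    """Strip an optional 'chr' prefix and map non-autosome labels to numeric codes."""
    label = chrom[3:] if chrom.lower().startswith("chr") else chrom
    return codes.get(label.upper(), label)


def make_snp_id(chrom, pos, ref, alt):
    """Build an ID in the chr#:pos:ref:alt format."""
    return f"chr{numeric_chrom(chrom)}:{pos}:{ref}:{alt}"
```

Passing a different `codes` dictionary is one way such a mapping could be customized without touching the rest of the logic.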

Imported libraries:

  • argparse
  • gzip
  • os
  • sys
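The standard-library modules listed above are enough to stream a compressed VCF without loading it into memory. The sketch below shows only the routing of records to one gzip file per chromosome, as a multi-chromosome input would require; it copies raw VCF lines rather than computing dosages, and the function name and output file naming are hypothetical.

```python
import gzip
import os


def split_vcf_by_chrom(vcf_gz_path, out_dir):
    """Stream a vcf.gz and write its data lines to one gzip file per chromosome.

    Returns the sorted list of chromosome labels seen in the input.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}
    try:
        with gzip.open(vcf_gz_path, "rt") as vcf:
            for line in vcf:
                if line.startswith("#"):
                    continue  # skip meta-information and header lines
                chrom = line.split("\t", 1)[0]
                if chrom not in handles:
                    # Hypothetical naming scheme: <chromosome label>.txt.gz
                    out_path = os.path.join(out_dir, f"{chrom}.txt.gz")
                    handles[chrom] = gzip.open(out_path, "wt")
                handles[chrom].write(line)
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handles)
```

Keeping one open handle per chromosome means the input does not need to be sorted by chromosome.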

NOTE: this is a modified version of the script found here.
