
Parse iVar output into beautiful heatmaps based on pre-defined list of SNVs associated with COVID-19 VOC/VOIs


kbessonov1984/VCFParser

VCFParser

Render beautiful heatmaps of the SNVs of interest from the output of any variant calling software. Inputs from the iVar pipeline are the best supported.

Extract non-duplicated SNVs from a VCF or iVar TSV file and render a heatmap for each SARS-CoV-2 Variant of Concern (VOC).

If the SNVs of interest are not listed in the data/cov_lineage_variants.tsv file, append them to that file or provide a custom reference file with the -r parameter.

To render plots for all VOCs defined in cov_lineage_variants.tsv, specify -voc all.

Filter SNVs by minimum read depth coverage, minimum SNV frequency, and minimum PHRED quality score.
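To make the de-duplication and filtering steps concrete, here is a minimal sketch in plain Python. It is not vcfparser's own code: the helper name `filter_snvs` and the sample rows are hypothetical, and it assumes the iVar TSV column names `POS`, `REF`, `ALT`, `ALT_QUAL`, `ALT_FREQ` and `TOTAL_DP`. The three thresholds mirror the `--min_snv_freq_threshold`, `--min_depth_coverage` and `--min_quality` options.

```python
import csv
import io

# Hypothetical, trimmed-down iVar-style TSV (real files carry more columns).
SAMPLE = "\n".join([
    "\t".join(["POS", "REF", "ALT", "ALT_QUAL", "ALT_FREQ", "TOTAL_DP"]),
    "\t".join(["23063", "A", "T", "37", "0.98", "120"]),
    "\t".join(["23063", "A", "T", "37", "0.98", "120"]),  # duplicated SNV row
    "\t".join(["23403", "A", "G", "20", "0.05", "200"]),  # low frequency and quality
    "\t".join(["3267", "C", "T", "36", "0.90", "5"]),     # low depth
]) + "\n"


def filter_snvs(tsv_text, min_freq=0.0, min_depth=0, min_quality=0):
    """Return unique SNV rows that meet all three thresholds.

    Illustrative helper only; the defaults of 0 mean "no filtering",
    matching the defaults documented for the command-line options.
    """
    seen = set()
    kept = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        key = (row["POS"], row["REF"], row["ALT"])
        if key in seen:
            continue  # skip repeats of an SNV that was already processed
        seen.add(key)
        if (float(row["ALT_FREQ"]) >= min_freq
                and int(row["TOTAL_DP"]) >= min_depth
                and float(row["ALT_QUAL"]) >= min_quality):
            kept.append(row)
    return kept


if __name__ == "__main__":
    for snv in filter_snvs(SAMPLE, min_freq=0.1, min_depth=10, min_quality=30):
        print(snv["POS"], snv["REF"], snv["ALT"], snv["ALT_FREQ"])
```

An SNV is identified here by position, reference base and alternate base, so repeated rows for the same change collapse into one before the thresholds are applied.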

Requirements

  • pandas
  • python >= 3
  • matplotlib >= 3.3
  • openpyxl
  • pysam

Usage

$ vcfparser -h

usage: vcfparser parses VCF or TSV file and generates heatmaps and parsed VCF files on query SNVs/VOCs/VOIs.
The iVar TSV or VCF inputs are preferred (https://andersen-lab.github.io/ivar/html/manualpage.html) 

       [-h] (-i INPUT [INPUT ...] | -f INPUT_FILE | --clear_cov_cache)
       [-bam BAM_FILES [BAM_FILES ...]] [-voc VOC_NAMES] [-r REF_META]
       [--signature_snvs_only] [--key_snvs_only] [--stat_filter_snvs]
       [--subplots_mode SUBPLOTS_MODE] [--min_snv_freq_threshold [0-1]]
       [--min_depth_coverage [0-Inf]] [--min_quality [0-Inf]] [--annotate]
       [--dpi 400] [--font_size 2.5] [--annotate_text_color coral]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT [INPUT ...], --input INPUT [INPUT ...]
                        List of ivar_variants.vcf or ivar_variants.tsv to
                        summarise
  -f INPUT_FILE, --input_file INPUT_FILE
                        Input file with TSV/VCF and BAM file paths for batch
                        input
  --clear_cov_cache     Erase cache of SNV coverages generated by previous
                        runs (.cache_snv_coverages.json)
  -bam BAM_FILES [BAM_FILES ...], --bam_files BAM_FILES [BAM_FILES ...]
                        Optionally provide a list of corresponding bam files
                        in THE SAME ORDER as files provided for the -i
                        parameter
  -voc VOC_NAMES, --voc_names VOC_NAMES
                        List of Variants of Concern names (e.g. UK, SA,
                        Brazil, Nigeria)
  -r REF_META, --ref_meta REF_META
                        Path to metadata TSV file containing info on the key
                        mutations
  --signature_snvs_only
                        Check VCF for only signature/official snvs linked to a
                        VOC
  --key_snvs_only       Check VCF for only the key (S-gene associated) snvs
                        linked to a VOC
  --stat_filter_snvs    Filter snvs based on statistical significance (i.e. QC
                        PASS/FAIL flags)
  --subplots_mode SUBPLOTS_MODE
                        How to plot multiple plots (onerow, onecolumn,
                        oneplotperfile)
  --min_snv_freq_threshold [0-1]
                        Set minimum SNV frequency threshold to display
                        (default: 0)
  --min_depth_coverage [0-Inf]
                        Filter SNVs based on min depth coverage (default:0 =
                        no filtering)
  --min_quality [0-Inf]
                        Filter SNVs based on min PHRED sequencing quality
                        (default:0 = no filtering)
  --annotate            Annotate heatmap with SNV frequency values
  --dpi 400             DPI value for the heatmap rendering. Default value:
                        400
  --font_size 2.5       Labels font size for both axis: 2.5
  --annotate_text_color coral
                        Annotate text colour (freq. values)
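As an illustration of how the options above combine, the following sketch assembles a vcfparser command line from Python. Only flags listed in the help text are used; the helper `build_vcfparser_command` and the file names are hypothetical. The length check reflects the requirement that BAM files are given in the same order as the files passed to `-i`.

```python
import shlex


def build_vcfparser_command(variant_files, bam_files=None, voc="all",
                            min_freq=0.0, min_depth=0, min_quality=0,
                            annotate=False):
    """Build an argument list for vcfparser (illustrative helper, not part of the tool)."""
    if bam_files is not None and len(bam_files) != len(variant_files):
        # -bam entries must pair one-to-one, in order, with the -i entries
        raise ValueError("expected one BAM file per input variant file")
    cmd = ["vcfparser", "-i", *variant_files]
    if bam_files:
        cmd += ["-bam", *bam_files]
    cmd += [
        "-voc", voc,
        "--min_snv_freq_threshold", str(min_freq),
        "--min_depth_coverage", str(min_depth),
        "--min_quality", str(min_quality),
    ]
    if annotate:
        cmd.append("--annotate")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; replace with real iVar outputs and BAM files.
    cmd = build_vcfparser_command(
        ["sample1_ivar_variants.tsv", "sample2_ivar_variants.tsv"],
        bam_files=["sample1.bam", "sample2.bam"],
        voc="UK",
        min_depth=10,
        annotate=True,
    )
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.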
