gvcfgenotyper

A utility for merging and genotyping Strelka2 GVCFs.

This tool provides basic genome VCF (GVCF) merging and genotyping functionality, producing a multisample BCF/VCF suitable for cohort analysis. Variants are normalised and decomposed on the fly before merging. For samples that lack a particular variant, homozygous reference confidence is estimated from the GVCF depth blocks using some simple heuristics.

Caution:

This software is in early development. It is largely functional but may contain bugs.

There are various flavours of GVCF in the wild; this tool only works with the format produced by Illumina pipelines.

Installation

The only requirement is a C++11-compatible compiler.

git clone https://github.com/Illumina/gvcfgenotyper.git
cd gvcfgenotyper/
make
bin/gvcfgenotyper

Running

find directory/ -name '*genome.vcf.gz' > gvcfs.txt
time ./gvcfgenotyper -f genome.fa -l gvcfs.txt -Ob -o output.bcf

or with some trivial parallelism:

for i in {1..22} X;
do 
    echo -r $i -f genome.fa -l gvcfs.txt -Ob -o output.chr${i}.bcf;
done | xargs -L 1 -P 23 ./gvcfgenotyper
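
The per-chromosome run leaves 23 separate BCF files. One possible way to stitch them back into a single file is sketched below. It assumes bcftools is installed and that the outputs use the `output.chr${i}.bcf` names from the loop above; the list file name `chunks.txt` is made up for this example. The concatenation step is skipped if bcftools or any chunk is missing.

```shell
# Write the chunk names in chromosome order (1..22, then X).
for i in $(seq 1 22) X; do
    echo "output.chr${i}.bcf"
done > chunks.txt

# Check that every listed chunk is present before concatenating.
all_present=yes
while IFS= read -r f; do
    [ -f "$f" ] || all_present=no
done < chunks.txt

# Concatenate and index only when bcftools and all chunks are available.
if [ "$all_present" = yes ] && command -v bcftools >/dev/null 2>&1; then
    bcftools concat -f chunks.txt -Ob -o output.bcf
    bcftools index output.bcf
fi
```

Because every chunk is produced from the same gvcfs.txt, the sample columns should match across files, which is what `bcftools concat` expects.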

If you are looking for a sequencing cohort to try this out, have a look at Polaris.

Known issues

Homozygous reference confidence (GQ and DP) works well for SNPs but is less reliable for indels. Our homozygous reference likelihoods are currently just dummy values, e.g. PL=0,255,255, and should not be used for any sophisticated analysis such as de novo mutation calling (Strelka has good joint-calling-from-BAM functionality for small pedigrees).
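
To get a rough idea of how many homozygous reference calls in an output file carry the placeholder likelihoods, something like the following sketch may help. The helper name `count_dummy_pl` is hypothetical, and it assumes tab-separated input with CHROM, POS, SAMPLE, GT and PL columns. It only matches the exact three-value form `0,255,255` mentioned above, so multi-allelic sites with longer PL vectors are not counted.

```shell
# count_dummy_pl: read tab-separated CHROM, POS, SAMPLE, GT, PL lines on stdin
# and report how many hom-ref genotypes carry the placeholder PL=0,255,255.
count_dummy_pl() {
    awk -F '\t' '
        ($4 == "0/0" || $4 == "0|0") {
            homref++
            if ($5 == "0,255,255") dummy++
        }
        END { printf "%d hom-ref calls, %d with PL=0,255,255\n", homref, dummy }
    '
}
```

The input could be produced with, for example, `bcftools query -f '[%CHROM\t%POS\t%SAMPLE\t%GT\t%PL\n]' output.bcf | count_dummy_pl`, assuming bcftools is available.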

Complex variants can occasionally contain primitive alleles called in other samples. We are investigating decomposition approaches for this problem.

We are working on multi-threading to improve performance.

Feedback

Please open an issue on GitHub to provide feedback or ask questions.

Acknowledgements

This tool depends on htslib, googletest and spdlog. We also borrowed some variant normalisation code from BCFtools.
