vg 1.58.0 - Cartari
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.58.0
Buildable Source Tarball: vg-v1.58.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build because they do not include the code for the bundled dependencies that the vg build process needs.
This release includes:
- `vg deconstruct` now does path-based (formerly `-e`) deconstruction by default. The old default behaviour of exhaustively processing (tiny) sites is deprecated.
- If `-a` is not used, `vg deconstruct` will recurse into child snarls of snarls it failed to process (like `vg call`), functionality that was, I think, dropped a while back.
- Experimental option `-L` added to `vg deconstruct` in order to cluster similar allele traversals together. The value given is a (length-weighted) threshold for the Jaccard coefficient between the oriented nodes of two traversals. So if `-L 0.75` is given, then alleles that have >= 0.75 similarity based on their graph positions will be merged into one. Two new FORMAT fields are added to keep track of the difference, `TS` (Jaccard distance) and `TL` (length difference). Clustering is done greedily starting with selected reference paths.
- New (experimental) option `-n` added to `vg deconstruct`. Like `-a`, it genotypes nested sites, but unlike `-a` it does so top-down, setting various tags that keep track of the nesting relationship at the allele level (and also linking every site back to its position on the LV=0 reference chromosome). *-alleles (used in recent VCF versions to represent spanning alleles) are used. This option will not support nested insertions on GBZ/GBWT input, so in practice it should be used on chromosome-level `.vg` files (I will look into relaxing this).
- `-R` option added to `vg deconstruct` to toggle whether star-alleles are reported with `-n`.
- README now explains how to get vg on your `PATH`
- README now explains how to build on multiple threads
- vg can now read GAM files generated by the long-read Giraffe prototype
- `vg filter` now lets you require exact matches for name filters instead of prefix matches with `--exact-name`.
- `deconstruct/call` can write giant VCF lines. This happens in, say, large SVs with lots of samples that each get their own allele due to nested variation (hopefully `deconstruct -L` can mitigate this via merging). Giant `AT` fields for each allele don't help. BCF apparently has a 2 gig line limit, and there's a case of `deconstruct` seemingly truncating large records. `vg deconstruct / call` are now modified to drop (with a warning) any lines `>2Gb` to avoid these issues.
- `vg giraffe` should no longer crash when mapping paired-end reads and reporting secondaries without a fragment length distribution
- `vg inject` now supports GAF format with the new `--output-format`/`-o` option
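
To make the `-L` clustering note above more concrete, here is a minimal Python sketch of greedy clustering with a length-weighted Jaccard coefficient over oriented nodes. It is an illustration only and is not taken from the vg source: the function names, the `(node_id, is_reverse)` representation, the exact weighting (shared node length divided by union node length), and the rule of comparing each traversal against the first member of each cluster are all assumptions.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: an oriented node is (node id, is_reverse).
OrientedNode = Tuple[int, bool]


def weighted_jaccard(a: Sequence[OrientedNode],
                     b: Sequence[OrientedNode],
                     node_len: Dict[int, int]) -> float:
    """Length-weighted Jaccard coefficient between two traversals.

    Assumed definition: total length of oriented nodes shared by both
    traversals divided by total length of oriented nodes in either.
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    total = sum(node_len[node_id] for node_id, _ in union)
    if total == 0:
        return 1.0  # two empty traversals are treated as identical
    shared = sum(node_len[node_id] for node_id, _ in set_a & set_b)
    return shared / total


def greedy_cluster(traversals: List[Sequence[OrientedNode]],
                   node_len: Dict[int, int],
                   threshold: float) -> List[int]:
    """Assign each traversal to the first cluster whose representative
    is at least `threshold` similar, else start a new cluster.

    Put reference traversals first so they seed the clusters.
    """
    representatives: List[int] = []  # index of the first member of each cluster
    assignment: List[int] = []
    for i, trav in enumerate(traversals):
        for cluster, rep in enumerate(representatives):
            if weighted_jaccard(traversals[rep], trav, node_len) >= threshold:
                assignment.append(cluster)
                break
        else:
            representatives.append(i)
            assignment.append(len(representatives) - 1)
    return assignment


# Toy site: two 10 bp flanking nodes with a 1 bp SNP or a 30 bp insertion between.
node_len = {1: 10, 2: 1, 3: 1, 4: 10, 5: 30}
ref = [(1, False), (2, False), (4, False)]
snp = [(1, False), (3, False), (4, False)]
ins = [(1, False), (5, False), (4, False)]

print(greedy_cluster([ref, snp, ins], node_len, threshold=0.75))  # [0, 0, 1]
```

In this toy case the SNP allele shares 20 of 22 bp of oriented nodes with the reference (about 0.91), so at a threshold of 0.75 it is merged into the reference cluster, while the insertion allele (20 of 51 bp, about 0.39) starts its own cluster.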
Updated Submodules
libvgio