vg 1.58.0 - Cartari

@adamnovak adamnovak released this 01 Jul 21:13
· 244 commits to master since this release
a049c6b

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.58.0

Buildable Source Tarball: vg-v1.58.0.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build because they do not include the code for bundled dependencies that the vg build process needs.

This release includes:

  • vg deconstruct now does path-based (formerly -e) deconstruction by default. The old default behaviour, exhaustively processing (tiny) sites, is deprecated.
  • If -a is not used, vg deconstruct will recurse into the child snarls of any snarl it failed to process (like vg call). This functionality was, I think, dropped a while back.
  • Experimental option -L added to vg deconstruct to cluster similar allele traversals together. The value given is a (length-weighted) threshold for the Jaccard coefficient between the oriented nodes of two traversals. So if -L 0.75 is given, alleles that have >= 0.75 similarity based on their graph positions will be merged into one. Two new FORMAT fields are added to keep track of the difference: TS (Jaccard distance) and TL (length difference). Clustering is done greedily, starting with the selected reference paths.
  • New (experimental) option -n added to vg deconstruct. Like -a, it genotypes nested sites, but unlike -a it does so top-down, setting various tags that keep track of the nesting relationship at the allele level (and also linking every site back to its position on the LV=0 reference chromosome). *-alleles (used in recent VCF versions to represent spanning alleles) are used. This option does not support nested insertions on GBZ/GBWT input, so in practice it should be used on chromosome-level .vg files (I will look into relaxing this).
  • -R option added to vg deconstruct to toggle whether star-alleles are reported with -n.
  • README now explains how to get vg on your PATH
  • README now explains how to build on multiple threads
  • vg can now read GAM files generated by the long-read Giraffe prototype
  • vg filter now lets you require exact matches for name filters instead of prefix matches with --exact-name.
  • deconstruct/call can write giant VCF lines. This happens with, say, large SVs with lots of samples that each get their own allele due to nested variation (hopefully deconstruct -L can mitigate this via merging). Giant AT fields for each allele don't help. BCF apparently has a 2 GB line limit, and there's a case of deconstruct seemingly truncating large records. vg deconstruct and vg call now drop (with a warning) any line larger than 2 GB to avoid these issues.
  • vg giraffe should no longer crash when mapping paired-end reads and reporting secondaries without a fragment length distribution
  • vg inject now supports GAF format with the new --output-format/-o option
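
To make the -L clustering described above more concrete, here is a minimal Python sketch of a length-weighted Jaccard coefficient between the oriented nodes of two traversals, followed by greedy clustering against a threshold. It is an illustration under stated assumptions, not vg's actual implementation: the names weighted_jaccard, greedy_cluster and node_len are hypothetical, a traversal is assumed to be a list of (node id, is_reverse) pairs, and each cluster is assumed to be compared only against its first member (with the reference traversal listed first).

```python
from typing import Dict, List, Tuple

# Hypothetical representation: an oriented node is (node id, is_reverse).
OrientedNode = Tuple[int, bool]
Traversal = List[OrientedNode]


def weighted_jaccard(a: Traversal, b: Traversal, node_len: Dict[int, int]) -> float:
    """Length-weighted Jaccard coefficient of two sets of oriented nodes."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    # Weight every oriented node by the length of its node sequence.
    union_len = sum(node_len[node_id] for node_id, _ in union)
    if union_len == 0:
        return 1.0  # two empty traversals are treated as identical
    shared_len = sum(node_len[node_id] for node_id, _ in set_a & set_b)
    return shared_len / union_len


def greedy_cluster(traversals: List[Traversal],
                   node_len: Dict[int, int],
                   threshold: float) -> List[List[int]]:
    """Greedily group traversal indices; the first index of each cluster
    is its representative. Put reference traversals first in the input."""
    clusters: List[List[int]] = []
    for i, trav in enumerate(traversals):
        for cluster in clusters:
            representative = traversals[cluster[0]]
            if weighted_jaccard(representative, trav, node_len) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append([i])
    return clusters


# Toy data (made up for illustration): node id -> sequence length.
node_len = {1: 10, 2: 1, 3: 1, 4: 10, 5: 30}
ref = [(1, False), (2, False), (4, False)]
snp_alt = [(1, False), (3, False), (4, False)]  # differs by one 1 bp node
big_alt = [(1, False), (5, False)]              # shares only node 1
print(greedy_cluster([ref, snp_alt, big_alt], node_len, 0.75))
```

With a threshold of 0.75, the traversal that differs from the reference by a single 1 bp node falls into the reference's cluster, while the traversal that shares only one node with it starts a cluster of its own.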

Updated Submodules

  • libvgio