Releases: vgteam/vg
v1.26.0 - Stornara
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.26.0
Buildable Source Tarball: vg-v1.26.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
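As a minimal sketch of the download notes above, the shell fragment below derives the Docker image reference and source tarball name from a release number, following the naming pattern used in these notes, and marks a binary executable. The helper names (vg_image_ref, vg_tarball_name, mark_executable) are hypothetical, and the binary is an empty placeholder file, so nothing here downloads or runs vg itself.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical and not part of vg.

# Print the Docker image reference for a release number such as 1.26.0.
vg_image_ref() {
    printf 'quay.io/vgteam/vg:v%s\n' "$1"
}

# Print the buildable source tarball name for a release number.
vg_tarball_name() {
    printf 'vg-v%s.tar.gz\n' "$1"
}

# Mark a downloaded static binary executable.
mark_executable() {
    chmod +x "$1"
}

# Demonstrate on an empty placeholder file standing in for the real download.
demo_dir="$(mktemp -d)"
: > "$demo_dir/vg"
mark_executable "$demo_dir/vg"

vg_image_ref 1.26.0
vg_tarball_name 1.26.0
```

The printed image reference can then be passed to docker pull, and the tarball name identifies the release asset to download in place of the "Source Code" archives.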
This release includes:
- vg gbwt -a adds a path cover of components without haplotypes to an existing GBWT. This resolves the Giraffe issue with a large number of incorrect mapq 60 reads in some graphs with the full GBWT.
- New seeding algorithm targeted at miRNA
- More flexible build system
- Pangenome ontology now has linkZoomLayer
- Synchronized semantics for OVERLAP_THRESHOLD
- Chopping with mod -X changed to run natively on handle graphs
- vg stats -F prints graph format
- Major vg index refactor.
- Always build GBWT from a single graph with a single thread source (VCF, GAM, GAF, paths).
- Index embedded paths as samples instead of contigs with vg index option --paths-as-samples.
- Correct names for long options in vg gbwt.
- Ignore metadata in empty indexes when merging GBWTs.
- Add vg stats -D option to print degree distribution
- GAM/GAF I/O code moved into libvgio
- Ability to specify contig ploidy in vg sim
- Alignment with dozeu can now handle large full length bonuses without crashing
- dozeu is now non-x86 compatible.
- Made match pruning in mpmap less aggressive for very short reads
- mpmap can now use the distance index's algorithm for rescue graph extraction
- mpmap now can use the distance index in place of a snarl manager to guide the structure of multipath alignments
- Better vg map error message when xg input not found
- GBWT construction from GAM/GAF is faster and handles duplicate read names correctly.
- vg sim can now output a table of the path positions that reads were simulated from
- Added a subcommand gampcompare to evaluate mapping correctness of multipath alignments
- libvgio now only tries to find Protobuf once
- Memory should no longer blow up in mpmap on splice-abundant graphs
- GaplessExtender returns all non-overlapping full-length extensions.
New and Updated Submodules
The gssw submodule is pointing at a different repository. Use git submodule sync to update the pointer if building from source.
The libvgio, dozeu, gbwt, libbdsg, gssw, structures, and gbwtgraph submodules have been updated.
Make sure to git submodule update --init --recursive if building from source.
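The two submodule steps above can be chained in an existing checkout. The sketch below wraps them in a hypothetical helper, refresh_submodules, which is not part of vg. It assumes git is installed and that the argument is the path to a working tree.

```shell
#!/usr/bin/env bash
# Sketch only: refresh_submodules is a hypothetical helper, not part of vg.

# Re-read submodule URLs from .gitmodules, then fetch and check out every
# submodule, including nested ones, in the given working tree.
refresh_submodules() {
    repo_dir="${1:-.}"
    git -C "$repo_dir" submodule sync --recursive &&
        git -C "$repo_dir" submodule update --init --recursive
}

# Example, assuming ./vg is an existing clone of the source:
#   refresh_submodules ./vg
```

Running the sync step first matters when a submodule has moved to a different repository, as gssw did in this release, because the update step otherwise fetches from the URL recorded in the local configuration.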
vg 1.25.0 - Apice
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.25.0
Buildable Source Tarball: vg-v1.25.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- New development in vg mcmc
- Graph Alignment Format (GAF) output support for vg map and vg giraffe
- Revisions to distance index format; distance indexes will need to be rebuilt
- STL-based multipath alignment internal format
- Bugfixes to Integrated Snarl Finder
- Improvements to dozeu alignment algorithm
- Bugfixes in banded global aligner algorithm
- CI tests rearranged into batches
- Bugfix for slow clustering in vg giraffe
- Parallel snarl finder in vg deconstruct
- Bugfixes to NetGraph in the absence of unary snarls
- Update to XG encoding
- Transfer mapqs when converting multipath alignments
- Bigger bins for coverage stats in vg call
- Better write parallelism in mpmap
- Case insensitivity for some mapper option values
- Fix build of SnarlManager with latest Protobuf which prohibits inheriting from messages
- Find the missing start indexes on multipath alignments
- New default gbwt id interval value in vg rna
- Renamed vg gaffe to vg giraffe
- Avoid out-of-bounds CNV and inversion breakpoints when generating random test graphs
- Bugfixes to variant caller
- Disqualify ultrabubbles with tips
- Fix vg rna crash when using reverse embedded paths
- Improvements to vg sim read simulation, especially for microRNA
- Don't enter infinite loop in vg sim when simulating from short paths
- Reduce information copying on multipath alignments
- Changed default vg giraffe parameters and added a fast mode
- Fix incorrect assert in vg sim
- Fix presets in vg mpmap
- Made clustering distance limit adjust itself for longer read lengths for vg giraffe
- Fix missing metadata for multipath alignments
- Added vg call option to output traversals to genotype with rpvg
- Update CI to latest toil-vg
- Better Giraffe rescue
- Ploidy option for vg deconstruct
- Fix annotation boundary bug in vg rna
- Added -T option in vg call to pad traversals
- Added overlap threshold in GaplessExtender to deduplicate nearly identical vg giraffe results
- New seeding algorithm in vg mpmap targeted at miRNA
New and Updated Submodules
The gbwtgraph and xg submodules have been updated.
Make sure to git submodule update --init --recursive if building from source.
vg 1.24.0 - Montieri
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.24.0
Buildable Source Tarball: vg-v1.24.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- Bugfixes in vg mpmap
- MEM finding heuristic in vg mpmap
- Pointers to Gimbricate and Seqwish when trying to import a non-blunt GFA
- A new IntegratedSnarlFinder that can be faster and use less memory, accessible through vg snarls --algorithm integrated
- Use of dozeu library for vg gaffe rescue
- Faster default mapping parameters in vg rna
- Fix for excessive memory use bug in spliced vg surject
- Fix for an off-by-one error in vg annotate gff parser
- Fix for getting seed distances from distance index
- Additional spliced vg surject bugfixes
- Added option to prespecify topological order for dozeu
- Top level clustering improvements for vg gaffe
- vg add can now work with various graph formats as input
- Made banded alignment traceback not recursive
- Minor optimizations in vg gaffe
- Added GBWT traversals option for vg call
- Shook more bugs out of mpmap presets and alignment
- Added a GBWT traversal finder
- Improved Turtle Pangenome Ontology
- Adjusted FTP URL where CI testing looks for its GIAB test data
- More use of dozeu in vg mpmap
- Support for quality-adjusted alignment in dozeu
- Revised vg mpmap CLI for better usability
- Added a no clustering option to vg mpmap
- Registered magic numbers for libbdsg graph types so they can be loaded without VPKG encapsulation
- Added node qualities to vg pack output format
- Updated internal dozeu and X-drop alignment interface
- Added ability to simulate from a specific sample in vg sim
- Update handle graph API to include node ranks
- Changed XdropAligner usage to be one per thread.
- Removed support for reading GFAs with overlaps: please preprocess with Seqwish to squash the overlaps out
- Cap the number of threads used in the Docker build to a reasonable number for Quay potatoes
- Automask IUPAC codes to Ns in vg construct, because the internal data model can't use them. If you want them in the graph, make them into variants.
- Chunk long paths when extracting paths Protobuf with vg paths
- Bugfixes to the ODGI graph implementation
- Boosted timeout for simulated chr21 CI mapping test
- Tweaked error rates used for vg call
- Substantial bugfixes in the FlowCaller
- Prohibit empty nodes in xg graphs
- Support for ODGI format in vg convert and as an input format more generally
- Added libbdsg, xg, and vg's algorithms and io namespaces to vg Doxygen docs
- Fixed excessive recursion when dealing with snarls
- Added MAPQ support for paired-end vg gaffe.
- Added ability to convert GFA files to other graph formats in vg convert
New and Updated Submodules
The dozeu, gbwt, gbwtgraph, gfakluge, gssw, libbdsg, libhandlegraph, pinchesAndCacti, structures, and xg submodules have been updated.
Make sure to git submodule update --init --recursive if building from source.
vg 1.23.0 - Lavello
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.23.0
Buildable Source Tarball: vg-v1.23.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- vg construct now rejects not-yet-implemented * allele from new VCF spec
- vg CI now runs on Kubernetes
- vg Docker can now be built with Kaniko
- vg gaffe now has MAPQ capping logic to reduce the error rate at MAPQ 60
- Default MAPQ settings in vg mpmap have been changed
- vg CI tests now respect the core limit of the container they are running in, instead of trying to use all the host's cores
- A variety of vg mpmap options were removed. Some short options changed their meanings.
- Mapping real RNA data with vg mpmap is now faster
- vg CI tests now use an updated toil-vg
- vg circularize adds graph edges once again when circularizing paths
- vg gaffe is now faster
- vg mpmap MAPQs are no longer miscomputed in rare cases
- vg gaffe's index finding has been completely redone. It can now create its own indexes or find indexes by prefix. This may lead to it picking up and using indexes that were not passed to it directly.
- Surjection of softclips has been improved.
- vg call now uses a better, flow-based traversal finding algorithm to generate alleles
- vg mpmap uses new algorithms for long-read alignment
- vg CI reports are no longer totally empty for some tests
- The internal XdropAligner is now much simpler
- ProtoHandleGraph is no longer needed and has been removed
- vg concat now works via the HandleGraph API, and has a -p option to be path-guided
- HashGraph no longer crashes when incrementing node IDs
- Machinery for haplotype-based rescue in vg gaffe has been added
- XdropAligner now automatically finds the best pinning location
New and Updated Submodules
The gbwtgraph and libbdsg submodules have been updated.
Make sure to git submodule update --init --recursive if building from source.
vg 1.22.0 - Rotella
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.22.0
Buildable Source Tarball: vg-v1.22.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- Support for haplotype transcripts and two training FASTQs in vg sim, as well as bugfixes
- Paired-end support in vg gaffe
- Changes to distance index format to support paired-end mapping. All distance indexes will need to be rebuilt.
- Support for all PathHandleGraph implementations in most vg commands. Many commands will now output in formats other than vg Protobuf.
- vg combine command to do what cat does for the Protobuf format when working with graphs in all formats.
- Ability to rename variants' contigs during vg construct
- Updates to many submodules to respect build flags, to make packaging easier
- GFA edge orientations in output are now more canonical
- Removal of vg stats -S and the concept of from/to-siblings
- Addition of new predicates to the RDF ontology
- Addition of man page for vg gaffe
- XG edge vectorization now works correctly, which should make vg pack output and the things that use it more correct.
- Fixed bug with vg chunk -p -C -a when not all paths are wanted in gam chunks
- Added missing critical section in vg chunk
- Fixed haplotype exon break bug in vg rna
- Fixed bugs in new vg chunk gam splitting logic
- Now vg chunk uses all threads by default
- VPKG tag for ODGI files has been fixed. ODGI files will need to be regenerated.
- Now filtering N's in vg augment
- Fixed a bug with vg augment's coverage filter
- Fixed a crash bug in vg surject built with GCC
- Report GQ and a real QUAL in vg call
- Magic number prefixes in SerializableHandleGraph. HashGraph and PackedGraph output will not be readable by older vg versions.
- Fix hash graph id sorting
- For RNA, add option to add splice-junctions from intron bed file
- Speed improvements in vg surject
- Fix vg augment -s option
- Fix rare out-of-bounds index bug in vg mpmap
- Splice support in vg surject
- Add a collective MAPQ option to vg mpmap
- Fix quality string in augmented GAMs
- Ability to simulate RNA reads from haplotype transcripts
- Support for multiple minimizer indexes in vg gaffe
- Addition of an internal dagification overlay HandleGraph
- Disable parallelization of second pass in vg augment
- Do not change names of reference transcripts in vg rna
- Fix an off-by-one error in vg surject
- Add mpmap tests back to CI simulation tests
- Turn on ccache for Travis CI
- More memory efficient vg rna
- MinimizerMapper refactoring
- Update calibration routine for mismapping detection in vg mpmap
- Fully ignore single-node components when doing vg snarls
- Use the HandleGraph API in more places
- Changes to Homebrew usage on Mac Travis CI
- Make filters in vg filter be fully off if not requested
New and Updated Submodules
The gbwt, gbwtgraph, gssw, htslib, libbdsg, libhandlegraph, libvgio, pinchesAndCacti, sha1, sonLib, structures, vcflib, and xg submodules have been updated.
Make sure to git submodule update --init --recursive if building from source.
vg 1.21.0 - Fanano
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.21.0
Buildable Source Tarball: vg-v1.21.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- Changes to how vg paths can be used to remove paths from a graph.
- Use of a new toil-vg version in testing.
- Changes to minimizer seed handling algorithms in vg gaffe.
- Some paired-end clustering logic to be used in vg gaffe.
- Improvements to the build system around the libsdsl build.
- No-hit read counting and better S3 support in Giraffe Wrangler script.
- A new subcommand, vg depth.
- Refactoring and race condition removal in vg pack.
- Probabilistic calling support.
- Improvements to snarl traversal enumeration.
- Improved ability of the build system to handle switching compilers.
- Bugfixes to the vg RDF ontology.
- vg augment now cuts off softclips by default.
- vg explode functionality now subsumed by vg chunk.
- Internal VGSet object now supports HandleGraphs.
- Some manpages available via make man.
- Better handling of reverse-strand cycles in vg call.
- Better gtf/gff parsing in vg rna.
- Snarl computation now parallel.
- Subpaths output by vg chunk -p now contain offsets.
- Fix a race condition when reading from the Paths object.
- Ability to chunk non-indexed GAM files by graph component.
- Bugfixes to kmer enumeration.
- Deprecated some vg mod flags, with warning messages.
- vg find can produce more information about kmers.
- vg construct should no longer drop the end breakpoints of deletions.
New and Updated Submodules
The BBHash, libbdsg, libhandlegraph, pinchesAndCacti, and xg submodules have been updated.
Make sure to git submodule update --init --recursive if building from source.
vg 1.20.0 - Ginestra
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.20.0
Buildable Source Tarball: vg-v1.20.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- Faster and more efficient minimizer finding and clustering in vg gaffe
- Memory usage optimizations in vg rna
- vg call bugfixes for cyclic graphs
- Improved giraffe-wrangler.sh script
- Improvements to tail anchor synthesis in vg mpmap that should resolve some slowdowns
- vg rna bugfixes for variants at the ends of exons
- Better use of overlays in vg call
- BED-based extraction in vg find
- High-degree node removal in vg prune
- Bugfix for vg call -b parsing issue
- New -s option to vg augment to ignore out-of-graph alignments
- Binaries and Dockers now built for Nehalem architecture and above; AVX1 instruction support should no longer be required
- Faster packing with vg pack
- Ability to build a GBWT from sampled local haplotypes rather than a whole haplotype database
- Coverage threshold for faster vg augment
- Distance indexing with vg index now supports any graph filetype, not just .vg.
New and Updated Submodules
The BBHash submodule has been added.
The gbwt, gbwtgraph, libbdsg, libhandlegraph, and libvgio submodules have been updated.
Make sure to git submodule update --init --recursive if building from source.
vg 1.19.0 - Tramutola
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.19.0
Buildable Source Tarball: vg-v1.19.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- The experimental vg mcmc command, a Markov-chain-Monte-Carlo genotyper prototype
- Improved clustering and intermediate scoring logic in vg gaffe, and the elimination of MultipathAlignments from that pipeline
- vg gaffe now emits MAPQ 0 for unmapped reads
- Handle-graph-ification of vg map, phase unfolding, vg vectorize, vg find, vg chunk, vg call, and other operations using the xg index. Many commands now support more file types as arguments.
- Phase unfolder path verification no longer runs in parallel
- A new and improved XG index implementation. Note that backward compatibility is available only for XG indexes created with vg 1.18 or newer
- Giraffe Wrangler script now counts only mapped reads for identity computation, can take its inputs from Amazon S3, and binds BWA to a single NUMA node
- GBWT accesses now cached
- Giraffe Facts now reports statistics by filter instead of by mapping stage
- Vendored Protobuf submodule has been removed. vg can now build with (and indeed now requires) system-installed Protobuf 3 from a package manager
- vg call algorithm has been revised to use the much more efficient "pack" format, allowing much larger single calling operations
- A paired-end clustering algorithm for use in vg gaffe has been added, but not yet used
- Warnings about deprecated pointer types in lru_cache eliminated
- A new Dockerfile for a multi-stage build has been integrated directly into the vgteam/vg repository. The vgteam/vg_docker repository should no longer be required.
- The GBWTGraph has been moved to a submodule
- Added a magic number system to identify bare XG files as XG files
- Testing of builds of vg on Mac with GNU GCC has been discontinued. It is very difficult to get GNU GCC to build with the libc++ standard library, which packaged Protobuf builds for Mac use. GNU GCC should still work with Protobuf built with GNU GCC against its default libstdc++.
- vg version now reports the C++ standard library used in the build
- CFLAGS and CXXFLAGS are now honored by submodule dependencies in more (but not all) places. Some dependencies still ignore them.
- There is now a make static-docker target which quickly builds a static vg binary in the checked-out source tree and dumps it into a Docker container
- The build system now has its own opinion about whether the current vg binary is statically linked, as recorded by a marker file in lib/.
- User support for vg now ought to happen on Biostars: https://www.biostars.org/t/vg/
- Crash bug in optimal_score_on_genome has been fixed
- Memory leaks in vg map have been plugged
New System Dependencies
Protobuf 3.0 or greater, including libraries and compiler.
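Before building, it can help to confirm that the system Protobuf compiler meets this requirement. The sketch below is one possible check, not part of the vg build system: version_at_least and protoc_version_from are hypothetical helper names, the comparison assumes a sort that supports -V, and only the compiler is inspected, not the installed libraries.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical names, not part of vg or Protobuf.

# Succeed if version $1 is at least version $2 (relies on `sort -V`).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Pull the version number out of a line such as "libprotoc 3.6.1".
protoc_version_from() {
    printf '%s\n' "${1##* }"
}

# Check the installed compiler, if there is one on PATH.
if command -v protoc >/dev/null 2>&1; then
    found="$(protoc_version_from "$(protoc --version)")"
    if version_at_least "$found" 3.0; then
        echo "protoc $found satisfies the Protobuf 3.0 requirement"
    else
        echo "protoc $found is older than Protobuf 3.0"
    fi
else
    echo "protoc not found on PATH"
fi
```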
New and Updated Submodules
The protobuf submodule has been removed.
The DYNAMIC, gbwt, gcsa2, gfakluge, libbdsg, libhandlegraph, libvgio, lru_cache, and sdsl-lite submodules have been updated.
The gbwtgraph, ips4o, mmmultimap, and xg submodules have been added.
Make sure to git submodule update --init --recursive if building from source.
vg 1.18.0 - Zungoli
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.18.0
Buildable Source Tarball: vg-v1.18.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- Completely removed support for gPBWTs stored in XG indexes. Must now use vg gbwt.
- vg mod -g now preserves embedded paths.
- vg call and vg snarls can now use any HandleGraph implementation, which can significantly reduce memory usage.
- Fixed a bug that made it impossible to make distance indexes using XG files.
- Output files are now produced in uncompressed form, but bgzipped files can still be read.
- Improved speed in experimental gaffe mapper through caching and altered alignment algorithms.
- Introduced serialization for GBWTGraphs that can replace the XG used by gaffe, significantly lowering memory usage.
- Transitioned to generic interface for XG to facilitate alternate backend data structures.
- Improved input checking in vg gamcompare.
- Fixed a bug that made invalid alignments in vg mpmap.
- Fixed a bug in path editing for VG graphs.
- Improved logic in vg deconstruct.
- Various build system improvements.
New System Dependencies
None
New and Updated Submodules
The FlameGraph submodule has been added.
The sglib submodule has been replaced by libbdsg.
The libvgio submodule has been updated.
Make sure to git submodule update --init --recursive if building from source.
vg 1.17.0 - Candida
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.17.0
Buildable Source Tarball: vg-v1.17.0.tar.gz
Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- vg annotate edited to search further for path hits.
- Single-ended Dijkstra algorithm factored out into its own HandleGraph algorithm.
- HandleGraphs can be loaded from files with dynamic implementation selection
- Fixed bugs in minimum distance clustering
- Fixed bugs in AlignmentEmitter
- vg gaffe improved speed and memory
- GBWT manipulation tools in vg gbwt
- vg mod -i deprecated, replaced by vg augment -i
- vg edit mostly refactored into augment.cpp, except for the interface with lists of paths.
- vg find -E extracts nodes with an id range defined by a path interval.
- vg convert allows graph type conversion between graph implementations.
- vg gbwt now considers unphased homozygous variants as phased to prevent phase breaks. This can be disabled with option -z.
- Improved vg compilation on MacOS with gcc 9
- pack -q allows weighting support by mapping quality.
- Bug fixed in split strand overlay
- Bug fixes with compilation errors using G++ 9.1.0.
- Bug fix in tcmalloc.
- Speed up vg filter
- vg call now supports pack graph format.
- xg moved into vg namespace.
- vg call can now call sites in parallel.
- Names added to transcripts written to the gbwt index. Transcript naming has been simplified.
- gam output option from gbwt is removed. Gam files can be generated from gbwt transcript index using vg paths.
- Small speed-up for merging packer objects in vg pack.
- Options added for adding either reference paths or non-reference paths as embedded paths in a graph.
- vg deconstruct now works with more input graphs.
- PackedGraph and HashGraph now moved to sglib.
- vg paths -F now writes embedded paths in FASTA format.
- Factored out path-to-component index from XG
- Bug fix in rewrite_segment.