Releases: ComparativeGenomicsToolkit/cactus

Cactus 2.6.5 2023-07-27

27 Jul 15:19
cd80c48

Cactus 2.6.5 is available in the following forms:

WARNING: Do not use the source files that GitHub generates automatically (Source code (zip) or Source code (tar.gz)); they are not correct.

The Docker images and binaries linked above are built using AVX2 extensions, and require a CPU that supports them, except the "Pre-compiled Binaries For Older CPU Architectures" which should be compatible with any 64-bit architecture (and, since version 2.3.1, support Cactus's pangenome pipeline).
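
A quick way to check for AVX2 before choosing a binary is sketched below. It assumes an x86 Linux host where `/proc/cpuinfo` exposes a `flags` line; the helper name `has_avx2` is made up for illustration and is not part of Cactus.

```python
def has_avx2(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Compare whole tokens so that e.g. "avx" alone does not count as "avx2".
        if key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print("AVX2 supported:", has_avx2(handle.read()))
    except OSError:
        # Assumption: non-Linux hosts have no /proc/cpuinfo.
        print("Could not read /proc/cpuinfo; check CPU features another way.")
```

If this reports no AVX2 support, the "Pre-compiled Binaries For Older CPU Architectures" are the ones to try.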

Please subscribe to the low-volume cactus-announce mailing list to receive notices of Cactus releases.

Release Notes

This release patches a Toil bug that broke GPU support on single-machine batch systems.

  • Update to Toil v5.12, which fixes an issue where trying to use GPUs on single-machine batch systems would lead to a crash
  • Make cactus more robust to numeric and duplicate internal node labels on the input tree (i.e., ignore them instead of crashing with a cryptic scheduling error)
  • Fix hal2chains --targetGenomes option.
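
For older versions that still crash on such trees, the input can be cleaned beforehand. The following is a minimal sketch, not Cactus's own implementation: it assumes an unquoted Newick string, treats any label that directly follows a `)` as an internal label, and drops labels that are purely numeric or that appear more than once. The function name `drop_bad_internal_labels` is hypothetical.

```python
import re
from collections import Counter

# A label directly after ')' names an internal node (or the root).
INTERNAL_LABEL = re.compile(r"\)([^:,;()\s]+)")
# Plain integers or decimals, e.g. bootstrap values such as 100 or 0.95.
NUMERIC = re.compile(r"[0-9]+(\.[0-9]+)?")


def drop_bad_internal_labels(newick: str) -> str:
    """Remove internal labels that are numeric or duplicated; keep the rest."""
    counts = Counter(INTERNAL_LABEL.findall(newick))

    def replace(match: re.Match) -> str:
        label = match.group(1)
        if NUMERIC.fullmatch(label) or counts[label] > 1:
            return ")"  # keep the parenthesis, drop the label
        return match.group(0)

    return INTERNAL_LABEL.sub(replace, newick)


tree = "((a:1,b:1)100:1,(c:1,d:1)anc:1,(e:1,f:1)anc:1)root;"
print(drop_bad_internal_labels(tree))
```

Leaf names and branch lengths are left untouched; only the labels after closing parentheses are examined.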

Cactus 2.6.4 2023-07-02

02 Jul 16:29
55a1d71

Release Notes

This is another minor patch release to fix support for multiple values to --reference.

  • Don't fail if a given reference contig contains no sequences from the 2nd reference. This issue prevented completion of HPRC graphs with --reference GRCh38 CHM13 because CHM13 wasn't present in the non-chromosome components.
  • Fix --dupeMode consensus to output MAF with rows sorted (and, importantly, leaving the first row as reference).

Cactus 2.6.3 2023-06-29

29 Jun 15:50
c891c70

Release Notes

This release contains a single, minor patch that only applies when passing multiple values to --reference.

  • Pangenome stub filtering changed to apply only to the first genome passed via --reference in order to be consistent with gap filtering (and enforce maximum of two stubs per graph component).

Cactus 2.6.2 2023-06-28

28 Jun 18:47
4656708

Release Notes

This release patches a few bugs introduced or found in v2.6.1.

  • Docker container fixed to include runtime libatomic dependency for odgi
  • --refContigs option fixed in cactus-graphmap-split
  • cactus-pangenome fixed to properly output intermediate GAF and unfiltered PAF alignment files

Cactus 2.6.1 2023-06-27

27 Jun 16:55
1d98698

Release Notes

This Release adds SLURM cluster support for Cactus (both progressive and pangenome). It also adds some new visualization features to the pangenome pipeline, along with several bugfixes.

  • SLURM support added (via Toil v5.11). It's important to read the documentation about --consMemory and --indexMemory when running large alignments.
  • ODGI now integrated into Cactus. See cactus-pangenome options --viz, --odgi, --chrom-og and --draw for incorporating it into your output.
  • Add --chop option to cactus-pangenome to make sure all output graphs have nodes chopped down to at most 1024bp. By default, the .gfa.gz is unchopped and the .gbz is chopped, which can lead to annoying confusion when trying to compare node IDs across different files.
  • Fix bug where _MINIGRAPH_ paths ended up in the .full output graphs.
  • vg clip crash fixed.
  • mash distance for minigraph order now computed by sample (and not by haplotype). Fixes issue where, for example, diploid ordering would be dependent on whether assembly has chrX.
  • If --refContigs is not specified, minigraph-cactus now uses naming in addition to size to try to guess reference contigs.
  • --dupeMode consensus option added to cactus-hal2maf in order to use maf_stream to merge multiple rows into consensus rows, which may be the best compromise to get the data into PhyloP or the Browser.
  • halReplaceGenome patched to fix a regression from late 2022 where updating large alignments could lead to a crash.
  • Fix cactus-update-prepare replace to print the halRemoveGenome command rather than quietly running it on the input.
  • --consMemory and --indexMemory options bugs fixed.
  • Fix bug with multiple --reference samples; they are now all promoted properly to REFERENCE-sense paths.
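
To see why node IDs stop lining up between chopped and unchopped graphs, consider the toy sketch below. It only splits one node's sequence into pieces of at most 1024bp; how vg actually renumbers nodes is not shown, and the function name `chop` is an illustrative assumption rather than a Cactus or vg API.

```python
from typing import List


def chop(sequence: str, max_len: int = 1024) -> List[str]:
    """Split one node's sequence into consecutive pieces of at most max_len bases."""
    return [sequence[i:i + max_len] for i in range(0, len(sequence), max_len)]


# One 2500bp node in an unchopped graph becomes three nodes after chopping,
# so every node that follows it needs a different ID in the chopped graph.
pieces = chop("A" * 2500)
print([len(piece) for piece in pieces])
```

Passing --chop avoids the mismatch by making every output graph use the chopped node space.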

Cactus 2.6.0 2023-06-09

09 Jun 15:00
ce2bd97

Release Notes

This Release significantly updates Cactus's chaining logic which, in early tests at least, allows alignment of T2T-quality assemblies as well as WindowMasked (and not RepeatMasked) genomes.

  • Improved chaining of lastz's PAF output in order to support alignment of T2T-quality genomes
  • Minimum chain length in the Cactus graph now determined by branch length, so that more closely-related genomes can be chained more aggressively while not losing sensitivity along more distant branches where the alignment is more fragmented.
  • Early experiments show that the above changes make Cactus much less sensitive to the input repeat masking. Genomes that previously required masking with RepeatMasker were able to align with the WindowMasking-based fastas directly from NCBI.
  • Update to Toil 5.10.0
  • Update to latest Taffy
  • Specify memory requirements for all Toil jobs (in Progressive Cactus). Cactus Consolidated memory is estimated conservatively, but can be overridden with --consMemory.

Cactus 2.5.2 2023-05-15

15 May 20:13
a33d3ea

Release Notes

This Release patches some bugs in the pangenome pipeline and makes it a bit more user-friendly.

  • Fix support for multiple references via --reference and --vcfReference
  • Fix bug where certain combinations of options (i.e., returning filtered but not clipped index) could lead to a crash
  • Fix crash when handling non-ascii characters in vg crash reports
  • Fix the --chrom-vg option in cactus-pangenome
  • New option --mgCores to specify number of cores for minigraph construction (rather than lumping in with --mapCores which is also used for mapping)
  • Better defaults for number of cores used in pangenome pipeline on single-machine.
  • Fix bug where small contigs in the reference sample could lead to crashes if they couldn't map to themselves (and --refContigs was not used to specify chromosomes). --refContigs is now automatically set if not specified.
  • Update to vg 1.48.0
  • Update pangenome paper citation from preprint to published version.

Cactus 2.5.1 2023-04-19

19 Apr 20:24
4823037

Release Notes

This Release mostly patches some bugs in the pangenome pipeline.

  • cactus-pangenome now saves PAF file in the output directory
  • Ship version of vg that is patched to not make too-slow giraffe indexes for some complex graphs
  • Fix bug where . characters in reference sample name could lead to strange errors at end of pipeline
  • Better sample name validation for all pangenome tools to prevent confusion around . characters.
  • Update taffy to fix issues where --filterGapCausingDupes could lead to crashes in cactus-hal2maf
  • Strip defaults of taffy commands from being specified in cactus-hal2maf -- they are now taken from taffy
  • Add MHC and GRCh38-alt pangenome examples

Cactus 2.5.0 2023-04-03

03 Apr 22:39
0e334db

Release Notes

This Release greatly simplifies the interface for building pangenomes.

  • Introduction of cactus-pangenome command that, like cactus for the progressive aligner, runs the whole pipeline in one shot. Its inputs are a list of fasta sequences and sample names and it outputs the pangenome graph and various indexes, vcf, etc. Intermediate outputs are exported at the end of each stage, so low-level commands can be used to repeat or continue the workflow.
  • # characters in fasta contig names no longer need to be cleaned out with special invocation of cactus-preprocess to avoid conflicts with vg indexes. This now happens automatically within the pipeline.
  • The lower level pangenome commands are all still supported and documented. But cactus-align-batch is now deprecated. This means it is removed from the documentation (except when describing older results) and from continuous integration tests, and it will give a warning when used. Users should now just run cactus-align instead of cactus-align-batch where applicable, using the former's --batch option. cactus-align-batch was a hack required to scale up before the cactus v2.0 rewrite but used nested Toil workflows and needed to go.
  • Documentation updated to focus on the simpler interface
  • GFAffix updated to fix a rare crash
  • cactus-graphmap-join (and cactus-pangenome) will not fail in the event of a VCF indexing error (e.g., from chromosomes that are too long for tbi). Instead it will give a warning and produce no index.
  • New cactus-hal2maf option --keepGapCausingDupes is changed to --filterGapCausingDupes and turned off by default. The underlying code has bugs that cause problems on certain datasets, and is not ready to be activated by default.
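
The tbi indexing failure mentioned above can be anticipated from contig lengths alone. The sketch below assumes the commonly documented tbi coordinate limit of 2^29 bp (about 512 Mbp); the exact boundary behaviour may differ between htslib versions, and the function and input dictionary are hypothetical, not part of Cactus.

```python
from typing import Dict, List

# Assumption: tbi indexes cannot address coordinates beyond 2^29.
TBI_MAX_CONTIG_LENGTH = 2 ** 29


def contigs_too_long_for_tbi(contig_lengths: Dict[str, int]) -> List[str]:
    """Return the names of contigs whose length exceeds the assumed tbi limit."""
    return sorted(
        name
        for name, length in contig_lengths.items()
        if length > TBI_MAX_CONTIG_LENGTH
    )


# Illustrative lengths only; "big_chr" stands in for a very long chromosome.
lengths = {"chr1": 248_956_422, "big_chr": 600_000_000}
print(contigs_too_long_for_tbi(lengths))
```

A non-empty result suggests that the run will emit the warning and skip the index rather than produce one.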

Cactus 2.4.4 2023-03-16

16 Mar 20:49
985e2c5

Release Notes

This release includes some new export tools for the UCSC Genome Browser.

  • cactus-hal2chains created in order to convert HAL output from Cactus into sets of pairwise alignment chains, using either halLiftover or halSynteny
  • cactus-maf2bigmaf created to convert .maf output from cactus-hal2maf to BigMaf and BigMaf Summary files for display on the Genome Browser
  • cactus-hal2maf typo fixed where 3 (instead of 30) was set for the default value of --maximumGapLength
  • Boost TAFFY normalization defaults in cactus-hal2maf, bringing --maximumGapLength to 100 and --maximumBlockLengthToMerge to 1000, and adding the heuristic block-breaking dupe filter from taffy norm. The latter is on by default to prevent over-fragmentation, but can be disabled with --keepGapCausingDupes
  • Remove --onlyOrthologs and --noDupes options from cactus-hal2maf and replace with the --dupeMode option. --dupeMode single is now the recommended way of getting at most one row / species. More information about this added to the documentation.
  • --maxRefNFrac option added to cactus-hal2maf to filter out blocks where the reference sequence is mostly Ns (default to filter out >95% Ns).
  • Change abPOA scoring matrix to be more consistent with lastz parameters used by cactus, where N bases are penalized when aligned with other characters. Before, they could be aligned to anything. This will hopefully make the above filter less necessary.
  • Fix bug where cactus-blast --restart would not work.
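
The idea behind --maxRefNFrac can be sketched as follows. This is an illustration under stated assumptions, not the cactus-hal2maf code: it assumes the N fraction is measured over the non-gap characters of the reference row, and the helper names `ref_n_fraction` and `keep_block` are invented for the example.

```python
def ref_n_fraction(ref_row: str) -> float:
    """Fraction of non-gap reference characters that are N (case-insensitive)."""
    bases = [char for char in ref_row if char != "-"]
    if not bases:
        return 0.0
    return sum(char in "Nn" for char in bases) / len(bases)


def keep_block(ref_row: str, max_ref_n_frac: float = 0.95) -> bool:
    """Keep a block unless its reference row is more than max_ref_n_frac Ns."""
    return ref_n_fraction(ref_row) <= max_ref_n_frac


print(keep_block("ACGTNNNN"))         # half Ns: kept
print(keep_block("N" * 96 + "ACGT"))  # 96% Ns: filtered out
```

Whether gap columns count toward the denominator in the real tool is not stated in these notes, so treat that detail as a guess.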