Releases: ythuang0522/homopolish
Release v0.4.1
Code refactoring and performance improvement.
Release v0.4
We introduced two major updates in v0.4.
- Expansion of bacterial genomes from RefSeq (~180K) to NCBI (~1.19M). We encourage users to update to the larger database, as it further improves the polishing accuracy of both Homopolish and Modpolish: more related strains can be found among the 1.19 million genomes. Because the sketch size increased from 720Mb to 3.3Gb, more RAM is needed. Note that the v0.4 source code is tightly bound to the larger database. Please don't use an older version of the code (v0.3.x or earlier) with the larger database. If you want to stay with an earlier version, please use the old bacteria sketch instead.
- Introduction of Modpolish. Modpolish aims to correct mismatch errors caused by novel modifications that the ONT basecalling models were not trained on. We observed that some bacteria produce an unexpectedly large number of mismatch errors due to such modifications, and these errors are corrected neither by Medaka nor by Homopolish. Below are polishing results from two datasets: one contains 12 in-house Listeria strains and the other is the Zymo Microbial Community.
Mismatches of 12 Listeria strains
Release v0.3.3
Output the unpolished contigs when the number of related genomes is insufficient.
Release v0.3.2
Fix a NaN issue that reappeared after merging branches.
Fixing a crash on short contigs/plasmids
The v0.3 version crashed on short contigs/plasmids (<~6kbp) because FastANI failed to generate output. The fixed version (v0.3.1) continues polishing using the Mash-selected genomes instead. Users are advised to update to this version.
Release v0.3
This version further improves accuracy for Nanopore sequencing by recalibrating distances with FastANI after Mash screening. We observed that higher-quality Nanopore (Guppy v3.4 and later) and PacBio CLR data benefit from better distance estimation. As a result, the new release requires rebuilding your conda env, as FastANI is now mandatory.
- Improved genome quality for Nanopore R9.4 and R10.3
- Polishing of PacBio (CLR) genomes is now supported (-m pb.pkl)
Release v0.2
- Fixed a NaN bug caused by adjacent insertions and deletions.
- Changed from urllib to pycurl for retrieving NCBI genomes. As a result, the old conda env has to be rebuilt or updated.
- A local sequence database is now supported via the new argument '-l dbpath'. With this argument, Homopolish searches your own local sequence database instead of the NCBI microbiome database.
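As a rough illustration of the '-l dbpath' option, the following Python sketch assembles a command line that polishes against a local database. Only '-l' (and '-m' for the model) appear in these release notes; the `homopolish.py polish` entry point, the '-a' and '-o' flags, and all file names are assumptions taken from typical usage of the tool and should be checked against the README of the version you have installed. The helper name `build_local_db_command` is hypothetical.

```python
import shlex


def build_local_db_command(assembly, local_db, model, output_dir):
    """Build an argument list for polishing against a local sequence database.

    Assumptions (not stated in these release notes): the script is invoked as
    'python3 homopolish.py polish', '-a' names the input assembly and '-o' the
    output directory. '-l' and '-m' are the flags mentioned in the notes.
    """
    return [
        "python3", "homopolish.py", "polish",
        "-a", assembly,    # assumed flag: assembly to polish
        "-l", local_db,    # local database, replaces the NCBI lookup
        "-m", model,       # trained model file
        "-o", output_dir,  # assumed flag: output directory
    ]


# Hypothetical file names, for illustration only.
cmd = build_local_db_command(
    "assembly.fasta", "my_genomes.fasta", "R9.4.pkl", "polished_out"
)
# Print a copy-pasteable shell command; pass 'cmd' to subprocess.run to execute it.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems when paths contain spaces.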
Release v0.1
Release of the version used in the manuscript. The current version works well for polishing genomes of high assembly contiguity (N50>1 Mbp). It should work well with both old and new basecallers (from Guppy 2.2 or earlier up to Guppy 3.6). The recommended pipeline is Racon+Medaka+Homopolish, though MarginPolish should also work well with Homopolish. It cannot polish highly fragmented MAGs yet.
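To check whether an assembly meets the contiguity guideline above (N50>1 Mbp), the N50 can be computed from the contig lengths. The sketch below is not part of Homopolish; the function names and the example lengths are made up for illustration, and it uses the usual N50 definition: the contig length at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly size.

```python
def fasta_contig_lengths(lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A header starts a new contig; store the previous one first.
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Made-up example: one chromosome-scale contig plus two small ones.
example_lengths = [4_800_000, 90_000, 6_000]
meets_guideline = n50(example_lengths) > 1_000_000
```

For a real assembly, the lengths could be obtained with `fasta_contig_lengths(open("assembly.fasta"))`, where the file name is a placeholder.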