Releases: ythuang0522/homopolish

Release v0.4.1

23 Nov 16:32
7d47b28

Code refactoring and performance improvement.

Release v0.4

19 Aug 09:06
432de13

We introduced two major updates in v0.4.

  1. Expansion of the bacterial genome database from RefSeq (~180K) to NCBI (~1.19M). We encourage users to update to the larger database, as it further improves the polishing accuracy of both Homopolish and Modpolish: more related strains can be found among the 1.19 million genomes. Because the sketch size increased from 720 MB to 3.3 GB, more RAM is needed. Note that the v0.4 source code is tightly bound to the larger database. Please don't use older versions of the code (v0.3.x or earlier) with the larger database. If you want to stay with an earlier version, please use the old bacteria sketch instead.
  2. Introduction of Modpolish. Modpolish aims to correct mismatch errors caused by novel modifications that the ONT basecalling models were not trained on. We observed that some bacteria produce an unexpectedly large number of mismatch errors due to such modifications, which are corrected by neither Medaka nor Homopolish. Below are polishing results from two datasets: one contains 12 in-house Listeria strains and the other is the Zymo Microbial Community.

Figure: Mismatches of 12 Listeria strains (image not included)

Figure: Q scores of 12 Listeria strains (image not included)

Figure: Mismatches of 8 Zymo bacteria (image not included)

Figure: Q scores of 8 Zymo bacteria (image not included)
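
The pairing rule from item 1 of the v0.4 notes (v0.4 and later with the larger database, v0.3.x and earlier with the old sketch) can be written down as a small pre-flight check. This is only a sketch: the labels `"large"` and `"old"` and the function names are hypothetical and are not part of Homopolish itself.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn a release tag such as 'v0.3.1' into a comparable tuple (0, 3, 1)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def required_sketch(code_version: str) -> str:
    """Return which bacteria sketch a code version should be paired with.

    'large' stands for the ~1.19M-genome NCBI sketch and 'old' for the
    ~180K-genome RefSeq sketch. Both labels are hypothetical.
    """
    return "large" if parse_version(code_version) >= (0, 4) else "old"


def check_pairing(code_version: str, sketch: str) -> None:
    """Raise ValueError when the code version and sketch do not match."""
    expected = required_sketch(code_version)
    if sketch != expected:
        raise ValueError(
            f"{code_version} should be used with the {expected} sketch, got {sketch}"
        )
```

Calling `check_pairing("v0.3.3", "large")` raises, while `check_pairing("v0.4.1", "large")` passes silently.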

Release v0.3.3

14 Sep 03:27

Output the unpolished contigs when the number of related genomes is insufficient.
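
The behaviour described above amounts to a pass-through guard. The sketch below illustrates the idea only; the threshold `min_genomes`, its default of 5, and the `polish` callback are hypothetical and are not taken from the Homopolish source.

```python
from typing import Callable, List


def polish_or_passthrough(
    contig: str,
    related_genomes: List[str],
    polish: Callable[[str, List[str]], str],
    min_genomes: int = 5,  # hypothetical threshold, not the tool's real value
) -> str:
    """Return the contig unchanged when too few related genomes were found."""
    if len(related_genomes) < min_genomes:
        # Not enough evidence to polish safely: emit the input as-is.
        return contig
    return polish(contig, related_genomes)
```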

Release v0.3.2

06 Sep 02:17

Fix a NaN issue that reappeared after merging branches.

Fixing crashing issue of short contigs/plasmids

22 Jul 02:05
37514db

Version v0.3 crashed on short contigs/plasmids (<~6 kbp) because FastANI failed to generate output for them. The fixed version (v0.3.1) continues polishing with the Mash-selected genomes in that case. Users are advised to update to this version.
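
The fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the function name, the data shapes (a ranked list of Mash hits and a dict of FastANI values), and the `top_n` default are all hypothetical.

```python
from typing import Dict, List, Optional


def select_related_genomes(
    mash_hits: List[str],
    fastani_ani: Optional[Dict[str, float]],
    top_n: int = 20,  # hypothetical cut-off
) -> List[str]:
    """Pick genomes for polishing, preferring a FastANI-based ranking.

    mash_hits:   genome IDs from Mash screening, best hit first.
    fastani_ani: genome ID -> ANI value, or None/empty when FastANI
                 produced no output (for example on very short contigs).
    """
    if not fastani_ani:
        # Fallback: keep the Mash ordering instead of failing.
        return mash_hits[:top_n]
    # Re-rank the Mash hits that FastANI scored, highest ANI first.
    scored = [g for g in mash_hits if g in fastani_ani]
    scored.sort(key=lambda g: fastani_ani[g], reverse=True)
    return scored[:top_n]
```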

Release v0.3

16 Jul 09:06
04df301

This version further improves accuracy on Nanopore data by recalibrating distances with FastANI after Mash screening. We observed that higher-quality Nanopore (Guppy v3.4 and later) and PacBio CLR data benefit from better distance estimation. Because FastANI is now mandatory, the new release requires rebuilding your conda env.

  1. Improved genome quality for Nanopore R9.4 and R10.3.
  2. Polishing of PacBio (CLR) genomes is now supported (-m pb.pkl).
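
As a rough illustration of selecting the PacBio model, the helper below assembles a command line that passes `-m pb.pkl`. Only `-m pb.pkl` comes from these notes. The `homopolish.py polish` entry point and the `-a`, `-s`, and `-o` flags are assumptions based on the project's README usage line and should be verified against the tool's help output. The file names are placeholders.

```python
import shlex
from typing import List


def build_polish_command(assembly: str, sketch: str, model: str, out_dir: str) -> List[str]:
    """Assemble an argument list for a polishing run (flags other than -m are assumed)."""
    return [
        "python3", "homopolish.py", "polish",
        "-a", assembly,   # assumed: input assembly FASTA
        "-s", sketch,     # assumed: Mash sketch of bacterial genomes
        "-m", model,      # model file, e.g. pb.pkl for PacBio CLR
        "-o", out_dir,    # assumed: output directory
    ]


cmd = build_polish_command("yourgenome.fasta", "bacteria.msh", "pb.pkl", "youroutput")
print(shlex.join(cmd))
```

Building the command as a list keeps it safe to hand to `subprocess.run(cmd, check=True)` without shell quoting issues.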

Release v0.2

20 Apr 07:50
5f86647
  1. Fixed a NaN bug caused by adjacent insertions and deletions.
  2. Switched from urllib to pycurl for retrieving NCBI genomes. As a result, the old conda env has to be rebuilt or updated.
  3. A local sequence database is now supported via the new argument '-l dbpath'. With this argument, Homopolish searches your own local sequence database instead of the NCBI microbiome database.
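
Since '-l dbpath' is used instead of the NCBI database, a wrapper script may want to enforce that exactly one reference source is chosen. The sketch below shows one way to do that. The `-s` flag for the Mash sketch is an assumption based on the README and is not stated in these notes, and the helper name is hypothetical.

```python
from typing import List, Optional


def reference_args(sketch: Optional[str] = None, local_db: Optional[str] = None) -> List[str]:
    """Return the reference-source arguments, requiring exactly one source."""
    if (sketch is None) == (local_db is None):
        raise ValueError("pass exactly one of sketch or local_db")
    if local_db is not None:
        return ["-l", local_db]  # local sequence database, from the v0.2 notes
    return ["-s", sketch]        # assumed flag for the NCBI-derived Mash sketch
```

For example, `reference_args(local_db="my_genomes.fasta")` yields `["-l", "my_genomes.fasta"]`, which can be appended to the rest of the command line.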

Release v0.1

02 Dec 09:56
5ddd65c

Release of the version used in the manuscript. The current version works well for polishing genomes of high assembly contiguity (N50 > 1 Mbp). It should work well with both older and newer basecallers (Guppy 2.2 or earlier up to Guppy 3.6). The recommended pipeline is Racon+Medaka+Homopolish, though MarginPolish should also work well with Homopolish. It cannot polish highly fragmented MAGs yet.
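
To check whether an assembly meets the N50 > 1 Mbp guideline above, N50 can be computed from the contig lengths: it is the length of the shortest contig among the longest contigs that together cover at least half of the total assembly length. The helper below is a minimal sketch and is not part of Homopolish. The example lengths are made up.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig where the cumulative length reaches half the total.
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


contig_lengths = [1_500_000, 500_000, 200_000]  # made-up example
print(n50(contig_lengths) > 1_000_000)
```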