Releases: twolinin/longphase

v1.7.3

25 Jun 02:04
bbb3cbb

Summary

  1. Fix the phase command calling fai_load() every time it fetches a reference sequence; the FASTA index is now loaded only once.
  2. The phase command no longer fetches reference sequences that are not covered by any SNP.

see detail changes

v1.7.2

15 May 06:34
e5271de

Summary

Haplotag now adds an @PG line to the header and provides a --cram option to write output in CRAM format.
Phase now combines different alignments of the same read into a single alignment.

see detail changes

v1.7.1

26 Apr 01:01
fd9ce10

Summary

If the INFO field of an SV variant does not contain RNAMES, phasing will not be performed for this variant.
Haplotag now includes documentation on tagging specific regions.
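
The RNAMES rule above can be illustrated with a minimal sketch that checks whether a VCF INFO string carries an RNAMES entry. The helper names `parse_info` and `has_rnames` are hypothetical, and treating an empty RNAMES value as missing is an assumption; this is not longphase's own parsing code.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    if info in ("", "."):
        return fields
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


def has_rnames(info):
    """Return True when INFO contains a non-empty RNAMES value (assumed rule)."""
    value = parse_info(info).get("RNAMES")
    return isinstance(value, str) and value != ""


# Made-up INFO strings for illustration only.
print(has_rnames("SVTYPE=DEL;SVLEN=-120;RNAMES=read1,read2"))
print(has_rnames("SVTYPE=DEL;SVLEN=-120"))
```

An SV record for which `has_rnames` returns False would correspond to a variant that is skipped during phasing.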

see detail changes

v1.7

14 Apr 14:08
5ae596c

Summary

Merge different alignments of a read to improve phasing integrity and adjust parameter weights to enhance phasing accuracy. Allow the use of phased modification VCF to increase the proportion of tagged reads. Address some known issues.

| phase (-t 24) | v1.6 SW | v1.6 #Block | v1.6 Block N50 | v1.7 SW | v1.7 #Block | v1.7 Block N50 |
|---|---|---|---|---|---|---|
| HG002 ONT R10.4.1 10x | 1,137 | 7,212 | 774,928 | 1,117 | 7,100 | 807,877 |
| HG002 ONT R10.4.1 20x | 1,225 | 4,257 | 1,560,226 | 1,218 | 4,132 | 1,654,808 |
| HG002 ONT R10.4.1 30x | 1,194 | 3,499 | 1,903,900 | 1,180 | 3,372 | 2,042,500 |
| HG002 ONT R10.4.1 40x | 1,211 | 3,045 | 2,177,620 | 1,200 | 2,915 | 2,332,195 |
| HG002 ONT R10.4.1 50x | 1,216 | 2,797 | 2,470,461 | 1,213 | 2,679 | 2,606,645 |
| HG002 ONT R10.4.1 60x | 1,197 | 2,627 | 2,587,166 | 1,195 | 2,513 | 2,830,210 |

SW: Switch Error
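
For readers unfamiliar with the Block N50 column, the following is a minimal sketch of the conventional N50 definition: the length of the block at which the cumulative length, taken from the longest block downward, first reaches half of the total. It assumes N50 is computed over the total phased length; the release notes do not state which denominator longphase's benchmark used, and the block lengths below are made up.

```python
def block_n50(block_lengths):
    """Return the N50 of a collection of phase-block lengths (in bp)."""
    if not block_lengths:
        raise ValueError("need at least one block")
    total = sum(block_lengths)
    running = 0
    # Walk from the longest block down until half of the total is covered.
    for length in sorted(block_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical block lengths, not taken from any benchmark.
print(block_n50([100, 200, 300, 400, 500]))
```

A larger N50 with a smaller block count, as in the v1.7 columns, indicates that more of the phased sequence sits in long blocks.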

see detail changes

v1.6

05 Jan 05:17

Summary

  1. Implement chromosome-level parallelization for the modcall and phase commands. The overall execution time is reduced by 71% to 88%.
  2. Replace malloc with jemalloc.
  3. Remove and simplify unused parameters to improve memory usage.
  4. Adjust the weighting of low-quality variants in phasing.
  5. The VCF generated by modcall can be directly imported into IGV. Additionally, modcall can output all detected coordinates by using the --all parameter.

see detail changes

| phase (-t 24) | v1.5.2 (Time) | v1.5.2 (Memory) | v1.6 (Time) | v1.6 (Memory) |
|---|---|---|---|---|
| HG002 ONT R10.4.1 10x | 153s | 7.7G | 39s | 15.1G |
| HG002 ONT R10.4.1 20x | 444s | 8.2G | 53s | 15.6G |
| HG002 ONT R10.4.1 30x | 355s | 8.5G | 68s | 24.4G |
| HG002 ONT R10.4.1 40x | 908s | 8.8G | 217s | 26.6G |
| HG002 ONT R10.4.1 50x | 1043s | 9.2G | 262s | 22.2G |
| HG002 ONT R10.4.1 60x | 640s | 9.5G | 113s | 33.4G |

| modcall (-t 24) | v1.5.2 (Time) | v1.5.2 (Memory) | v1.6 (Time) | v1.6 (Memory) |
|---|---|---|---|---|
| HG002 ONT R10.4.1 10x | 322s | 11.0G | 93s | 22.2G |
| HG002 ONT R10.4.1 20x | 635s | 14.6G | 199s | 31.6G |
| HG002 ONT R10.4.1 30x | 746s | 18.2G | 125s | 48.1G |
| HG002 ONT R10.4.1 40x | 1308s | 21.5G | 292s | 55.8G |
| HG002 ONT R10.4.1 50x | 1570s | 25.0G | 317s | 68.8G |
| HG002 ONT R10.4.1 60x | 1454s | 28.4G | 248s | 84.0G |

*If the device is running low on memory, you can control memory usage by reducing the number of threads (-t).
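
The per-row time reduction behind the summary figure can be recomputed from the timings listed above. This is a small arithmetic sketch; the helper names `reduction_percent` and `summarize` are illustrative, and the only data used are the v1.5.2 and v1.6 times from the two tables.

```python
# (coverage, v1.5.2 seconds, v1.6 seconds), copied from the tables above.
PHASE = [("10x", 153, 39), ("20x", 444, 53), ("30x", 355, 68),
         ("40x", 908, 217), ("50x", 1043, 262), ("60x", 640, 113)]
MODCALL = [("10x", 322, 93), ("20x", 635, 199), ("30x", 746, 125),
           ("40x", 1308, 292), ("50x", 1570, 317), ("60x", 1454, 248)]


def reduction_percent(old_seconds, new_seconds):
    """Percentage of the old running time that was saved."""
    return 100.0 * (old_seconds - new_seconds) / old_seconds


def summarize(rows):
    """Map each coverage label to its time reduction, rounded to 0.1%."""
    return {cov: round(reduction_percent(old, new), 1) for cov, old, new in rows}


for name, rows in (("phase", PHASE), ("modcall", MODCALL)):
    for coverage, percent in summarize(rows).items():
        print(f"{name} {coverage}: {percent}% faster")
```

The same pattern applies to the memory columns if the trade-off between speed and memory at a given thread count needs to be quantified.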

v1.5.2

22 Nov 13:31

This version improves phasing accuracy by refining the graph boundary range and reduces memory usage by 20% to 40% through the elimination of redundant code.

v1.5.1

03 Aug 06:45
ad2f67c

Release note of v1.5.1
There are two major updates in this release.

  1. Multi-BAM input is now supported. The user can specify multiple BAM files by repeating the -b option:
longphase phase \
-s SNP.vcf \
-b alignment1.bam \
-b alignment2.bam \
-r reference.fasta \
-t 8 \
-o phased_prefix \
--ont # or --pb for PacBio HiFi
  2. A beta version of SNP and modification co-phasing is now released. Feedback is very welcome. The user can detect allele-specific modifications (5mC only at this moment) with the new modcall command, assuming the MM/ML tags are carried over to the aligned BAM.
longphase modcall \
-b alignment.bam \
-r reference.fasta \
-t 8 \
-o modcall

The SNP and modification co-phasing can then be invoked by providing the modcall-generated VCF. Co-phasing SNPs and modifications can further improve the phasing contiguity.

longphase phase \
-s SNP.vcf \
--mod-file modcall.vcf \
-b alignment.bam \
-r reference.fasta \
-t 8 \
-o phased_prefix \
--ont # or --pb for PacBio HiFi
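
When the phase command is driven from a pipeline, the argument list shown above can be assembled programmatically. The sketch below only builds the argument vector and does not run longphase; the function name `build_phase_command` and its parameters are hypothetical, and it uses only the options that appear in these release notes (-s, -b, -r, -t, -o, --mod-file, --ont, --pb).

```python
import shlex


def build_phase_command(snp_vcf, bams, reference, prefix,
                        threads=8, platform="ont", mod_vcf=None):
    """Build an argv list for `longphase phase` with one -b per BAM file."""
    if platform not in ("ont", "pb"):
        raise ValueError("platform must be 'ont' or 'pb'")
    if not bams:
        raise ValueError("at least one BAM file is required")
    cmd = ["longphase", "phase", "-s", snp_vcf]
    if mod_vcf is not None:
        # Optional modcall-generated VCF for SNP and modification co-phasing.
        cmd += ["--mod-file", mod_vcf]
    for bam in bams:
        cmd += ["-b", bam]
    cmd += ["-r", reference, "-t", str(threads), "-o", prefix, f"--{platform}"]
    return cmd


cmd = build_phase_command("SNP.vcf", ["alignment1.bam", "alignment2.bam"],
                          "reference.fasta", "phased_prefix")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where longphase is installed.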

v1.5

24 May 05:47

This release includes two main features and several issue fixes.

  1. Longphase now officially supports phasing Nanopore R10.4 data with some fine-tuning. This version also works for previous R9.4 data.
  2. Co-phasing of SNPs and small indels can now be enabled by '--indels'. See Readme.
  3. Resolve issues #8, #26, #27.

v1.4

11 Nov 08:28

Major change

This version improves phasing accuracy, measured by both switch error and Hamming distance. The underlying graph improves phasing integrity by using multiple edges instead of second-stage block phasing. Haplotag reads are used in the implementation to improve phasing accuracy.

The phasing accuracy is improved under ONT R9.4.1 with different sequencing depths.
13_vs_14
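
The two accuracy metrics named above can be sketched for a single phase block as follows. This uses the common textbook definitions on strings of 0/1 alleles for one haplotype at heterozygous sites; it is an illustration under that assumption, not the evaluation code used for these benchmarks, and the example haplotypes are made up.

```python
def switch_errors(truth, pred):
    """Count adjacent variant pairs whose phase relation differs from truth."""
    if len(truth) != len(pred):
        raise ValueError("haplotypes must cover the same variants")
    # agree[i] is True when the predicted allele matches truth at variant i.
    agree = [t == p for t, p in zip(truth, pred)]
    return sum(1 for a, b in zip(agree, agree[1:]) if a != b)


def hamming_distance(truth, pred):
    """Mismatches against truth, taking the better of the two orientations."""
    if len(truth) != len(pred):
        raise ValueError("haplotypes must cover the same variants")
    mismatches = sum(t != p for t, p in zip(truth, pred))
    return min(mismatches, len(truth) - mismatches)


truth = "00000000"
one_switch = "00011111"   # phase flips once and stays flipped
one_flip = "00100000"     # a single variant is assigned to the wrong haplotype
print(switch_errors(truth, one_switch), hamming_distance(truth, one_switch))
print(switch_errors(truth, one_flip), hamming_distance(truth, one_flip))
```

The example shows why both metrics are reported: a single long switch yields few switch errors but a large Hamming distance, while an isolated flip does the opposite.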

v1.3

25 Aug 15:54

Major change

This version mainly improved the phasing accuracy, especially at low-coverage depth (10-20x). The underlying graph model now creates and considers multiple edges from local/flanking SNPs during phasing (see below), which increases the phasing accuracy and robustness at regions of dense ONT errors.
image

As a result, the running time increases by ~20% on average (2.5-8 minutes for 10-60x with 8 cores on SSD). The block N50 becomes larger at 10-30x and slightly smaller at 50-60x when compared with the previous version. The phasing accuracy (SW: switch error rate) improved at all coverages.
image

Minor change

  1. Haplotag now writes the Phred-scaled phasing quality of each read in the tagged BAM (e.g., PQ:i:40), which was discussed in #19.

image

  2. The program now prompts the user if the reference genome is missing. #18
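
To interpret the PQ tag mentioned above, the standard Phred relation Q = -10 * log10(p) can be applied, where p is the probability that the assignment is wrong. The sketch below shows only this generic conversion; how longphase derives the underlying probability for each read is not described in these notes, and the function names are illustrative.

```python
import math


def phred_to_error_prob(quality):
    """Convert a Phred-scaled quality to an error probability."""
    return 10 ** (-quality / 10)


def error_prob_to_phred(prob):
    """Convert an error probability (0 < prob <= 1) to an integer Phred score."""
    if not 0 < prob <= 1:
        raise ValueError("probability must be in (0, 1]")
    return round(-10 * math.log10(prob))


# PQ:i:40 from the example above.
print(phred_to_error_prob(40))
print(error_prob_to_phred(0.001))
```

Under this convention, every increase of 10 in PQ corresponds to a tenfold drop in the estimated error probability.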