Release 0.5.3 · akikuno/DAJIN2

💥 Breaking

Update clustering.clustering: Use constrained k-means clustering to fix a cluster imbalance in which extremely minor clusters were preferentially split off. min_cluster_size is set to 0.5% of the sample's read count. [Commit Detail]
- As a result, clustering.label_merger.py is no longer needed and has been removed.
Update consensus.call_consensus: Mutations judged to be sequencing errors were previously replaced with unknown (N), which was hard to interpret. Mutations that DAJIN2 determines to be sequencing errors are now assigned the same base as the reference genome. [Commit Detail]
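
The two changes above can be illustrated with a minimal sketch. This is not DAJIN2's implementation: the function names, the floor of one read, the integer rounding, and the assumption that consensus and reference are already aligned to equal length are all hypothetical choices made for illustration.

```python
def min_cluster_size(read_count: int) -> int:
    """Hypothetical helper: smallest allowed cluster, 0.5% of the reads.

    0.5% equals 1/200, so integer division avoids float rounding.
    The floor of 1 read is an assumption, not taken from DAJIN2.
    """
    return max(1, read_count // 200)


def replace_sequence_errors(
    consensus: str, reference: str, error_positions: set[int]
) -> str:
    """Hypothetical helper: give flagged positions the reference base.

    Positions listed in error_positions are treated as sequencing errors
    and take the reference base (where 'N' was written before 0.5.3).
    All other positions keep the consensus base, so real mutations survive.
    Assumes both strings are aligned and of equal length.
    """
    return "".join(
        ref if i in error_positions else base
        for i, (base, ref) in enumerate(zip(consensus, reference))
    )


if __name__ == "__main__":
    print(min_cluster_size(1000))
    print(replace_sequence_errors("AGGTA", "ACGTT", {4}))
```

In the second call, position 1 (C to G) is not flagged and is kept as a mutation, while position 4 is flagged as an error and reverts to the reference base.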

Remove the division by sequence length in classifier.calc_match, which biased classification toward alleles with shorter sequences. [Commit Detail]
Fix preprocess.mapping.generate_sam to perform alignments with map-ont and splice in addition to sr for sequence lengths of 500 bp or less, and select the optimal preset from these alignments. Issue: #45 [Commit Detail]
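
The length bias in calc_match can be seen in a small toy example. The sketch below is an assumption-laden illustration, not the actual DAJIN2 code: the tag format (a leading "=" marks a matched base), the function signatures, and the allele names are all hypothetical.

```python
def calc_match(tags: list[str], divide_by_length: bool = False) -> float:
    """Hypothetical score: count matched bases in one read-to-allele alignment.

    With divide_by_length=True the count is normalized by alignment length,
    which is the behavior the release notes describe as removed.
    """
    if not tags:
        return 0.0
    matches = sum(tag.startswith("=") for tag in tags)
    return matches / len(tags) if divide_by_length else float(matches)


def best_allele(
    alignments: dict[str, list[str]], divide_by_length: bool = False
) -> str:
    """Return the allele name with the highest score for one read."""
    return max(
        alignments,
        key=lambda name: calc_match(alignments[name], divide_by_length),
    )


# One read aligned to two hypothetical alleles:
# 8 matches out of 10 bases on the long allele, 5 out of 5 on the short one.
alignments = {
    "long_allele": ["=A"] * 8 + ["*AG", "*CT"],
    "short_allele": ["=A"] * 5,
}

if __name__ == "__main__":
    print(best_allele(alignments, divide_by_length=True))
    print(best_allele(alignments, divide_by_length=False))
```

With normalization the short allele wins (5/5 against 8/10) even though the read matches more bases on the long allele; without it the long allele is chosen.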