0.5.0
📝 Documentation
- Convert the issue template from Markdown (md) to YAML (yml) so that each item is easier for users to fill out. [Commit Detail]
💥 Breaking
- Extremely low-frequency alleles (less than 0.05%) are considered Nanopore sequencing errors and are not clustered (#36).
  - Configure `clustering.extract_labels` so that alleles with a low number of reads (0.05% or fewer, or 5 reads or fewer) are not clustered. [Commit Detail]
  - Change `clustering.clustering` to stop if the minimum value of the elements in the cluster is 0.5% or less. [Commit Detail]
  - Add `consensus.remove_minor_alleles` to remove minor alleles with fewer than 5 reads or less than 0.5%. [Commit Detail]
- Save a subsetted FASTQ of the control sample if the read number is too large (> 10,000 reads). The control is capped at 10,000 reads to avoid excessive computational load. [Commit Detail]
- If the read length is 500 bases or less, change the mappy preset to `sr`. [Commit Detail]
- Update `extract_best_preset` to prioritize `map-ont` and remove the `splice` preset if an inversion is observed. [Commit Detail]
- Update the algorithm of `cssplits_handler.reallocate_insertion_within_deletion` to automate change point detection by incorporating temporal changes. [Commit Detail]
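
The minor-allele rule above (fewer than 5 reads or less than 0.5%) can be illustrated with a minimal sketch. This is not the code of `consensus.remove_minor_alleles`; the function name `drop_minor_labels`, its parameters, and the flat list of per-read labels are assumptions made for illustration only.

```python
from collections import Counter


def drop_minor_labels(labels, min_reads=5, min_fraction=0.005):
    """Hypothetical helper: keep reads whose label is well supported.

    A label is dropped when it has fewer than `min_reads` reads
    or accounts for less than `min_fraction` of all reads.
    """
    counts = Counter(labels)
    total = len(labels)
    keep = {
        label
        for label, n in counts.items()
        if n >= min_reads and n / total >= min_fraction
    }
    # Preserve the original read order for the surviving labels.
    return [label for label in labels if label in keep]


if __name__ == "__main__":
    reads = [1] * 996 + [2] * 4  # label 2 is supported by only 4 reads
    print(Counter(drop_minor_labels(reads)))
```

Both thresholds are keyword arguments so that the read-count and frequency cutoffs can be tuned independently.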
🔧 Maintenance
- Update `deploy_pypi.yml` to use the latest version of Actions, following the latest official YAML for guidance. [Commit Detail]
- Integrate `requirements.txt` and `MANIFEST.in` into `pyproject.toml` by replacing `setup.py`. [Commit Detail]
- Record the execution command of DAJIN2 in the log file. [Commit Detail]
- Add a test to check if the version in `test_version.sh` matches the version in `pyproject.toml` and `utils.config`. [Commit Detail]
- Rename `consensus.subset_clust` to `consensus.downsample_by_label` to clarify the function's purpose. [Commit Detail]
- Update `extract_unique_insertions` to merge highly similar extracted insertion sequences. [Commit Detail]
  - Fix `extract_unique_insertions`: removing a key twice from `fasta_insertions_unique` caused the index and key to become misaligned in `enumerate(distances)` if `i != key`. The keys are now removed from `fasta_insertions_unique` all at once at the end. [Commit Detail]
- Add control characters to the forbidden characters in `fastx_handler.sanitize_filename`. [Commit Detail]
- Change the naming convention for the temporary directory to `<sample_name>/<process_content>/<allele_name>/(<label_name>)/file_name`. Example: `flox/consensus/control/1/mutation_loci.pickle`. [Commit Detail]
- Move the `sanitize_name` function from `utils.fastx_handler` to `utils.io`. [Commit Detail]
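
The "remove all at once at the end" approach described for `extract_unique_insertions` can be sketched as follows. This is an illustration under assumptions, not the DAJIN2 implementation: `merge_similar`, `within_one_mismatch`, and the dict-of-sequences input are hypothetical names and shapes chosen for the example.

```python
def within_one_mismatch(a: str, b: str) -> bool:
    """Hypothetical similarity test: same length and at most one mismatch."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= 1


def merge_similar(seqs: dict, is_similar=within_one_mismatch) -> dict:
    """Keep the first sequence of each group of similar sequences.

    Keys to drop are only collected during the scan and removed in a
    single pass at the end, so the dict is never mutated while its
    keys are being enumerated.
    """
    keys = list(seqs)
    to_remove = set()
    for i, key_i in enumerate(keys):
        if key_i in to_remove:
            continue
        for key_j in keys[i + 1:]:
            if key_j not in to_remove and is_similar(seqs[key_i], seqs[key_j]):
                to_remove.add(key_j)
    # Single removal step at the end.
    return {k: v for k, v in seqs.items() if k not in to_remove}


if __name__ == "__main__":
    print(merge_similar({0: "AAAA", 1: "AAAT", 2: "GGGG", 3: "AAAA"}))
```

Collecting the keys in a set also makes a second removal of the same key harmless.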
🐛 Bug Fixes
- Remove `sam_handler.remove_overlapped_reads` to prevent unnecessary trimming of reads. [Commit Detail]
- Fix `preprocess.insertions_to_fasta.remove_minor_groups` to delete the keys (insertion loci) whose dict becomes empty after insertions are removed. This prevents errors when accessing non-existent keys in `subset_insertions`. [Commit Detail]
- Fix the bug in `cssplits_handler.convert_cssplits_to_cstag` where the insertion cs tag was not merged with the next cs tag if they have the same operator (e.g., for `+A|+A|=T, =T`, before: `+aa=T=T`, after: `+aa=TT`). [Commit Detail]
- Separate intermediate files using a directory structure instead of underscores (`_`), so that no errors occur even if users use allele names containing underscores. [Commit Detail]
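
The cs tag merge in the `+A|+A|=T, =T` example can be reproduced with a small sketch. It is not the code of `cssplits_handler.convert_cssplits_to_cstag`; the function name `sketch_cssplits_to_cstag` is hypothetical, and the input format (comma-separated tokens, with `|` joining an insertion to the base that follows it) is assumed from the example above.

```python
def sketch_cssplits_to_cstag(cssplits: str) -> str:
    """Hypothetical converter: join tokens and merge runs of one operator."""
    merged = []
    for token in cssplits.split(","):
        # An insertion token such as "+A|+A|=T" holds several operations.
        for op in token.strip().split("|"):
            symbol, bases = op[0], op[1:]
            if symbol in "+-*":
                # cs tags write inserted, deleted and substituted bases in lowercase.
                bases = bases.lower()
            # Merge with the previous operation when the operator repeats.
            # Substitutions ("*") stay separate because each one is a base pair.
            if merged and symbol in "=+-" and merged[-1][0] == symbol:
                merged[-1] += bases
            else:
                merged.append(symbol + bases)
    return "".join(merged)


if __name__ == "__main__":
    print(sketch_cssplits_to_cstag("+A|+A|=T, =T"))
```

Because the merge check runs on every operation, including the last one inside an insertion token, the trailing `=T` of the insertion joins the following `=T`.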