0.5.0

@akikuno released this 05 Jun 07:08
· 89 commits to main since this release
72cdcec

📝 Documentation

  • Update the issue template from md to yml and modify it to make it easier for users to fill out each item. [Commit Detail]

💥 Breaking

  • Extremely low-frequency alleles (less than 0.05%) are considered Nanopore sequence errors and are not clustered #36.

    • Configure clustering.extract_labels so that alleles supported by too few reads (0.05% of reads or fewer, or 5 reads or fewer) are not clustered. [Commit Detail]
    • Change clustering.clustering to stop if the minimum value of the elements in the cluster is 0.5% or less. [Commit Detail]
    • Add consensus.remove_minor_alleles to remove minor alleles with fewer than 5 reads or less than 0.5% [Commit Detail]
  • Save a downsampled FASTQ of the control sample when it contains more than 10,000 reads. The control is capped at 10,000 reads to avoid excessive computational load. [Commit Detail]

  • If the read length is 500 bases or less, change the mappy preset to sr. [Commit Detail]

  • Update extract_best_preset to prioritize map-ont and remove splice preset if inversion is observed. [Commit Detail]

  • Update the algorithm of cssplits_handler.reallocate_insertion_within_deletion to automate change point detection by incorporating temporal changes. [Commit Detail]
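
The minor-allele filtering described above can be illustrated with a minimal sketch. It is not DAJIN2's implementation: the function name `drop_minor_labels` and its parameters are hypothetical, and the default thresholds are taken from the consensus.remove_minor_alleles entry (fewer than 5 reads or less than 0.5%).

```python
from collections import Counter


def drop_minor_labels(labels, min_reads=5, min_percent=0.5):
    """Drop reads whose cluster label is too rare to be trusted.

    Hypothetical helper: a label is kept only if it has at least
    `min_reads` reads AND makes up at least `min_percent` of all reads.
    """
    counts = Counter(labels)
    total = len(labels)
    keep = {
        label
        for label, n in counts.items()
        if n >= min_reads and n / total * 100 >= min_percent
    }
    # Preserve the original read order while filtering.
    return [label for label in labels if label in keep]


if __name__ == "__main__":
    labels = [1] * 990 + [2] * 7 + [3] * 3
    kept = drop_minor_labels(labels)
    print(sorted(set(kept)), len(kept))
```

Both thresholds are exposed as parameters because the entries above quote different cut-offs for different functions.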

🔧 Maintenance

  • Update deploy_pypi.yml to use the latest version of Actions. Refer to the latest official YAML for guidance. [Commit Detail]

  • Replace setup.py with pyproject.toml, integrating requirements.txt and MANIFEST.in into it. [Commit Detail]

  • Record the DAJIN2 execution command in the log file. [Commit Detail]

  • Add a test to check if the version in test_version.sh matches the version in pyproject.toml and utils.config [Commit Detail]

  • Rename consensus.subset_clust to consensus.downsample_by_label to clarify the function's purpose. [Commit Detail]

  • Update extract_unique_insertions to merge highly similar extracted insertion sequences. [Commit Detail]

    • Fix extract_unique_insertions: removing a key twice from fasta_insertions_unique caused the index and key to become misaligned in enumerate(distances) when i != key. Keys are now removed from fasta_insertions_unique all at once at the end. [Commit Detail]
  • Add control characters to the forbidden characters in fastx_handler.sanitize_filename. [Commit Detail]

  • Change the naming convention for the temporary directory: <sample_name>/<process_content>/<allele_name>/(<label_name>)/file_name. Example: flox/consensus/control/1/mutation_loci.pickle. [Commit Detail]

  • Move the sanitize_name function from utils.fastx_handler to utils.io. [Commit Detail]
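
The extract_unique_insertions fix above follows a general pattern: collect the keys to drop while iterating, then delete them in one pass at the end. The sketch below shows that pattern only. The function `merge_similar_sequences`, the `difflib`-based similarity measure, and the 0.8 threshold are assumptions for illustration and are not taken from DAJIN2.

```python
from difflib import SequenceMatcher


def merge_similar_sequences(sequences, threshold=0.8):
    """Keep one representative per group of highly similar sequences.

    Hypothetical helper. Nothing is deleted during iteration, so the
    positions in `keys` stay aligned with the dictionary contents.
    """
    keys = list(sequences)
    to_remove = set()
    for i, key_i in enumerate(keys):
        if key_i in to_remove:
            continue
        for key_j in keys[i + 1:]:
            if key_j in to_remove:
                continue
            ratio = SequenceMatcher(None, sequences[key_i], sequences[key_j]).ratio()
            if ratio >= threshold:
                to_remove.add(key_j)  # mark only; removal happens below
    # Single removal step at the end.
    return {k: v for k, v in sequences.items() if k not in to_remove}


if __name__ == "__main__":
    seqs = {"a": "ACGTACGTAC", "b": "ACGTACGTAT", "c": "TTTTTTTTTT"}
    print(list(merge_similar_sequences(seqs)))
```

Marking keys in a set also makes a repeated removal of the same key harmless, which is the failure mode the fix describes.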

🐛 Bug Fixes

  • Remove sam_handler.remove_overlapped_reads to prevent unnecessary trimming of reads. [Commit Detail]

  • Fix preprocess.insertions_to_fasta.remove_minor_groups to delete the keys (insertion loci) when insertions are removed and result in an empty dict. This prevents errors when accessing non-existent keys in subset_insertions. [Commit Detail]

  • Fix the bug in cssplits_handler.convert_cssplits_to_cstag where an insertion cs tag was not merged with the next cs tag when they had the same operator (e.g., for +A|+A|=T, =T the result was +aa=T=T before the fix and is +aa=TT after it). [Commit Detail]

  • Separate intermediate files using a directory structure instead of underscores (_), so that allele names containing underscores no longer cause errors. [Commit Detail]
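
The merging behavior described in the convert_cssplits_to_cstag fix above can be sketched as follows. This is a simplified illustration and not the DAJIN2 function: the name `cssplits_to_cstag` is hypothetical, and only the `=`, `+`, and `-` operators are handled.

```python
def cssplits_to_cstag(cssplits):
    """Convert a list of cssplits tokens to a cs tag, merging adjacent
    runs that share the same operator.

    Hypothetical, simplified helper: supports "=", "+", and "-" only.
    """
    # Flatten "+A|+A|=T" into ["+A", "+A", "=T"].
    ops = [op for token in cssplits for op in token.split("|")]
    merged = []  # list of [operator, sequence] pairs
    for op in ops:
        operator, seq = op[0], op[1:]
        if operator not in "=+-":
            raise ValueError(f"unsupported operator in this sketch: {op!r}")
        if operator != "=":
            seq = seq.lower()  # cs tags write inserted/deleted bases in lowercase
        if merged and merged[-1][0] == operator:
            merged[-1][1] += seq  # same operator as the previous run: extend it
        else:
            merged.append([operator, seq])
    return "".join(operator + seq for operator, seq in merged)


if __name__ == "__main__":
    print(cssplits_to_cstag(["+A|+A|=T", "=T"]))  # +aa=TT
```

Flattening the `|`-joined insertion token first means the trailing `=T` of the insertion and the following `=T` are compared like any other adjacent pair.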