Develop 0.3.6 #19

Merged
akikuno merged 27 commits into main from develop-0.3.6
Jan 9, 2024

@akikuno akikuno commented Jan 9, 2024

v0.3.6 (2024-01-10)

📝 Documentation

  • Added a quick installation guide to TROUBLESHOOTING.md. Commit Detail

🚀 Update

Preprocess

  • Updated input_validator.py: the UCSC BLAT server sometimes returns HTTP status 200 even when an error occurs. In such cases the page title contains "Very Early Error", so the validation now returns False when that title is present. Commit Detail

  • Simplified error detection in homopolymer_handler.py by using cosine similarity. Commit Detail

  • Updated mutation_extractor.py to use cosine similarity to filter dissimilar loci. Commit Detail

  • Updated mutation_extractor.identify_dissimilar_loci to unconditionally return True when the 'sample' shows more than 5% variation compared to the 'control'. Commit Detail

  • Added preprocess.midsv_caller.convert_consecutive_indels_to_match: alignment errors can cause a true match to be mistakenly reported as an insertion following a deletion. These cases are now corrected. For example, "=C,=T" mistakenly reported as "-C,+C|=T" is reverted to "=C,=T". Commit Detail
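
The correction described in the last item can be illustrated with a minimal sketch. This is not the project's implementation: the function name `revert_deletion_insertion_pairs` is hypothetical, and the sketch assumes the input is a single comma-separated tag string in which an insertion is written as `+X|` prefixes on the following tag, as in the example above.

```python
from __future__ import annotations


def revert_deletion_insertion_pairs(tags: str) -> str:
    """Revert deletions that are immediately re-inserted back to matches.

    Hypothetical helper: "-C,+C|=T" becomes "=C,=T".
    """
    out: list[str] = []
    for token in tags.split(","):
        parts = token.split("|")
        # Everything before the last part is an inserted base ("+C").
        prefixes = parts[:-1]
        if not prefixes or not all(p.startswith("+") for p in prefixes):
            out.append(token)
            continue
        inserted = [p[1:] for p in prefixes]
        n = len(inserted)
        preceding = out[-n:] if n <= len(out) else []
        # Revert only when the preceding deletions spell exactly the inserted bases.
        if len(preceding) == n and all(
            prev == "-" + base for prev, base in zip(preceding, inserted)
        ):
            out[-n:] = ["=" + base for base in inserted]
            out.append(parts[-1])
        else:
            out.append(token)
    return ",".join(out)


print(revert_deletion_insertion_pairs("-C,+C|=T"))
```

A deletion followed by an insertion of a different base, such as "-C,+G|=T", is left unchanged in this sketch, since it cannot be explained as a misreported match.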

Classification

  • Added allele_merger.merge_minor_alleles, which reclassifies alleles with fewer than 10 reads to suppress excessive subdivision of alleles. Commit Detail

Clustering

  • Added the function merge_minor_cluster, which reverts the labels of clusters with fewer than 10 reads to their previous labels, suppressing excessive subdivision of alleles. Commit Detail

  • Updated generate_mutation_kmers to replace the k-mers at indices not registered in mutation_loci with "@". For example, if there are no mutations in mutation_loci, "=G,=C,-C" and "=G,=G,=C" both become "@,@,@", making them identical and ensuring they do not affect clustering. Commit Detail
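
Both Clustering changes can be sketched in a few lines. The code below is an illustration under stated assumptions, not the project's code: the names `revert_minor_clusters` and `generate_masked_kmers` are hypothetical, labels are assumed to be one entry per read, mutation loci are assumed to be a set of 0-based indices, and padding the read ends with "@" is a choice made only for this example.

```python
from __future__ import annotations

from collections import Counter


def revert_minor_clusters(
    previous: list[int], current: list[int], threshold: int = 10
) -> list[int]:
    """Revert reads in clusters smaller than `threshold` to their previous label."""
    counts = Counter(current)
    return [
        prev if counts[cur] < threshold else cur
        for prev, cur in zip(previous, current)
    ]


def generate_masked_kmers(
    tags: list[str], mutation_indices: set[int], k: int = 3
) -> list[str]:
    """Build one k-mer per index, masking indices that are not mutation loci."""
    half = k // 2
    placeholder = ",".join(["@"] * k)
    kmers = []
    for i in range(len(tags)):
        if i not in mutation_indices:
            # Unregistered index: every read gets the same placeholder here.
            kmers.append(placeholder)
            continue
        window = [
            tags[j] if 0 <= j < len(tags) else "@"  # assumed end padding
            for j in range(i - half, i + half + 1)
        ]
        kmers.append(",".join(window))
    return kmers


# With no registered loci, reads that differ only by noise become identical.
read_a = generate_masked_kmers(["=G", "=C", "-C"], set())
read_b = generate_masked_kmers(["=G", "=G", "=C"], set())
print(read_a == read_b)
```

Because masked indices carry the same placeholder for every read, only the registered loci can contribute to differences between reads when they are clustered.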

Consensus

  • Implemented LocalOutlierFactor to filter abnormal control reads. Commit Detail

…ith less than 10 reads back to the previous labels.
…en an error occurs. In such cases, "Very Early Error" is indicated in the Title. Therefore, we have made it so that it returns False in those situations.
Due to alignment errors, there can be instances where a true match is mistakenly replaced with "insertion following a deletion".
For example, although it should be "=C,=T", it gets replaced by "-C,+C|=T". In such cases, a process is performed to revert it back to "=C,=T".
…d in mutation_loci as mutations by replacing them with "@". For example, if there are no mutations in mutation_loci, "=G,=C,-C" and "~G,=G,=C" become "@,@,@" and "@,@,@" respectively, making them the same and ensuring they do not affect clustering.
…ally returns True if the 'sample' shows more than 5% variation compared to the 'control'.
…rence between sample and control is more than 20%, it is unconditionally considered a mutation.
@akikuno akikuno merged commit f3cd582 into main Jan 9, 2024
6 checks passed
@akikuno akikuno deleted the develop-0.3.6 branch January 9, 2024 20:46