Commit

update ROADMAP.md
akikuno committed Jan 9, 2024
1 parent f3cd582 commit d230c49
Showing 1 changed file with 29 additions and 38 deletions.
67 changes: 29 additions & 38 deletions docs/ROADMAP.md
@@ -11,70 +11,61 @@
-->

<!-- memo ToDo
- `N` in barcode09 allele1
- The large deletion in barcode11 allele2 is not reflected
- `N` in barcode28 allele1
- ✅ `N` in barcode09 allele1
- The large deletion in barcode11 allele2 is not reflected & `N` in barcode28 allele1 → a problem on the control side
- Output FASTQ and VCF
-->
# v0.3.6 (yyyy-mm-dd)

## 📝 Documentation
+ Added a quick guide of installation to TROUBLESHOOTING.md [Commit Detail](https://github.com/akikuno/DAJIN2/commit/cefed0ff4d04282b9915486be07de85b2b77b657)

## 🚀 Features

### Preprocess
+ # 💡 Future Tasks
+ Remove minor alleles with predicted insertion
+ Enhance the Clarity of Insertion Allele Identification.
+ Develop and Integrate Inversion Detection Capability
+ Save the reference allele to the FASTA/HTML directories
-->

+ Update `input_validator.py`: The UCSC Blat server sometimes returns a 200 HTTP status code even when an error occurs. In such cases, "Very Early Error" is indicated in the Title. Therefore, we have made it so that it returns False in those situations. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/4ad9c9ef8bd963a6e20c1721480aed0fe7922760)

+ Simplify `homopolymer_handler.py` for the error detection using cosine similarity [Commit Detail](https://github.com/akikuno/DAJIN2/commit/21c2596805c36074f360285600e60ee76b948908)

+ Update `mutation_extractor.py` to use cosine similarity to filter dissimilar loci [Commit Detail](https://github.com/akikuno/DAJIN2/commit/c9f5aa7b48581e58d99fe8c31275c422756aa9f1)

+ Update the `mutation_extractor.identify_dissimilar_loci` so that it unconditionally returns True if the 'sample' shows more than 5% variation compared to the 'control'. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/0cbec5217fdfba6886979eb86cf970b587e83e5f)
-------------

+ Add `preprocess.midsv_caller.convert_consecutive_indels_to_match`: Due to alignment errors, there can be instances where a true match is mistakenly replaced with "insertion following a deletion". For example, although it should be "=C,=T", it gets replaced by "-C,+C|=T". In such cases, a process is performed to revert it back to "=C,=T". [Commit Detail](https://github.com/akikuno/DAJIN2/commit/69c56fa904ef847dc5b0e2dcdb90303409412d0f)
# Past Logs

### Classification
<details>
<summary> v0.3.6 (2024-01-10) </summary>

+ Added `allele_merger.merge_minor_alleles` to reclassify alleles with less than 10 reads to suppress excessive subdivision of alleles. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/b0752960def313e237ccf7d44542f9810cad0c00)
## 📝 Documentation

### Clustering
- Added a quick guide for installation to TROUBLESHOOTING.md. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/cefed0ff4d04282b9915486be07de85b2b77b657)

+ Added the function `merge_minor_cluster` to revert labels clustered with less than 10 reads back to the previous labels to suppress excessive subdivision of alleles.
[Commit Detail](https://github.com/akikuno/DAJIN2/commit/4bd9f7dd806d192475d8d4f20c1e50c37281d64e)
## 🚀 Update

+ Update `generate_mutation_kmers` to consider indices not registered in mutation_loci as mutations by replacing them with "@". For example, if there are no mutations in mutation_loci, "=G,=C,-C" and "=G,=G,=C" become "@,@,@" and "@,@,@" respectively, making them the same and ensuring they do not affect clustering. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/9eefaaa1a9be3922b60655292c0a310e0f5fc76d)
### Preprocess

### Consensus
- Updated `input_validator.py`: The UCSC Blat server sometimes returns a 200 HTTP status code even when an error occurs. In such cases, the page title contains "Very Early Error", so the validation now returns False in those situations. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/4ad9c9ef8bd963a6e20c1721480aed0fe7922760)

+ Use `LocalOutlierFactor` to filter abnormal control reads [Commit Detail](https://github.com/akikuno/DAJIN2/commit/94c48da01fc039902c97a23ddea47dd5f2b42ab4)
- Simplified `homopolymer_handler.py` for error detection using cosine similarity. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/21c2596805c36074f360285600e60ee76b948908)

- Updated `mutation_extractor.py` to use cosine similarity to filter dissimilar loci. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/c9f5aa7b48581e58d99fe8c31275c422756aa9f1)

## 🐛 Bug Fixes
- Updated `mutation_extractor.identify_dissimilar_loci` so that it unconditionally returns True if the 'sample' shows more than 5% variation compared to the 'control'. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/0cbec5217fdfba6886979eb86cf970b587e83e5f)

### Consensus
- Added `preprocess.midsv_caller.convert_consecutive_indels_to_match`: Alignment errors can cause a true match to be mistakenly replaced with an "insertion following a deletion". For example, "=C,=T" may be reported as "-C,+C|=T"; this function reverts it to "=C,=T". [Commit Detail](https://github.com/akikuno/DAJIN2/commit/69c56fa904ef847dc5b0e2dcdb90303409412d0f)
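
The correction described in the last item can be sketched as follows. This is a minimal illustration, not the actual `convert_consecutive_indels_to_match` implementation: the function name `revert_deletion_then_insertion` is hypothetical, and the sketch assumes comma-separated MIDSV tags and handles only the single-base case from the example.

```python
def revert_deletion_then_insertion(midsv: str) -> str:
    """Collapse "-X,+X|rest" into "=X,rest" (hypothetical helper, single-base case only)."""
    tags = midsv.split(",")
    result = []
    i = 0
    while i < len(tags):
        current = tags[i]
        following = tags[i + 1] if i + 1 < len(tags) else ""
        # A deletion of base X immediately followed by an insertion of the same base X
        # is treated as an alignment artifact and restored to a match.
        if (
            len(current) == 2
            and current[0] == "-"
            and following.startswith("+" + current[1] + "|")
        ):
            result.append("=" + current[1])
            # Keep whatever followed the inserted base (e.g. "=T").
            result.append(following.split("|", 1)[1])
            i += 2
        else:
            result.append(current)
            i += 1
    return ",".join(result)
```

Under these assumptions, `revert_deletion_then_insertion("-C,+C|=T")` returns `"=C,=T"`, while a pair whose deleted and inserted bases differ, such as `"-C,+G|=T"`, is left unchanged.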

+ Fixed a bug where deletions inside a large deletion were not reflected [Commit Detail](https://github.com/akikuno/DAJIN2/commit/XXX)
### Classification

## 🔧 Maintenance
- Added `allele_merger.merge_minor_alleles` to reclassify alleles with fewer than 10 reads to suppress excessive subdivision of alleles. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/b0752960def313e237ccf7d44542f9810cad0c00)

### Clustering

## ⛔️ Deprecated
- Added the function `merge_minor_cluster` to revert labels clustered with fewer than 10 reads back to the previous labels to suppress excessive subdivision of alleles. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/4bd9f7dd806d192475d8d4f20c1e50c37281d64e)

---
- Updated `generate_mutation_kmers` to handle indices that are not registered as mutations in mutation_loci by replacing them with "@". For example, "=G,=C,-C" and "=G,=G,=C" both become "@,@,@", so they are treated as identical and do not affect clustering. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/9eefaaa1a9be3922b60655292c0a310e0f5fc76d)
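
The two clustering changes above can be illustrated with a short sketch. It is not the DAJIN2 code: the names `revert_minor_labels` and `mask_kmers` are hypothetical, `mutation_indices` is assumed to be a plain set of integer positions (the real `mutation_loci` structure may differ), and padding the sequence ends with "=N" is a choice made only for this example.

```python
from collections import Counter


def revert_minor_labels(previous, current, min_reads=10):
    """Revert labels of clusters with fewer than `min_reads` reads to the previous labels."""
    counts = Counter(current)
    return [
        prev if counts[cur] < min_reads else cur
        for prev, cur in zip(previous, current)
    ]


def mask_kmers(tags, mutation_indices, k=3):
    """Build one k-mer per index; indices outside `mutation_indices` are masked with "@"."""
    flank = k // 2
    masked = ",".join(["@"] * k)
    # Pad both ends so every index has a full window (assumption for this sketch).
    padded = ["=N"] * flank + list(tags) + ["=N"] * flank
    kmers = []
    for i in range(len(tags)):
        if i in mutation_indices:
            # Window centered on index i of the original tags.
            kmers.append(",".join(padded[i : i + k]))
        else:
            kmers.append(masked)
    return kmers
```

With an empty `mutation_indices`, both `["=G", "=C", "-C"]` and `["=G", "=G", "=C"]` yield only "@,@,@" k-mers, so the two reads look identical to the clustering step. Likewise, a cluster of 3 reads split off from a group of 15 is folded back into its previous label by `revert_minor_labels`.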

# 💡 Future Tasks
### Consensus

+ Remove minor alleles with predicted insertion
+ Enhance the Clarity of Insertion Allele Identification.
+ Develop and Integrate Inversion Detection Capability
+ Save the reference allele to the FASTA/HTML directories
- Implemented `LocalOutlierFactor` to filter abnormal control reads. [Commit Detail](https://github.com/akikuno/DAJIN2/commit/94c48da01fc039902c97a23ddea47dd5f2b42ab4)

-------------
</details>

# Past Logs

<details>
<summary> v0.3.5 (2023-12-23) </summary>
