Merge pull request #44 from akikuno/develop-0.5.2
Develop 0.5.2
akikuno authored Jul 8, 2024
2 parents 86d53b0 + b9ba707 commit 216c2e6
Showing 13 changed files with 277 additions and 103 deletions.
39 changes: 39 additions & 0 deletions .github/ISSUE_TEMPLATE/question.yml
@@ -0,0 +1,39 @@
name: ❓ Question
description: Report your question here
labels: ['question']

body:
  - type: textarea
    id: description
    attributes:
      label: '📋 Description'
      description: A clear and concise description of the question.
    validations:
      required: true

  - type: textarea
    id: environment
    attributes:
      label: '🔍 Environment'
      description: |
        Optional: The environment information.
        Example:
        - OS: WSL (Ubuntu 22.04)
        - DAJIN2 version: x.x.x
        - Python version: x.x.x
      value: |
        - OS:
        - DAJIN2 version:
        - Python version:
      render: markdown
    validations:
      required: false

  - type: textarea
    id: anything_else
    attributes:
      label: '📎 Anything else?'
      description: |
        Optional: Add any other contexts, links, or screenshots about the bug here.
    validations:
      required: false
4 changes: 2 additions & 2 deletions .github/workflows/pytest.yml
@@ -9,10 +9,10 @@ jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      max-parallel: 6
      max-parallel: 10
      matrix:
        os: [ubuntu-latest, macos-latest]
        python-version: ['3.8', '3.9', '3.10']
        python-version: ['3.8', '3.9', '3.10', '3.11', '3.12']
    name: Python ${{ matrix.python-version }} on ${{ matrix.os }}

    defaults:
15 changes: 9 additions & 6 deletions README.md
@@ -28,7 +28,7 @@ The name DAJIN is derived from the phrase 一網**打尽** (Ichimou **DAJIN** or

### Prerequisites

- Python 3.8 to 3.10
- Python >= 3.8
- Unix-like environment (Linux, macOS, WSL2, etc.)

### From [Bioconda](https://anaconda.org/bioconda/DAJIN2) (Recommended)
@@ -38,9 +38,6 @@ conda create -n env-dajin2 -c conda-forge -c bioconda python=3.10 DAJIN2 -y
conda activate env-dajin2
```

> [!IMPORTANT]
> DAJIN2 supports Python versions 3.8 to 3.10, but not Python 3.11 yet due to a [Bioconda issue](https://github.com/bioconda/bioconda-recipes/issues/37805).

> [!NOTE]
> To Apple Silicon (ARM64) users:
@@ -314,13 +311,19 @@ The **Allele type** includes:
> In PCR amplicon sequencing, the % of reads might not match the actual allele proportions due to amplification bias.
> Especially when large deletions are present, the deletion alleles might be significantly amplified, potentially not reflecting the actual allele proportions.
## 📣Feedback and Support
## 📣 Feedback and Support

> [!NOTE]
> For frequently asked questions, please refer to [this page](https://github.com/akikuno/DAJIN2/blob/main/docs/FAQ.md).
For questions, bug reports, or other forms of feedback, we'd love to hear from you!

For more questions, bug reports, or other forms of feedback, we'd love to hear from you!
Please use [GitHub Issues](https://github.com/akikuno/DAJIN2/issues/new/choose) for all reporting purposes.

Please refer to [CONTRIBUTING](https://github.com/akikuno/DAJIN2/blob/main/docs/CONTRIBUTING.md) for how to contribute and how to verify your contributions.



## 🤝 Code of Conduct

Please note that this project is released with a [Contributor Code of Conduct](https://github.com/akikuno/DAJIN2/blob/main/docs/CODE_OF_CONDUCT.md).
16 changes: 16 additions & 0 deletions docs/FAQ.md
@@ -0,0 +1,16 @@
# Frequently Asked Questions

## How many reads are necessary?

**We recommend at least 1,000 reads.**
With 1,000 reads, it is possible to detect point mutation alleles with a frequency of 1%, ensuring high-precision analysis. However, if the target is an indel of more than a few tens of bases, or if the expected allele frequency is 5% or higher, detection is possible with fewer reads (~500 reads).

## What is the recommended read length for analysis?

**We recommend lengths below 10kb.**
If the length is below 10kb, when reading PCR amplicons with Nanopore, the reads will uniformly cover the target region. It is possible to obtain PCR amplicons up to approximately 15kb, but this may result in uneven coverage of the target region, potentially reducing analysis accuracy.

## Can data from platforms other than Nanopore (e.g., PacBio or NGS) be analyzed?

**Yes, it is possible.**
DAJIN2 accepts common file formats (FASTA, FASTQ, BAM) as input, allowing the analysis of data from platforms other than Nanopore. However, since we do not have experience using DAJIN2 with non-Nanopore data, please contact us [here](https://github.com/akikuno/DAJIN2/issues/new/choose) if you encounter any issues.
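
As a rough illustration of the read-count guidance in this FAQ, the sketch below estimates how many reads are expected to carry an allele at a given frequency, and how likely it is that at least a few reads support it. It assumes simple binomial sampling of reads. The function names and the `min_supporting_reads` cutoff of 5 are hypothetical choices for illustration and are not taken from DAJIN2's detection logic.

```python
from math import comb


def expected_supporting_reads(total_reads: int, allele_frequency: float) -> float:
    """Expected number of reads that carry an allele at the given frequency."""
    return total_reads * allele_frequency


def prob_at_least(min_supporting_reads: int, total_reads: int, allele_frequency: float) -> float:
    """P(X >= min_supporting_reads) for X ~ Binomial(total_reads, allele_frequency).

    Intended for small cutoffs: the sum runs over 0 .. min_supporting_reads - 1.
    """
    p = allele_frequency
    below = sum(
        comb(total_reads, i) * p**i * (1 - p) ** (total_reads - i)
        for i in range(min_supporting_reads)
    )
    return 1.0 - below


if __name__ == "__main__":
    # Scenarios mentioned in the FAQ: 1% allele at 1,000 reads, 5% allele at 500 reads.
    for reads, freq in [(1000, 0.01), (500, 0.05)]:
        expected = expected_supporting_reads(reads, freq)
        chance = prob_at_least(5, reads, freq)  # cutoff of 5 is an assumption
        print(f"{reads} reads, {freq:.0%} allele: ~{expected:.0f} reads expected, "
              f"P(>=5 supporting reads) = {chance:.3f}")
```

Under these assumptions a 1% allele yields only about ten supporting reads at 1,000 reads, which is consistent with treating 1,000 reads as a practical lower bound for rare point mutations.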
19 changes: 19 additions & 0 deletions docs/FAQ_JP.md
@@ -0,0 +1,19 @@
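# Frequently Asked Questions


## How many reads are necessary?

**We recommend at least 1,000 reads.**
With 1,000 or more reads, point mutation alleles at a frequency of 1% can be detected, enabling high-precision analysis. On the other hand, if the target is an indel of several tens of bases or more, or if the expected allele frequency is 5% or higher, detection is possible with fewer reads (around 500 reads).


## What read length can be analyzed?

**We recommend 10 kb or less.**
At 10 kb or less, reads cover the target region evenly when PCR amplicons are sequenced with Nanopore. PCR amplicons of up to about 15 kb can be obtained, but coverage of the target region becomes uneven, which may reduce analysis accuracy.

## Can data from platforms other than Nanopore (PacBio or NGS) be analyzed?

**Yes, it is possible.**
DAJIN2 accepts common file formats (FASTA, FASTQ, BAM) as input, so data from platforms other than Nanopore can also be analyzed. However, because we have no experience using DAJIN2 with non-Nanopore data, please contact us [here](https://github.com/akikuno/DAJIN2/issues/new/choose) if you run into any problems.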

13 changes: 6 additions & 7 deletions docs/README_JP.md
@@ -25,7 +25,7 @@ DAJIN2, using nanopore targeted sequencing

### Environment

- Python 3.8 - 3.10
- Python >= 3.8
- Unix environment (Linux, macOS, WSL2, etc.)

### [Bioconda](https://anaconda.org/bioconda/DAJIN2) (Recommended)
@@ -36,9 +36,6 @@ conda create -n env-dajin2 -c conda-forge -c bioconda python=3.10 DAJIN2 -y
conda activate env-dajin2
```

> [!IMPORTANT]
> Currently, because [Bioconda does not support Python 3.11 or later](https://github.com/bioconda/bioconda-recipes/issues/37805), DAJIN2 supports Python 3.8 to 3.10.
> [!NOTE]
> For Macs with Apple Silicon:
> Currently, because [Bioconda does not support Apple Silicon](https://github.com/bioconda/bioconda-recipes/issues/37068#issuecomment-1257790919), please install via Rosetta 2 as follows:
@@ -364,12 +361,14 @@ read_plot.html and read_plot.pdf are visualizations of read_summary.xlsx
> Especially when large deletions are present, the deletion alleles are markedly amplified, making it more likely that the reported proportions do not reflect the actual allele proportions.

## 📣Feedback and Code of Conduct
## 📣 Feedback and Code of Conduct

> [!NOTE]
> For frequently asked questions, please see [this page](https://github.com/akikuno/DAJIN2/blob/main/docs/FAQ_JP.md).

For questions, bug reports, or other feedback, we look forward to hearing from you.
For other questions, bug reports, or feedback, we look forward to hearing from you.
Please use [GitHub Issues](https://github.com/akikuno/DAJIN2/issues/new/choose) for reports (Japanese is also fine).
<!-- For how to give feedback, please see [CONTRIBUTING](https://github.com/akikuno/DAJIN2/blob/main/docs/CONTRIBUTING.md). -->

<!-- ## 🤝 Contributor Code of Conduct -->

62 changes: 47 additions & 15 deletions docs/RELEASE.md
@@ -3,23 +3,64 @@
## 💥 Breaking
## 📝 Documentation
## 🚀 Performance
## 🌟 New Features
## 🐛 Bug Fixes
## 🔧 Maintenance
## ⛔️ Deprecated
[[Commit Detail](https://github.com/akikuno/DAJIN2/commit/xxxxx)]
-->

<!-- 💡 ToDo
<!-- ############################################################# # -->

# Current Release

# v0.5.2 (2024-XX-XX)

## 📝 Documentation

+ Add `FAQ.md` and `FAQ_JP.md` to provide answers to questions. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/1172fddd34c382f92b6778d6f30fd733b458cc04)]

## 🌟 New Features

- Update `mutation_extractor` [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/9444ee701ee52adeb6271552eff70667fb49b854)]
- Simplified the logic of the `is_dissimilar_loci` if statement. Additionally, changed the threshold for determining a mutation in Consensus from 75% to 50% (to accommodate the insertion allele in Cas3 Tyr Barcode10).
- Updated `detect_anomalies` to use MLPClassifier to detect mutations more flexibly and accurately compared to the previous threshold setting with MiniBatchKMeans.

## 🔧 Maintenance

+ Make DAJIN2 compatible with Python 3.11 and 3.12. Issue: #43 [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/8da9118f5c0f584ed1ab12541d5e410d1b9f0da8)]
+ pysam and mappy builds with Python 3.11 and 3.12 are now available on Bioconda.

+ Update GitHub Actions to test with Python 3.11 and 3.12. Issue: #43 [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/54df79e60b484da429c1cbf6f12b0c19196452cc)]

+ Resolve the B023 Function definition does not bind loop variable `alignment_lengths` issue. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/9c85d2f0410494a9b71d9905fad2f9e4efe30ed7)]

+ Add `question.yml` in GitHub Issue template. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/1172fddd34c382f92b6778d6f30fd733b458cc04)]


## 🐛 Bug Fixes

+ Update `cssplits_handler._get_index_of_large_deletions`: Modified to split large deletions when a match of 10 or more bases is found within the identified large deletion. Issue: #42 [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/xxxxx)]

-->

<!-- ############################################################# # -->

# Current Release

## v0.5.1 (2024-06-15)

## 💥 Breaking
-------------------------------------------------------------

# Past Releases

<!-- ------------------------------------------------------------- -->

<!-- <details>
<summary> v0.5.0 (2024-06-05) </summary>
</details> -->

<details>
<summary> v0.5.1 (2024-06-15) </summary>

## 🌟 New Features

+ Enable to accept additional file formats as an input. Issue: #37
+ FASTA [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/ee6d392cd51649c928bd604acafbab4b9d28feb1)]
@@ -42,16 +83,7 @@

+ Add `reallocate_insertion_within_deletion` into `report.mutation_exporter` and reflected it in the mutation info. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/ed6a96e01bb40c77df9cd3a17a4c29524684b6f1)]


<!-- ############################################################# # -->



-------------------------------------------------------------

# Past Releases

<!-- ------------------------------------------------------------- -->
</details>

<details>
<summary> v0.5.0 (2024-06-05) </summary>
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"

[tool.poetry]
name = "DAJIN2"
version = "0.5.1"
version = "0.5.2"
description = "One-step genotyping tools for targeted long-read sequencing"
authors = ["Akihiro Kuno <[email protected]>"]
readme = "README.md"
@@ -29,7 +29,7 @@ include = [
]

[tool.poetry.dependencies]
python = ">=3.8, <3.11"
python = "^3.8"
numpy = ">=1.24.0"
scipy = ">=1.10.0"
pandas = ">=1.0.0"
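
The dependency change in this hunk replaces the explicit range `>=3.8, <3.11` with Poetry's caret constraint `^3.8`, which is equivalent to `>=3.8, <4.0`. The sketch below shows why Python 3.11 and 3.12 are now admitted. It uses hypothetical helper names (`caret_bounds`, `allowed`) and plain version tuples; it is an illustration of the caret rule, not Poetry's own resolver.

```python
def caret_bounds(spec: str):
    """Return (lower, upper) version tuples for a caret constraint such as '^3.8'.

    The caret rule allows any version that keeps the left-most non-zero
    component unchanged, so the upper bound bumps that component by one.
    """
    lower = tuple(int(part) for part in spec.lstrip("^").split("."))
    for index, part in enumerate(lower):
        if part != 0:
            upper = lower[:index] + (part + 1,)
            break
    else:
        # All components are zero (e.g. '^0.0'): bump the last one.
        upper = lower[:-1] + (lower[-1] + 1,)
    return lower, upper


def allowed(version, lower, upper) -> bool:
    """True if lower <= version < upper, comparing version tuples."""
    return lower <= version < upper


old_range = ((3, 8), (3, 11))     # python = ">=3.8, <3.11"
new_range = caret_bounds("^3.8")  # python = "^3.8"

if __name__ == "__main__":
    for version in [(3, 8), (3, 10), (3, 11), (3, 12), (4, 0)]:
        label = ".".join(map(str, version))
        print(label, "old:", allowed(version, *old_range), "new:", allowed(version, *new_range))
```

One consequence of the caret form is that future 3.x releases are accepted without a further metadata change, while a hypothetical 4.0 would still be excluded.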
2 changes: 1 addition & 1 deletion src/DAJIN2/core/preprocess/midsv_caller.py
@@ -60,7 +60,7 @@ def extract_best_preset(preset_cigar_by_qname: dict[str, dict[str, str]]) -> dic
continue

# Define a custom key function to prioritize map-ont
def custom_key(key: str) -> tuple[int, bool]:
def custom_key(key: str, alignment_lengths=alignment_lengths) -> tuple[int, bool]:
    return (alignment_lengths[key], key == "map-ont")

max_key = max(alignment_lengths, key=custom_key)
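
The change in this hunk addresses the B023 lint warning ("function definition does not bind loop variable") by passing `alignment_lengths` as a default argument, so the value is captured when the function is defined rather than looked up when it is called. The sketch below contrasts the two behaviours and then reuses the same key, which prefers the longest alignment and breaks ties in favour of `map-ont`. The helper names (`late_binding_keys`, `bound_keys`, `pick_best_presets`) and the sample data are hypothetical and only illustrate the pattern; they are not DAJIN2's code.

```python
from __future__ import annotations


def late_binding_keys() -> list[int]:
    """Closures defined in a loop all see the loop variable's final value."""
    funcs = []
    for alignment_lengths in ({"map-ont": 1}, {"map-ont": 2}):
        def custom_key(key):
            # `alignment_lengths` is resolved when the function is called.
            return alignment_lengths[key]
        funcs.append(custom_key)
    return [f("map-ont") for f in funcs]


def bound_keys() -> list[int]:
    """A default argument freezes the value at definition time."""
    funcs = []
    for alignment_lengths in ({"map-ont": 1}, {"map-ont": 2}):
        def custom_key(key, alignment_lengths=alignment_lengths):
            return alignment_lengths[key]
        funcs.append(custom_key)
    return [f("map-ont") for f in funcs]


def pick_best_presets(lengths_by_read: dict[str, dict[str, int]]) -> dict[str, str]:
    """Per read, pick the preset with the longest alignment, preferring map-ont on ties."""
    best = {}
    for qname, alignment_lengths in lengths_by_read.items():
        def custom_key(key: str, alignment_lengths=alignment_lengths) -> tuple[int, bool]:
            # Tuples compare element-wise: length first, then the map-ont flag.
            return (alignment_lengths[key], key == "map-ont")
        best[qname] = max(alignment_lengths, key=custom_key)
    return best


if __name__ == "__main__":
    print(late_binding_keys())
    print(bound_keys())
    print(pick_best_presets({"read1": {"map-ont": 100, "sr": 100},
                             "read2": {"map-ont": 50, "splice": 80}}))
```

When the key function is only called inside the same loop iteration, as with the `max(...)` call in this hunk, both forms should give the same result; the default-argument form mainly makes the binding explicit and satisfies the linter.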