bede committed Jul 6, 2023
1 parent a733b7a commit 079d965
Showing 1 changed file (README.md) with 43 additions and 24 deletions.

@@ -2,44 +2,72 @@

# Hostile

Rapid FASTQ decontamination by host depletion. Accepts paired fastq.gz files as arguments and outputs paired fastq.gz files. Downloads and caches a custom human reference genome to `$XDG_DATA_DIR`. Replaces read headers with incrementing integers for speed and privacy. Python package with CLI and Python API. Installs with conda/mamba.
Rapid FASTQ decontamination by host subtraction. Accepts Illumina or ONT fastq[.gz] input and outputs fastq.gz files. Downloads and caches a custom human T2T + HLA reference genome to `$XDG_DATA_DIR` when run for the first time. Replaces read headers with incrementing integers for speed and privacy. Python package with CLI and Python API. Installs with conda/mamba. Please read the [bioRxiv preprint](https://www.biorxiv.org/content/10.1101/2023.07.04.547735) for further information, and open a GitHub issue if you encounter problems.
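
To illustrate what replacing read headers with incrementing integers means, here is a minimal sketch. It is not taken from hostile's source and may differ from how hostile does it; the function name `rename_fastq_headers` is hypothetical, and the sketch assumes strict four-line FASTQ records.

```python
import io
from typing import Iterator, TextIO


def rename_fastq_headers(handle: TextIO) -> Iterator[str]:
    """Yield FASTQ lines with each header replaced by an incrementing integer.

    Hypothetical helper for illustration only. Assumes strict four-line
    records (header, sequence, separator, quality).
    """
    for line_number, line in enumerate(handle):
        if line_number % 4 == 0:
            # Header line: drop the original read name and emit a counter
            yield f"@{line_number // 4 + 1}\n"
        else:
            # Sequence, separator, and quality lines pass through unchanged
            yield line


# Two made-up records whose headers carry instrument and run details
example = io.StringIO(
    "@instrument:run:flowcell:1:1101 1:N:0\nACGT\n+\nIIII\n"
    "@instrument:run:flowcell:1:1102 1:N:0\nTTGA\n+\nIIII\n"
)
renamed = "".join(rename_fastq_headers(example))
print(renamed)
```

Short integer headers shrink the output and remove instrument, run, and flowcell identifiers that could otherwise link reads back to a sequencing run.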



## Install

### Conda

```bash
curl -OJ https://raw.githubusercontent.com/bede/hostile/main/environment.yml
conda env create -f environment.yml # Mamba is faster
conda activate hostile
pip install hostile

# Test
hostile clean --fastq1 tests/data/mixed_human_100_1.fastq.gz --fastq2 tests/data/mixed_human_100_2.fastq.gz
```



### Docker

*Coming soon*



### Development install

```bash
git clone https://github.com/bede/hostile.git
cd hostile
conda env create -f environment.yml # Use mamba if impatient
conda activate hostile
pip install .
pip install --editable '.[dev]'
pytest
```




## Command line usage

```bash
% hostile clean --help
usage: hostile clean [-h] --fastq1 FASTQ1 --fastq2 FASTQ2 [--aligner {bowtie2,minimap2}] [--out-dir OUT_DIR] [--threads THREADS] [--debug]
usage: hostile clean [-h] --fastq1 FASTQ1 [--fastq2 FASTQ2] [--aligner {bowtie2,minimap2}] [--custom-index CUSTOM_INDEX] [--out-dir OUT_DIR]
[--threads THREADS] [--debug]

Remove human reads from paired fastq.gz files
Remove human reads from paired fastq(.gz) files

options:
-h, --help show this help message and exit
--fastq1 FASTQ1 path to forward fastq.gz file
--fastq2 FASTQ2 path to reverse fastq.gz file
--fastq1 FASTQ1 path to forward fastq(.gz) file
--fastq2 FASTQ2 optional path to reverse fastq(.gz) file
(default: None)
--aligner {bowtie2,minimap2}
alignment algorithm
(default: bowtie2)
--custom-index CUSTOM_INDEX
path to custom index
(default: None)
--out-dir OUT_DIR output directory for decontaminated fastq.gz files
(default: /Users/bede/Research/Git/hostile)
(default: /root/hostile/tests/data)
--threads THREADS number of CPU threads to use
(default: 10)
(default: 1)
--debug show debug messages
(default: False)

```
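
When calling the CLI from a script, it can help to assemble the argument list in one place. The sketch below uses only flags listed in the help text above; the function name `build_clean_command` and the example file names are hypothetical, and nothing here is part of hostile itself.

```python
from pathlib import Path
from typing import List, Optional


def build_clean_command(
    fastq1: Path,
    fastq2: Optional[Path] = None,
    aligner: str = "bowtie2",
    out_dir: Optional[Path] = None,
    threads: int = 1,
) -> List[str]:
    """Build an argument list for `hostile clean` (hypothetical helper)."""
    # The help text lists exactly these two aligner choices
    if aligner not in {"bowtie2", "minimap2"}:
        raise ValueError(f"unsupported aligner: {aligner}")
    command = ["hostile", "clean", "--fastq1", str(fastq1)]
    if fastq2 is not None:
        # --fastq2 is optional, so it is only added for paired input
        command += ["--fastq2", str(fastq2)]
    command += ["--aligner", aligner, "--threads", str(threads)]
    if out_dir is not None:
        command += ["--out-dir", str(out_dir)]
    return command


paired = build_clean_command(
    Path("reads_1.fastq.gz"), Path("reads_2.fastq.gz"), threads=4
)
single = build_clean_command(Path("reads.fastq.gz"), aligner="minimap2")
print(" ".join(paired))
print(" ".join(single))
```

The resulting list can be passed to `subprocess.run(command, check=True)` in an environment where hostile is installed.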


@@ -64,28 +92,19 @@ Cleaning: 100%|█████████████████████
"reads_removed_proportion": 0.0
}
]

```



## Python usage

```python
from pathlib import Path
from hostile.lib import clean_paired_fastqs

# fastqs takes a list of (fastq1, fastq2) path tuples
stats = clean_paired_fastqs(
    fastqs=[(Path("h37rv_10.r1.fastq.gz"), Path("h37rv_10.r1.fastq.gz"))]
)

print(stats)
```
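
The JSON report shown under command line usage includes a `reads_removed_proportion` field for each entry. As a rough sketch of how such a report could be screened after a run, the example below parses a made-up report and lists the entries above a threshold. The function name `flag_high_removal`, the threshold, and the sample values are all hypothetical; only the `reads_removed_proportion` key comes from the output above.

```python
import json
from typing import Dict, List

# Made-up report in the same list-of-objects shape as the CLI output;
# real reports contain additional fields that are not shown here.
report_text = """
[
    {"reads_removed_proportion": 0.0},
    {"reads_removed_proportion": 0.25}
]
"""


def flag_high_removal(entries: List[Dict], threshold: float = 0.1) -> List[int]:
    """Return the indices of entries whose removed proportion exceeds threshold."""
    return [
        index
        for index, entry in enumerate(entries)
        if entry["reads_removed_proportion"] > threshold
    ]


report = json.loads(report_text)
flagged = flag_high_removal(report)
print(flagged)
```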
