-
Notifications
You must be signed in to change notification settings - Fork 1
Cookbook
Jim Shaw edited this page Oct 18, 2024
·
6 revisions
devider -b sorted_indexed.bam -v snps.vcf.gz -r reference.fa -o output -t 10 --preset nanopore-r9
-
-b
: sorted and indexed BAM file -
-v
: VCF file with header -
-r
: reference file (same reference as your BAM file) -
-t
: number of threads
Tip
You may need to run bgzip snps.vcf; tabix snps.vcf.gz
if your vcf does not have a header.
About --preset
: we offer four presets parameter choices. old-long-reads, nanopore-r9, nanopore-r10, hi-fi
. See the advanced usage manual for more information. In short, use:
-
old-long-reads
for ~90% identity long reads -
nanopore-r9
for ~95% (default) -
nanopore-r10
for ~98% -
hi-fi
for true high fidelity reads (> 99.9% accuracy).
run_devider_pipeline -i reads.fq.gz -r reference.fa -o pipeline_output -t 10 --devider-options "min-cov=10 allele-output preset=nanopore-r10"
This is just a light wrapper around minimap2, LoFreq, and devider. This comes with the conda install or is available through the scripts/
folder.
-
-i
: your reads -
-r
: reference file to map and call variants -
-o
: output folder -
-t
: number of threads for each pipeline step (minimap2, lofreq, then devider) -
--devider-options
: you can set options fordevider
by writing the long-form option and usinglong-form=value
for options with values.
haplotag_bam sorted_indexed.bam -i output/ids.txt
#visualize the tagged bam file in IGV.
ls sorted_indexed.bam.tagged.bam
-
sorted_indexed.bam
- your original bam file input to devider. If you used the pipeline, seeoutput/pipeline/mapping.bam
. -
-i
- theoutput/ids.txt
file output by devider, assigning reads to each haplotype.
haplotag_bam
is available in the conda install (or in scripts/
). This generates a haplotagged bam file with HP:i
tags for each read, allowing easy visualization in IGV.
devider ... --bed-file regions.bed
devider ... --sequence-to-phase NC_001802.1:1-1000,NC_001592.1:5000-6000
# pipeline version
run_devider_pipeline --devider-options="bed-file=regions.bed"
-
--bed-file
: A >= 3 column bed file (only the first 3 matter) with contig id, start position, and end position. Each region will be haplotyped. You can specify multiple regions per contig. -
--sequenced-to-phase
: specify the region/contig to phase incontig:start-end
format. Multiple regions can be separated by a comma.
devider ... -o output --allele-output --output-reads
head output/reads.fq
# obtain all reads for Haplotype:1
grep -A3 Haplotype:1 devider_pipeline_output/reads.fq | grep -v '^--' > haplotype_1_reads.fq
-
--allele-output
: output base-level alleles instead of0
,1
alleles forsnp_haplotypes.fasta
. See the page on devider's outputs for more info. -
--output-reads
: outputs the reads in fastq format inoutput/reads.fq
. These reads are tagged by their haplotype.