Fully automated assembly
Ryan Wick edited this page Oct 30, 2024 · 23 revisions
The following commands can be run without any human intervention.
In addition to Autocycler, these commands use the helper scripts for long-read assemblers. See Generating input assemblies for more details.
For more details on each step in the process, see the corresponding wiki pages listed at the end of this page.
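Because the script runs unattended, a missing program may only surface partway through. The following is a minimal pre-flight sketch, not part of Autocycler itself: `missing_tools` is a hypothetical helper name, and it assumes that `autocycler` and the helper scripts named on this page are meant to be found on your `PATH`.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands used by the script on this page (assumed to be on PATH).
missing=$(missing_tools autocycler canu.sh flye.sh miniasm.sh necat.sh nextdenovo.sh raven.sh)
if [ -n "$missing" ]; then
    printf 'Missing from PATH:\n%s\n' "$missing" >&2
fi
```

If anything is reported, install it or adjust `PATH` before starting the run.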
# Set these variables as appropriate for your system and genome:
threads=16
genome_size="5500000"
# Step 1: subsample the long-read set into multiple files
autocycler subsample --reads ont.fastq.gz --out_dir subsampled_reads --genome_size "$genome_size"
# Step 2: assemble each subsampled file
mkdir assemblies
for i in 01 07 13 19; do
    canu.sh subsampled_reads/sample_"$i".fastq assemblies/canu_"$i" "$threads" "$genome_size"
done
for i in 02 08 14 20; do
    flye.sh subsampled_reads/sample_"$i".fastq assemblies/flye_"$i" "$threads"
done
for i in 03 09 15 21; do
    miniasm.sh subsampled_reads/sample_"$i".fastq assemblies/miniasm_"$i" "$threads"
done
for i in 04 10 16 22; do
    necat.sh subsampled_reads/sample_"$i".fastq assemblies/necat_"$i" "$threads" "$genome_size"
done
for i in 05 11 17 23; do
    nextdenovo.sh subsampled_reads/sample_"$i".fastq assemblies/nextdenovo_"$i" "$threads" "$genome_size"
done
for i in 06 12 18 24; do
    raven.sh subsampled_reads/sample_"$i".fastq assemblies/raven_"$i" "$threads"
done
# Optional step: remove the subsampled reads to save space
rm -r subsampled_reads
# Step 3: compress the input assemblies into a unitig graph
autocycler compress -i assemblies -a autocycler
# Step 4: cluster the input contigs into putative replicons
autocycler cluster -a autocycler
# Steps 5 and 6: trim and resolve each QC-pass cluster
for c in autocycler/clustering/qc_pass/cluster_*; do
    autocycler trim -c "$c"
    autocycler resolve -c "$c"
done
# Step 7: combine resolved clusters into a final assembly
autocycler combine -a autocycler -i autocycler/clustering/qc_pass/cluster_*/5_final.gfa
The final consensus assembly will be named: autocycler/consensus_assembly.fasta
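After an unattended run, a quick look at the result can confirm that the assembly was written and show how many sequences it contains. The sketch below is an illustration only: `fasta_lengths` is a hypothetical helper (not an Autocycler command) that prints each sequence name and its length, and it assumes the output is a plain, uncompressed FASTA file at the path given above.

```shell
# Hypothetical helper: print "name<TAB>length" for each sequence in a FASTA file.
fasta_lengths() {
    awk '
        /^>/ {
            if (name != "") print name "\t" len   # flush the previous record
            name = substr($1, 2)                  # header up to the first space, without ">"
            len = 0
            next
        }
        { len += length($0) }                     # sequence may span multiple lines
        END { if (name != "") print name "\t" len }
    ' "$1"
}

assembly=autocycler/consensus_assembly.fasta
if [ -s "$assembly" ]; then
    fasta_lengths "$assembly"
else
    echo "No assembly found at $assembly" >&2
fi
```

Comparing the total length against the `genome_size` value set at the top of the script gives a rough sanity check on the result.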
- Step 1: Autocycler subsample
- Step 2: Generating input assemblies
- Step 3: Autocycler compress
- Step 4: Autocycler cluster
- Step 5: Autocycler trim
- Step 6: Autocycler resolve
- Step 7: Autocycler combine