
Fully automated assembly

Ryan Wick edited this page Oct 30, 2024 · 23 revisions

The following commands run a complete Autocycler assembly, from a long-read set to a final consensus assembly, without any human intervention.

In addition to Autocycler, these commands use the helper scripts for long-read assemblers. See Generating input assemblies for more details.

For more details on each step in the process, see the corresponding wiki pages.
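Because the commands below call Autocycler plus six helper scripts, an unattended run can fail partway through if one of them is not on your `PATH`. The following is a minimal pre-flight sketch, not part of Autocycler itself: `check_tools` is a hypothetical helper name, and the command names passed to it are simply the ones that appear on this page.

```shell
# Hypothetical helper (not provided by Autocycler): report every command
# that cannot be found on PATH and return non-zero if any are missing.
check_tools() {
    ct_missing=0
    for ct_tool in "$@"; do
        if ! command -v "$ct_tool" >/dev/null 2>&1; then
            echo "missing: $ct_tool" >&2
            ct_missing=1
        fi
    done
    return "$ct_missing"
}
```

A possible call before Step 1 would be `check_tools autocycler canu.sh flye.sh miniasm.sh necat.sh nextdenovo.sh raven.sh || exit 1`, which stops early instead of failing partway through the assemblies.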

# Set these variables as appropriate for your system and genome:
threads=16
genome_size="5500000"

# Step 1: subsample the long-read set into multiple files
autocycler subsample --reads ont.fastq.gz --out_dir subsampled_reads --genome_size "$genome_size"

# Step 2: assemble each subsampled file
mkdir -p assemblies
for i in 01 07 13 19; do
    canu.sh subsampled_reads/sample_"$i".fastq assemblies/canu_"$i" "$threads" "$genome_size"
done
for i in 02 08 14 20; do
    flye.sh subsampled_reads/sample_"$i".fastq assemblies/flye_"$i" "$threads"
done
for i in 03 09 15 21; do
    miniasm.sh subsampled_reads/sample_"$i".fastq assemblies/miniasm_"$i" "$threads"
done
for i in 04 10 16 22; do
    necat.sh subsampled_reads/sample_"$i".fastq assemblies/necat_"$i" "$threads" "$genome_size"
done
for i in 05 11 17 23; do
    nextdenovo.sh subsampled_reads/sample_"$i".fastq assemblies/nextdenovo_"$i" "$threads" "$genome_size"
done
for i in 06 12 18 24; do
    raven.sh subsampled_reads/sample_"$i".fastq assemblies/raven_"$i" "$threads"
done

# Optional step: remove the subsampled reads to save space
rm -r subsampled_reads

# Step 3: compress the input assemblies into a unitig graph
autocycler compress -i assemblies -a autocycler

# Step 4: cluster the input contigs into putative replicons
autocycler cluster -a autocycler

# Steps 5 and 6: trim and resolve each QC-pass cluster
for c in autocycler/clustering/qc_pass/cluster_*; do
    autocycler trim -c "$c"
    autocycler resolve -c "$c"
done

# Step 7: combine resolved clusters into a final assembly
autocycler combine -a autocycler -i autocycler/clustering/qc_pass/cluster_*/5_final.gfa
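The six loops in Step 2 above follow a round-robin pattern: samples 01, 07, 13 and 19 go to Canu, samples 02, 08, 14 and 20 go to Flye, and so on through Raven. As a rough sketch of that pattern only, the hypothetical function `assembler_for_sample` below maps a two-digit sample label to the assembler name used in the loops. It strips one leading zero before doing arithmetic, since shells would otherwise read labels such as `08` as invalid octal numbers.

```shell
# Hypothetical helper (not part of Autocycler): print the assembler that the
# Step 2 loops assign to a given two-digit sample label (01 to 24).
assembler_for_sample() {
    afs_n=${1#0}    # "07" -> "7", so the label is not parsed as octal
    case $(( (afs_n - 1) % 6 )) in
        0) echo canu ;;
        1) echo flye ;;
        2) echo miniasm ;;
        3) echo necat ;;
        4) echo nextdenovo ;;
        5) echo raven ;;
    esac
}
```

For example, `assembler_for_sample 08` prints `flye`, matching the second loop above.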

The final consensus assembly will be named: autocycler/consensus_assembly.fasta
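Since nobody is watching an unattended run, a quick sanity check on the result can be useful. The sketch below assumes the consensus file is a standard FASTA file with one `>` header line per sequence; `count_contigs` is a hypothetical helper name, not an Autocycler command.

```shell
# Hypothetical helper: count the sequences in a FASTA file by counting
# the lines that begin with '>'.
count_contigs() {
    grep -c '^>' "$1"
}
```

Running `count_contigs autocycler/consensus_assembly.fasta` after Step 7 prints the number of sequences in the final assembly, which you can compare against the number of replicons you expect for your genome.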
