
Fully automated assembly

Ryan Wick edited this page Jan 22, 2025 · 23 revisions

The following commands can be run without any human intervention.

In addition to Autocycler, these commands use some of the helper scripts (genome_size_raven.sh and the per-assembler scripts such as flye.sh). See Generating input assemblies and Genome size estimation for more details.

For more details on each step in the process, see the corresponding wiki pages.
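
Because the run is unattended, a missing helper script would only surface partway through. A minimal pre-flight sketch is shown here; the function name `missing_tools` is made up for illustration, and the list of commands is simply read off the command block that follows, so trim it if you use fewer assemblers.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands used by the steps on this page (assumed to be installed on PATH).
required="autocycler genome_size_raven.sh canu.sh flye.sh miniasm.sh necat.sh nextdenovo.sh raven.sh"
missing=$(missing_tools $required)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing >&2
fi
```

The sketch only warns and does not stop the run; add an `exit 1` inside the `if` if you would rather abort.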

reads=ont.fastq.gz  # your read set goes here
threads=16  # set as appropriate for your system
genome_size=$(genome_size_raven.sh "$reads" "$threads")  # can set this manually if you know the value

# Step 1: subsample the long-read set into multiple files
autocycler subsample --reads "$reads" --out_dir subsampled_reads --genome_size "$genome_size"

# Step 2: assemble each subsampled file
mkdir -p assemblies  # -p avoids an error if the directory already exists
for assembler in canu flye miniasm necat nextdenovo raven; do
    for i in 01 02 03 04; do
        "$assembler".sh subsampled_reads/sample_"$i".fastq assemblies/"$assembler"_"$i" "$threads" "$genome_size"
    done
done

# Optional step: remove the subsampled reads to save space
rm subsampled_reads/*.fastq

# Step 3: compress the input assemblies into a unitig graph
autocycler compress -i assemblies -a autocycler_out

# Step 4: cluster the input contigs into putative genomic sequences
autocycler cluster -a autocycler_out

# Steps 5 and 6: trim and resolve each QC-pass cluster
for c in autocycler_out/clustering/qc_pass/cluster_*; do
    autocycler trim -c "$c"
    autocycler resolve -c "$c"
done

# Step 7: combine resolved clusters into a final assembly
autocycler combine -a autocycler_out -i autocycler_out/clustering/qc_pass/cluster_*/5_final.gfa

The final consensus assembly will be named: autocycler_out/consensus_assembly.fasta
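
With nobody watching the run, a quick check that the final file exists and is not empty can be worthwhile. The following is a rough sketch rather than part of Autocycler: `fasta_summary` is a hypothetical helper that uses only `awk` to report the number of sequences and the total number of bases.

```shell
# Print "<number of sequences><TAB><total bases>" for a FASTA file.
fasta_summary() {
    awk '/^>/ { n++; next }
         { bases += length($0) }
         END { printf "%d\t%d\n", n, bases }' "$1"
}

final=autocycler_out/consensus_assembly.fasta
if [ -s "$final" ]; then
    fasta_summary "$final"
else
    echo "No assembly found at $final" >&2
fi
```

If the total differs wildly from the `genome_size` value estimated at the start, the assembly probably deserves a closer look.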

If you perform many automated assemblies with Autocycler, I recommend using Autocycler table once they have finished to produce a TSV that you can check for problematic genomes.
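
As a sketch of how that might look for a batch, the function below writes one TSV covering several assemblies. It rests on assumptions that should be checked against `autocycler table --help` and the Autocycler table wiki page: that running the subcommand with no options prints only the header line, and that `-a` and `-n` give the Autocycler directory and the sample name. The `<sample>/autocycler_out` directory layout and the name `collect_metrics` are hypothetical.

```shell
# Write one TSV covering every <sample>/autocycler_out directory found under
# the current directory (hypothetical layout).
# ASSUMPTION: `autocycler table` alone prints the header, and -a / -n set the
# Autocycler directory and sample name; verify with `autocycler table --help`.
collect_metrics() {
    out_tsv=$1
    autocycler table > "$out_tsv"          # header line only
    for dir in */autocycler_out; do
        [ -d "$dir" ] || continue          # skip the pattern if nothing matched
        autocycler table -a "$dir" -n "${dir%/autocycler_out}" >> "$out_tsv"
    done
}
```

Running `collect_metrics metrics.tsv` from the directory that holds the per-sample folders would then give one row per genome.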

If you want to automate the entire Autocycler assembly process, take a look at the pipelines directory in the Autocycler repo. It contains user-contributed pipelines designed to simplify and streamline running Autocycler. Feel free to use, modify or contribute your own!
