
Quick start

Ryan Wick edited this page Mar 16, 2023 · 26 revisions

In brief, the steps involved in getting a Trycycler consensus assembly are:

  • Prepare the input files:
    • Save your long reads as reads.fastq (gzipped reads also work).
    • Run Trycycler subsample to create multiple read subsets:
      • trycycler subsample --reads reads.fastq --out_dir read_subsets
    • Assemble each of those subsets (ideally using a few different assemblers) to produce the input assemblies for Trycycler. These should all be very similar (because they are from the same genome) but not quite identical (because they are from different read subsets). Save them as assemblies/*.fasta.
    • Optionally, manually curate your input assemblies. For example, view them in Bandage to see which look complete (and are therefore suitable for use in Trycycler) and which are fragmented (and should be discarded).
  • Run Trycycler cluster to group similar contigs together:
    • trycycler cluster --assemblies assemblies/*.fasta --reads reads.fastq --out_dir trycycler
  • Manually inspect the clusters to decide which are valid:
    • For this example, we'll assume cluster_001, cluster_002 and cluster_003 are the good clusters which represent replicons for which we want a consensus.
    • Delete or rename all other cluster directories (so you can glob for the good clusters with trycycler/cluster_*).
  • Run Trycycler reconcile on each of the clusters:
    • trycycler reconcile --reads reads.fastq --cluster_dir trycycler/cluster_001
    • trycycler reconcile --reads reads.fastq --cluster_dir trycycler/cluster_002
    • trycycler reconcile --reads reads.fastq --cluster_dir trycycler/cluster_003
    • For these commands to complete, it may be necessary to delete or repair some of the cluster sequences.
    • If any clusters are not reconciling well, you can use trycycler dotplot to visualise how the sequences relate to each other, which can guide any manual fixes you need to make.
  • Run Trycycler MSA on each of the clusters:
    • trycycler msa --cluster_dir trycycler/cluster_001
    • trycycler msa --cluster_dir trycycler/cluster_002
    • trycycler msa --cluster_dir trycycler/cluster_003
  • Run Trycycler partition to divide up the reads:
    • trycycler partition --reads reads.fastq --cluster_dirs trycycler/cluster_*
  • Run Trycycler consensus to make a consensus sequence for each contig cluster:
    • trycycler consensus --cluster_dir trycycler/cluster_001
    • trycycler consensus --cluster_dir trycycler/cluster_002
    • trycycler consensus --cluster_dir trycycler/cluster_003
  • Combine all consensus sequences into a single FASTA:
    • cat trycycler/cluster_*/7_final_consensus.fasta > trycycler/consensus.fasta
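
The per-cluster commands above repeat the same call for each cluster directory, so they can be wrapped in a loop. The following is only a minimal sketch: for_each_cluster is a hypothetical helper (not part of Trycycler), and it assumes the unwanted clusters have already been deleted or renamed so that trycycler/cluster_* matches only the good ones. It stops at the first failing command, since reconcile in particular may need manual intervention before it completes.

```shell
#!/usr/bin/env bash
# Sketch: run one Trycycler subcommand on every remaining cluster directory.
# for_each_cluster is a hypothetical helper, not something Trycycler provides.
# Usage: for_each_cluster <out_dir> <subcommand> [extra options...]
for_each_cluster() {
    local out_dir="$1"
    local subcommand="$2"
    shift 2
    local cluster_dir
    for cluster_dir in "$out_dir"/cluster_*; do
        # Skip non-directories (and the literal pattern if nothing matched).
        [ -d "$cluster_dir" ] || continue
        # Stop at the first failure so the cluster can be inspected and fixed.
        trycycler "$subcommand" "$@" --cluster_dir "$cluster_dir" || return 1
    done
}

# Example calls, mirroring the commands listed above:
#   for_each_cluster trycycler reconcile --reads reads.fastq
#   for_each_cluster trycycler msa
#   trycycler partition --reads reads.fastq --cluster_dirs trycycler/cluster_*
#   for_each_cluster trycycler consensus
#   cat trycycler/cluster_*/7_final_consensus.fasta > trycycler/consensus.fasta
```

Running each stage for all clusters before moving to the next keeps the order shown in the list: partition is run once across all clusters, after every cluster has been through reconcile and MSA.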

For more information, see the wiki page for each of the steps involved.

If you're new to Trycycler, I'd recommend trying it out on the Demo datasets to get some practice.
