Tessa Alexanian edited this page Aug 23, 2024 · 9 revisions

Welcome to the commec wiki!

commec is a tool for DNA sequence screening that is part of the Common Mechanism.

commec provides three main subcommands:

  1. screen: Run Common Mechanism screening on an input FASTA file.
  2. flag: Parse all .screen files in a directory and create two CSV files of flags raised.
  3. split: Split a multi-record FASTA file into individual files, one for each record.
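
As an illustration of what splitting a multi-record FASTA file involves, here is a minimal Python sketch. It is not commec's implementation: the function names `split_fasta_text` and `write_records`, the `.fasta` output suffix, and the choice to name each file after the first token of the record header are all assumptions made for this example.

```python
from pathlib import Path


def split_fasta_text(text):
    """Return a list of (record_id, record_text) pairs, one per FASTA record."""
    records = []
    header, body = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, body))
            header, body = line, []
        elif header is not None and line.strip():
            body.append(line.strip())
    if header is not None:
        records.append((header, body))

    result = []
    for header, body in records:
        # Assumption: the first whitespace-delimited token after ">" is the ID.
        tokens = header[1:].split()
        record_id = tokens[0] if tokens else "unnamed"
        result.append((record_id, "\n".join([header, *body]) + "\n"))
    return result


def write_records(fasta_path, out_dir):
    """Write each record of fasta_path to its own file in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for record_id, record_text in split_fasta_text(Path(fasta_path).read_text()):
        # Replace characters that are unsafe in file names.
        safe = "".join(c if c.isalnum() or c in "-_." else "_" for c in record_id)
        target = out_dir / f"{safe}.fasta"
        target.write_text(record_text)
        written.append(target)
    return written
```

Note that in this sketch two records with the same ID would overwrite each other; a real tool would need to handle that case.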

The tool is designed to:

  • Screen sequences down to 50 base pairs in length.
  • Sensitively identify sequences of concern known to contribute to pathogenicity.
  • Flag regulated pathogens, including those on the Australia Group Common Control List and on a variety of national control lists, such as those of India, China, and South Africa.

The screen command runs the input FASTA through four steps:

  1. Biorisk scan (uses a hmmer search against custom databases)
  2. Regulated protein homology search (uses a BLASTX or DIAMOND search against NCBI nr)
  3. Regulated nucleotide homology search (uses BLASTN against NCBI nt)
  4. Benign scan (uses hmmer, cmscan and BLASTN against custom databases)

The biorisk scan, which identifies sequences of concern, runs in seconds on a laptop and uses less than 1 GB of curated databases. The complete protein homology search is designed for high-performance bioinformatics environments and requires 275-650 GB of reference databases and at least 20 GB of RAM. See the Install Guide for more details.
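
Before downloading the reference databases, it can help to confirm that the target volume has room for them. The sketch below is one way to do that with Python's standard library. It is not part of commec: the function names are hypothetical, the default of 650 is the upper figure quoted above, and treating that figure as decimal gigabytes is an assumption.

```python
import shutil

GB = 10**9  # assumption: sizes are decimal gigabytes


def has_enough_space(free_bytes, required_gb):
    """Return True if free_bytes covers required_gb gigabytes."""
    return free_bytes >= required_gb * GB


def check_database_dir(path, required_gb=650):
    """Check the free space on the volume that holds path."""
    free_bytes = shutil.disk_usage(path).free
    return has_enough_space(free_bytes, required_gb)
```

For example, `check_database_dir("/data/commec-dbs")` (a hypothetical path) returns `False` when the volume has less than 650 GB free. This checks disk space only; the RAM requirement needs a separate check.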
