Command Line Usage

Tessa Alexanian edited this page Aug 23, 2024 · 1 revision

This guide provides instructions on how to use commec for DNA sequence screening. commec provides three main subcommands:

# Run screening on a FASTA file
commec screen -d /path/to/databases input.fasta

# Parse screen files and generate flag CSVs
commec flag /path/to/directory/with/screen/files

# Split a multi-record FASTA file into individual files, one for each record
commec split input.fasta

screen

To screen a FASTA file, run:

commec screen -d /path/to/databases input.fasta

screen has two required arguments:

  • input FASTA file: Path to FASTA file to screen
  • -d, --databases: Path to the directory containing the required databases

Optional arguments:

  • -o, --output: Output prefix (default: input filename)
  • -t, --threads: Number of threads to use (default: 1)
  • -p, --protein-search-tool: Tool for homology search (choices: "blastx", "diamond", default: "blastx")

Flags:

  • -f, --fast: Run in fast mode (skip homology search)
  • -n, --skip-nt: Skip nucleotide search if no protein hits are found
  • -c, --cleanup: Delete intermediate files after screening
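
The options above can be combined in a single call. As a minimal sketch, assuming a directory of .fasta files that should each be screened with the same settings, a small wrapper might look like the following. The function name screen_dir and the COMMEC variable are hypothetical helpers, not part of commec; only the -d, -t, -p and -c options documented above are used.

```shell
# Hypothetical helper: run `commec screen` on every .fasta file in a directory.
# COMMEC can be overridden (for example COMMEC=echo) to preview the commands.
COMMEC="${COMMEC:-commec}"

screen_dir() {
    db_dir="$1"         # directory containing the required databases
    fasta_dir="$2"      # directory holding the FASTA files to screen
    threads="${3:-1}"   # matches the documented default of 1 thread

    for fasta in "$fasta_dir"/*.fasta; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$fasta" ] || continue
        # -p diamond selects DIAMOND for the protein search,
        # -c deletes intermediate files after screening.
        "$COMMEC" screen -d "$db_dir" -t "$threads" -p diamond -c "$fasta"
    done
}
```

Calling screen_dir /path/to/databases ./sequences 8 would then screen each file in ./sequences with 8 threads. Setting COMMEC=echo first prints the commands instead of running them.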

flag

Pass the directory containing the .screen files produced by commec screen to flag to generate two output CSVs. flags.csv has the following columns:

filename:                 .screen file basename
biorisk:                  "F" if flagged, "P" if no flags
virulence_factor:         "F" if flagged, "P" if no flags
regulated_virus:          "F" if flagged, "P" if no flags, "Err" if error logged
regulated_bacteria:       "F" if flagged, "P" if no flags, "Err" if error logged
regulated_eukaryote:      "F" if flagged, "P" if no flags, "Err" if error logged
mixed_regulated_non_reg:  "F" if flagged, "P" if no flags, "Err" if error logged
benign:                   "F" if not cleared, "P" if all cleared, "-" if not run

These flags are based on the biorisk scan (determining the "biorisk" and "virulence_factor" fields), the protein and nucleotide homology scans (determining the "regulated" fields) and the benign scan (determining the "benign" field).
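
As a rough illustration of working with this output, the sketch below lists the files flagged in one chosen column. It assumes that flags.csv is comma-separated, starts with a header row using the column names listed above, and contains no commas inside values; none of this is confirmed here, so check it against a real file. The function name flagged_in is hypothetical.

```shell
# Hypothetical helper: print the filename of every row with "F" in a column.
# Assumes a comma-separated file with a header row and no embedded commas.
flagged_in() {
    column="$1"   # header name, e.g. biorisk or regulated_virus
    csv="$2"      # path to flags.csv

    awk -F',' -v col="$column" '
        # Drop a trailing carriage return and any double quotes.
        { sub(/\r$/, ""); gsub(/"/, "") }
        NR == 1 {
            # Locate the requested column by its header name.
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        $idx == "F" { print $1 }
    ' "$csv"
}
```

For example, flagged_in biorisk flags.csv prints one filename per line for every record flagged by the biorisk scan. Note that in the benign column "F" means "not cleared", so that column reads differently from the others.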

The flags_recommended CSV has just two columns, "filename" and "recommend_flag_or_pass". The recommendation is based on the following decision flow:

[Figure: Flowchart showing decision-making by the common mechanism flag module.]

For any questions or issues, please contact [email protected] or open an issue on our GitHub repository.
