Viewing and exporting data

Ted Verhey edited this page Nov 7, 2017 · 3 revisions

Viewing database summaries

Database summaries are diagnostic views of the contents and organization of a database.

Reference summary

The command vast dbstats --references displays the following information about each reference:

  • Name
  • Sequence
  • Offset
  • Annotated variable region coordinates

If silent cassette sequences have been included for the reference, it also displays the following information for each cassette:

  • Name
  • Sequence
  • Length
  • Aligned / Not Aligned

Read data summary

The command vast dbstats --ontology displays the ontology of reads in the database. A database used with the most recent version of VAST should only ever contain one ontology. For each ontology, the output shows the number of reads and the column headers.

Usage

usage: vast.py dbstats [-h] [--references] [--ontology]

optional arguments:
  -h, --help        show this help message and exit
  --references, -r  Display information about the reference and cassette
                    sequences in the database.
  --ontology, -o    Display the ontology for reads in the database.
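
As an illustration of how the two flags in the usage text combine, the following is a minimal argparse sketch that mirrors the help output above. It is not the actual VAST source; the function name build_parser and the dest name "command" are assumptions made for this example, and only the flags and help strings come from the usage text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the dbstats interface shown in the usage text (illustrative only)."""
    parser = argparse.ArgumentParser(prog="vast.py")
    subparsers = parser.add_subparsers(dest="command")

    # The dbstats subcommand takes two independent on/off flags.
    dbstats = subparsers.add_parser("dbstats")
    dbstats.add_argument(
        "--references", "-r", action="store_true",
        help="Display information about the reference and cassette "
             "sequences in the database.",
    )
    dbstats.add_argument(
        "--ontology", "-o", action="store_true",
        help="Display the ontology for reads in the database.",
    )
    return parser


if __name__ == "__main__":
    # Both flags are optional and not mutually exclusive in the usage line,
    # so they can be requested together.
    args = build_parser().parse_args(["dbstats", "-r", "-o"])
    print(args.command, args.references, args.ontology)
```

Because each flag is a simple switch, passing neither leaves both values False; what VAST itself prints in that case is not stated on this page.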

Exporting references

References in the database can be exported into standard formats. vast export_references populates the my_database\References\ directory with up to two files for each reference:

  • a FASTA file containing the name and sequence of the reference
  • if cassette sequences were aligned to the reference, a .bam file containing the mappings of the cassettes. (Note that pysam must be installed for this file to be written.)
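
To check an export, the FASTA output can be read back with a few lines of standard-library Python. The sketch below assumes the exported files follow the usual FASTA layout (a ">" header line followed by sequence lines); the helper names read_fasta and pysam_available are hypothetical and are not part of VAST. The second helper only reports whether pysam can be imported, which the note above says is needed for the .bam file.

```python
import importlib.util


def read_fasta(path):
    """Return a list of (name, sequence) tuples from a FASTA file."""
    records = []
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            elif name is not None:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def pysam_available():
    """Report whether the pysam package can be imported in this environment."""
    return importlib.util.find_spec("pysam") is not None
```

For example, calling read_fasta on one of the files in the References directory (the exact file names are not given on this page) should return a single record holding the reference's name and sequence.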