Marcus Fedarko edited this page Nov 6, 2018 · 14 revisions

Welcome to the MetagenomeScope wiki! This wiki is a work in progress, so if you have any questions, feel free to get in contact.

About MetagenomeScope

Screenshot of MetagenomeScope's standard mode, showing a region of a biofilm assembly graph

MetagenomeScope is an interactive visualization tool designed for metagenomic sequence assembly graphs. The tool aims to display a hierarchical layout of the input graph while emphasizing the presence of small-scale details that can correspond to interesting biological features in the data.

To this end, MetagenomeScope highlights certain "structural patterns" of contigs in the graph, splits the graph into its connected components (displaying only one connected component at a time), and uses Graphviz's dot tool to lay out each connected component hierarchically.
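As an illustration of the component-splitting step, here is a minimal sketch in plain Python. It is not MetagenomeScope's actual implementation: the function name `connected_components`, the toy contig IDs, and the largest-first ordering are all assumptions made for this example. Edges are treated as undirected when grouping contigs.

```python
from collections import defaultdict, deque


def connected_components(nodes, edges):
    """Return a list of sets, one per (weakly) connected component."""
    # Treat every directed edge as undirected for the purpose of grouping.
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbor in neighbors[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)

    # Largest components first (an ordering assumed for this sketch only).
    components.sort(key=len, reverse=True)
    return components


# Toy graph with hypothetical contig IDs.
contigs = ["c1", "c2", "c3", "c4", "c5"]
links = [("c1", "c2"), ("c2", "c3"), ("c4", "c5")]
components = connected_components(contigs, links)
```

Each resulting set could then be laid out and displayed on its own, which is the idea behind showing one connected component at a time.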

MetagenomeScope also includes several other features intended to simplify exploratory analysis of assembly graphs, including tools for scaffold visualization, path finishing, and (optionally) SPQR tree decomposition of biconnected components in the graph.

MetagenomeScope is composed of two main components:

  1. The preprocessing script (contained in the graph_collator/ directory of this repository), written mostly in Python, takes an assembly graph file as input and produces a SQLite .db file that can be visualized in the viewer interface. collate.py is the main script to run. This preprocessing step handles structural pattern detection, graph layout, and (optionally) SPQR tree generation.

    • Currently, the script supports LastGraph (Velvet), GML (MetaCarvel), and GFA input files. Support for SPAdes FASTG files is expected to follow soon.
    • See this page on MetagenomeScope's wiki for information on the system requirements for the preprocessing script.
    • If the -spqr option is passed to collate.py, it uses the C++ code in spqr.cpp to interface with OGDF to generate SPQR tree decompositions of biconnected components in the graph for MetagenomeScope's "decomposition mode." Since this requires some C++ code to be compiled, the use of -spqr in MetagenomeScope necessitates a few extra system requirements. See this page on MetagenomeScope's wiki for more information on building SPQR functionality for the preprocessing script.
  2. The viewer interface (contained in the viewer/ directory of this repository), a client-side web application that reads a .db file generated by collate.py and renders the resulting graph using Cytoscape.js. The viewer interface includes a "control panel" supporting various features for interacting with the graph.

    • Since MetagenomeScope's viewer interface is a client-side web application, you should be able to access it from most modern web browsers (mobile browsers also work, although using a desktop browser is generally recommended), either locally (if the viewer interface code is downloaded on your computer) or over HTTP/HTTPS (if the viewer interface code is hosted on a server).
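Because the viewer is purely client-side, the hosted case only needs a static file server. The sketch below uses Python's standard library and is illustrative only: the helper name `make_viewer_server`, the default host and port, and the assumption that the viewer files sit in a local directory are not part of MetagenomeScope itself.

```python
import functools
import http.server


def make_viewer_server(viewer_dir, host="127.0.0.1", port=8000):
    """Build (but do not start) a static file server rooted at viewer_dir."""
    # SimpleHTTPRequestHandler accepts a `directory` keyword (Python 3.7+),
    # so the server does not depend on the current working directory.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=viewer_dir
    )
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Calling `make_viewer_server("viewer").serve_forever()` from the repository root would then serve the viewer files at http://127.0.0.1:8000/ until interrupted. For a public deployment, a production web server would be the more usual choice.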

Splitting the tool into these two components has proved useful when analyzing large graphs:

  • The user can save a .db file generated by the preprocessing script and visualize it any number of times afterward, without repeating the costs of layout, pattern detection, etc.
  • The user can host the viewer interface and a number of .db files on a server, allowing many users to view graphs while incurring only the cost of rendering the graphs in question.
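The "compute once, view many times" idea can be sketched with Python's built-in sqlite3 module. The schema here (tables `nodes` and `edges`, with x/y coordinates) and the helper names `write_layout` and `read_layout` are hypothetical, chosen only for illustration. They do not describe the actual contents of the .db files that collate.py produces.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def write_layout(db_path, nodes, edges):
    """Expensive step, run once: persist precomputed positions and edges."""
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commits the transaction on success
            # Hypothetical schema, not MetagenomeScope's real one.
            conn.execute(
                "CREATE TABLE nodes (id TEXT PRIMARY KEY, x REAL, y REAL)"
            )
            conn.execute("CREATE TABLE edges (source TEXT, target TEXT)")
            conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", nodes)
            conn.executemany("INSERT INTO edges VALUES (?, ?)", edges)


def read_layout(db_path):
    """Cheap step, repeatable: load the stored layout without recomputing it."""
    with closing(sqlite3.connect(db_path)) as conn:
        nodes = conn.execute(
            "SELECT id, x, y FROM nodes ORDER BY id"
        ).fetchall()
        edges = conn.execute(
            "SELECT source, target FROM edges ORDER BY source, target"
        ).fetchall()
    return nodes, edges


# Write once with toy data, then read as often as needed.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")
write_layout(db_path, [("c1", 0.0, 0.0), ("c2", 1.0, 2.5)], [("c1", "c2")])
first_view = read_layout(db_path)
second_view = read_layout(db_path)
```

Both reads return the same stored data, so the layout work done in the write step is never repeated.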