QCSB/PROSO-Toolbox

PROSO Toolbox

A Computational Toolbox for Context-Specific Genome-Scale Modelling
Project Wiki »
Report Issues »

NOTE: Please always use and refer to QCSB Release

Table of Contents
  1. About PROSO
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Cite PROSO
  6. License
  7. Contact

About PROSO

PROSO Toolbox is a collection of functions for processing, interpreting, and studying cellular multi-omics data within the framework of genome-scale models (GEMs).

What PROSO Toolbox offers:

  • Automatic implementation of protein constraints in any genome-scale metabolic model (M-model)
  • System-level estimation of enzymatic constants
  • Incorporation of gene expression data into GEMs for context-specific modelling
  • Suggestion of synthetic biology strategies for biotechnology, infectious disease, cancer research, and more

More information on PROSO Toolbox's intuition, formulation, and execution is available in our publication.

(back to top)

Getting Started

PROSO Toolbox can be set up easily as follows.

Prerequisites

PROSO Toolbox runs in MATLAB. The walkthrough under Usage also relies on the COBRA Toolbox and an optimization solver such as Gurobi or IBM CPLEX.

Installation

  1. Clone this repository to your computer

  2. In the MATLAB command window, add the PROSO directory to the path

    >> addpath("Path-to-PROSO-Folder")
    >> savepath
  3. PROSO is now ready to use
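
A quick way to confirm the path change took effect is to ask MATLAB whether it can see a PROSO function. The sketch below uses pcModel, the function called in the Usage section; it assumes that function sits in a folder that addpath actually covered.

```matlab
% Sanity check after installation: is a PROSO function visible to MATLAB?
% 'pcModel' is the function called in the Usage walkthrough.
onPath = exist('pcModel', 'file') == 2;   % 2 means "a file on the search path"
if onPath
    fprintf('PROSO found: %s\n', which('pcModel'));
else
    warning('pcModel not found. Re-check the folder passed to addpath.');
end
```

If the check fails and the functions live in subfolders of the repository (an assumption about the layout), addpath(genpath("Path-to-PROSO-Folder")) may be needed instead.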

(back to top)

Usage

Here we demonstrate only a simple construction of a protein-constrained model (PC-model) from a Pseudomonas aeruginosa M-model. Although not context-specific by itself, a PC-model is an 'upgraded' M-model that is useful in research in its own right.

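  1. Prepare the input data (the M-model file iSD1509.xml and the protein FASTA file Pseudomonas_aeruginosa_UCBPP-PA14_109.faa used below), or find them under PROSO/tutorial. Make sure they are on the MATLAB path or in your working directory.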

  2. Construct draft PC-model from M-model

    • Open MATLAB and make sure everything is installed correctly. Initialize the COBRA Toolbox and change the default solver to Gurobi (or IBM CPLEX).

      >> initCobraToolbox(false);
      >> changeCobraSolver('gurobi','all',0);
    • Construct the draft PC-model

      We implement protein constraints on iSD1509, with a protein budget of 150 mg/gDW.

      >> model_ori = readCbModel('iSD1509.xml');
      >> [model_pc_draft,fullProtein,fullCplx,C_matrix,K_matrix,proteinMM] = pcModel(model_ori,'Pseudomonas_aeruginosa_UCBPP-PA14_109.faa',150);

      This will take several minutes to complete.

      The M-model has 1510 genes (with one dummy gene), 1642 metabolites, and 2023 reactions.

      Note that the resulting draft PC-model has 7519 'metabolites' (1642 true metabolites + 1510 proteins + 1250 complexes + 1558 forward enzymes + 1558 reverse enzymes + proteinWC) and 12487 'reactions' (2023 true reactions + 1510 protein dilutions + 1250 complex formations + 4588 enzyme formations + 3116 enzyme dilutions). This structure does not change during tuning; only the coefficients are modified.
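
The component counts above can be summed to check the reported model size. The sketch below does that arithmetic and, only if model_pc_draft is already in the workspace, compares it with the size of the stoichiometric matrix (the S field of a COBRA model struct). The variable names for the counts are illustrative.

```matlab
% Expected dimensions of the draft PC-model, from the component counts above.
nTrueMets = 1642; nProteins = 1510; nComplexes = 1250;
nEnzFwd = 1558;   nEnzRev = 1558;
expectedMets = nTrueMets + nProteins + nComplexes + nEnzFwd + nEnzRev + 1; % +1 for proteinWC
expectedRxns = 2023 + nProteins + nComplexes + 4588 + 3116;
fprintf('Expected size: %d metabolites x %d reactions\n', expectedMets, expectedRxns);

% Compare against the model only if it exists in the workspace.
if exist('model_pc_draft', 'var')
    [nMets, nRxns] = size(model_pc_draft.S);   % S is the stoichiometric matrix
    fprintf('Actual size:   %d metabolites x %d reactions\n', nMets, nRxns);
end
```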

  3. Tune the draft PC-model for better performance

    • Manually adjust protein complex stoichiometry

      This step usually relies on a database. For example, complex information extracted from the MetaCyc PA14 database can be used to curate the draft PC-model. Users should weigh the accuracy of each source carefully, as almost nothing is guaranteed to be completely accurate.

      The ATP synthase (ATPS) complex is a large protein complex with 9 subunits. Use surfNet to inspect it in the PC-model:

      >> surfNet(model_pc_draft,'cplxForm_x(193)x(197)x(195)x(198)x(200)x(199)x(196)x(194)x(192)');

      You can trace complex -> enzyme -> reaction to confirm that it is the ATPS complex, or go in the reverse direction to find the complexes for a given reaction.

      For example, to give each ATPS complex two copies of subunit alpha (atpA, PA14_73260), first locate both the protein and the complex in their respective lists:

      >> pIdx = find(strcmp(fullProtein,'PA14_73260'));
      >> cIdx = find(C_matrix(pIdx,:));

      The entry to change belongs to protein #193 (pIdx) and complex #178 (cIdx). Change the coefficient from 1 to 2:

      >> C_matrix(pIdx,cIdx) = 2;

      Finish all subunit modifications before proceeding to the next step.
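
One thing to watch when editing C_matrix this way: a protein can belong to more than one complex, in which case find returns several column indices and a single assignment would change all of them. The toy sketch below uses hypothetical protein names and a 3 x 2 matrix, and assumes the layout implied by the indexing above (proteins in rows, complexes in columns).

```matlab
% Toy data (hypothetical names): 3 proteins (rows) x 2 complexes (columns).
fullProtein = {'geneA'; 'geneB'; 'geneC'};
C_matrix = sparse([1 0; 1 1; 0 1]);

pIdx = find(strcmp(fullProtein, 'geneB'));   % row of the protein
cIdx = find(C_matrix(pIdx, :));              % every complex containing it

if numel(cIdx) > 1
    fprintf('Protein %d appears in %d complexes: %s\n', ...
        pIdx, numel(cIdx), mat2str(cIdx));
end

% Edit only the intended complex (here, arbitrarily, the second column).
cTarget = 2;
C_matrix(pIdx, cTarget) = 2;
disp(full(C_matrix));
```

Checking numel(cIdx) before assigning keeps an edit meant for one complex from silently changing others.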

    • Estimate enzymatic rate constants using SASA

      Now that all protein complexes (C_matrix) have been modified, their rate constants can be estimated automatically as follows.

      >> K_matrix = estimateKeffFromMW(C_matrix,K_matrix,proteinMM);

      This gives us an updated kinetic matrix to implement.

    • Update PC-model

      Write the new C_matrix and K_matrix back into the PC-model.

      >> model_pc = adjustStoichAndKeff(model_pc_draft,C_matrix,K_matrix);

      This will take some time to complete.

  4. What the PC-model does

    The PC-model 'soft-caps' system-level activity by constraining the total amount of protein in the system.

    >> FBAsol = optimizeCbModel(model_ori,'max');
    >> FBAsol_pc = optimizeCbModel(model_pc,'max');

    The optimal growth rate of the PC-model (FBAsol_pc.f) is smaller than that of the M-model (FBAsol.f); the corresponding flux vectors are in FBAsol_pc.v and FBAsol.v. In general, PC-FBA better resembles an organism's true exponential-phase metabolism.
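
The capping effect can be illustrated with a single hypothetical reaction. This is only a sketch of the general idea of a protein budget, not PROSO's actual formulation, and none of the numbers except the 150 mg/gDW budget come from the walkthrough above.

```matlab
% Hypothetical single-reaction example; keff, MW and vmax are made up.
vmax   = 1000;     % flux bound of the reaction in the M-model (mmol/gDW/h)
keff   = 234000;   % effective rate constant (1/h)
MW     = 50000;    % enzyme molar mass (mg/mmol, numerically equal to g/mol)
budget = 150;      % protein budget (mg/gDW)

% If the whole budget went to this one enzyme, its abundance would be
% budget/MW (mmol/gDW), so the flux it can carry is keff * budget / MW.
vProteinCap = keff * budget / MW;

v_M  = vmax;                     % M-model: only the flux bound applies
v_PC = min(vmax, vProteinCap);   % PC-model: the protein budget also applies
fprintf('M-model flux: %g, PC-model flux: %g\n', v_M, v_PC);
```

In a full model the budget is shared among all enzymes, so the cap on any one flux depends on how much protein the rest of the network needs.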

These are only the most basic functions. For more examples, please refer to the Project Wiki.

(back to top)

Roadmap

PROSO is an ongoing project, with plans to refine it and expand its scope.

  • Version 1.0

    • Automated PC-model Construction from M-model
    • Convex QP for expression data incorporation
    • Nonconvex QP for kinetic parameter estimation
    • Debottlenecking algorithm
    • Finishing README, wiki, license, etc.
  • Version 2.0

    • Implementing more mechanistic details
    • Allowing incorporation of other omics data
    • Other approaches for kinetic parameter estimation

(back to top)

Cite PROSO

Please cite our latest publications:

  • Yao, H., Dahal, S., & Yang, L. (2023). Novel context-specific genome-scale modelling explores the potential of triacylglycerol production by Chlamydomonas reinhardtii. Microbial Cell Factories, 22(1), 1-16.
  • Yao, H., & Yang, L. (2023). PROSO Toolbox: a unified protein-constrained genome-scale modelling framework for strain designing and optimization. arXiv preprint arXiv:2308.14869.

(back to top)

License

Distributed under the GNU General Public License v3. Please see LICENSE for more information.

(back to top)

Contact

Herbert Yao - [email protected]

Laurence Yang - [email protected]

Queen's Computational Systems Biology Group, Department of Chemical Engineering, Queen's University at Kingston, Canada

(back to top)
