Welcome to the GitHub repository for the following publication: The genetic architecture of protein stability (Faure AJ et al., 2023)
Here you'll find an R package with all scripts to reproduce the figures and results from the computational analyses described in the paper.
- 1. Required Software
- 2. Required Data
- 3. Installation Instructions
- 4. Usage
To run the archstabms pipeline you will need the following software and associated packages:
- R (Biostrings, Cairo, bio3d, data.table, GGally, ggplot2, ggrepel, hexbin, plot3D, ppcor, reshape2, reticulate, ROCR, scales)
The software has been tested using R v4.2.0.
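Before installing, it can help to confirm that the command-line tools are on your PATH. The following is a minimal shell sketch, not part of the archstabms package: `check_tools` is a hypothetical helper name, and the tool list (R, plus the git and conda used by the installation steps) is taken from this README.

```shell
# Hypothetical helper (not part of archstabms): report which of the given
# commands are missing from PATH. Returns non-zero if any are missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Check the tools this README relies on; print a hint if any are absent.
check_tools R git conda || echo "Install the missing tools before continuing."
```

This only checks that the commands exist; it does not verify the R version or the installed R packages.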
Fitness scores, inferred free energy changes, and required miscellaneous files should be downloaded from here and unzipped in your project directory (see the 'base_dir' option), i.e. the directory where output files will be written.
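A quick sanity check on the project directory can catch a missing or empty download before the pipeline is started. This is a minimal shell sketch under stated assumptions: `check_project_dir` is a hypothetical helper name, `MY_PROJECT_DIRECTORY` is the placeholder path used in the Usage section, and the check only tests that the directory exists and is non-empty, not that specific data files are present.

```shell
# Hypothetical helper (not part of archstabms): succeed only if the given
# path is an existing directory that contains at least one entry.
check_project_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  if [ -z "$(ls -A "$dir")" ]; then
    echo "directory is empty: $dir" >&2
    return 1
  fi
  echo "ok: $dir"
}

# Replace MY_PROJECT_DIRECTORY with the path you will pass as base_dir.
check_project_dir "MY_PROJECT_DIRECTORY" || echo "Download and unzip the required data first."
```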
Make sure you have git and conda installed and then run (expected install time <5min):
# Install dependencies manually (preferably in a fresh conda environment)
conda install -c conda-forge bioconductor-biostrings cairo "r-base>4.0.0" r-bio3d r-cairo r-data.table r-devtools r-ggally r-ggplot2 r-ggrepel r-hexbin r-plot3d r-ppcor r-reshape2 r-reticulate r-rocr r-roxygen2 r-scales
# Open an R session and install the archstabms R package
devtools::install_github("lehner-lab/archstabms")
The top-level function archstabms() is the recommended entry point to the pipeline. By default it reproduces the figures and results from the computational analyses described in the publication: The genetic architecture of protein stability (Faure AJ et al., 2023). See Required Data for instructions on obtaining all required data and miscellaneous files before running the pipeline. Expected run time: <20 min.
library(archstabms)
archstabms(base_dir = "MY_PROJECT_DIRECTORY")