Required Software

To run the pdzextms pipeline you will need the following software and associated packages:

R >=v3.6.1 (bio3d, Biostrings, coin, Cairo, data.table, ggplot2, GGally, hexbin, plot3D, reshape2, RColorBrewer, ROCR, stringr, ggrepel)
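The minimum R version stated above can be checked from the shell before going further. This is a minimal sketch and not part of the pdzextms package: the helper names `version_ge` and `check_r_version` are hypothetical, and it assumes that `Rscript` is on your PATH and that your `sort` supports the `-V` (version sort) option, as GNU coreutils does.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Assumption: `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the R found on PATH meets the minimum version (3.6.1).
check_r_version() {
  min_version="3.6.1"
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH"
    return 1
  fi
  # getRversion() is base R; it returns the running R version.
  r_version="$(Rscript -e 'cat(as.character(getRversion()))')"
  if version_ge "$r_version" "$min_version"; then
    echo "R $r_version OK"
  else
    echo "R $r_version is older than $min_version"
    return 1
  fi
}
```

After defining both functions, run `check_r_version` inside the environment you intend to use for the pipeline.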

The following software is optional:

MoCHI (pipeline for fitting thermodynamic models to deep mutational scanning data)
DiMSum v1.2.8 (pipeline for pre-processing deep mutational scanning data i.e. FASTQ to counts)

Installation

We recommend using the pdzextms.yml file in this repository to create a dedicated Conda environment with all necessary dependencies (as explained below).

Install the Conda package/environment management system (if you already have Conda, skip ahead to cloning the repository):

On macOS, run:
```
$ curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
$ sh Miniconda3-latest-MacOSX-x86_64.sh
```
On Linux, run:
```
$ curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ sh Miniconda3-latest-Linux-x86_64.sh
```
IMPORTANT: If in doubt, respond with "yes" to the following question during installation: "Do you wish the installer to initialize Miniconda3 by running conda init?". In this case Conda will modify your shell scripts (~/.bashrc or ~/.bash_profile) to initialize Miniconda3 on startup. Ensure that any future modifications to your $PATH variable in your shell scripts occur before this code to initialize Miniconda3.

Clone the pdzextms GitHub repository:

$ git clone https://github.com/lehner-lab/pdzextms.git

Create and activate the pdzextms Conda environment:

$ conda env create -f pdzextms/pdzextms.yml
$ conda activate pdzextms
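To confirm that the environment was created, the output of `conda env list` can be scanned for its name. The sketch below is illustrative only: `has_conda_env` is a hypothetical helper, and it assumes the usual `conda env list` layout in which header lines start with `#` and the first column of every other line is the environment name.

```shell
# Succeed if the environment named $1 appears in `conda env list` output read from stdin.
has_conda_env() {
  awk -v name="$1" '
    /^#/ { next }               # skip header comment lines
    $1 == name { found = 1 }    # first column is assumed to be the environment name
    END { exit(found ? 0 : 1) }
  '
}
```

For example, `conda env list | has_conda_env pdzextms && echo "environment found"`.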

Open R and install pdzextms:

# Install
if(!require(devtools)) install.packages("devtools")
devtools::install("pdzextms")

# Load
library(pdzextms)

# Help
?pdzextms

Required Data

Download the fitness scores, thermodynamic models, pre-processed data and required miscellaneous files from here and unzip them in your project directory (see the 'base_dir' option), i.e., the directory where output files will be written.
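Before starting the pipeline, it can help to confirm that the project directory exists and is not empty. The following is a minimal sketch: `check_base_dir` is a hypothetical helper, and it only tests for the presence of any content, not for the specific files the pipeline expects.

```shell
# Report whether the project directory given as $1 exists and contains at least one entry.
check_base_dir() {
  base_dir="$1"
  if [ ! -d "$base_dir" ]; then
    echo "missing: $base_dir"
    return 1
  fi
  # `ls -A` lists all entries except . and .., so empty output means an empty directory.
  if [ -z "$(ls -A "$base_dir")" ]; then
    echo "empty: $base_dir"
    return 1
  fi
  echo "ok: $base_dir"
}
```

Call it with the same path you plan to pass as 'base_dir', for example `check_base_dir /path/to/project`.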

Pipeline Modes

The pdzextms pipeline can be run in the following modes, depending on user requirements.

Basic (default)

The default pipeline functionality ('startStage' = 1) uses pre-fit thermodynamic models and fitness scores from DMS experiments (already processed with MoCHI and DiMSum, respectively; see Required Data) to reproduce all figures in the publication.

Thermodynamic model inference with MoCHI

Pipeline stage 0 ('pdzextms_fit_thermo_model') fits thermodynamic models to DMS data.

Raw read processing

Raw read processing is not handled by the pdzextms pipeline. FASTQ files from paired-end sequencing of replicate deep mutational scanning (DMS) libraries before ('input') and after ('output') selection were processed using DiMSum (Faure and Schmiedel et al., 2020).

The DiMSum command-line arguments and experimental design files required to obtain variant counts from FASTQ files are available here.

Pipeline Stages

The top-level function pdzextms() is the recommended entry point to the pipeline and by default reproduces the figures and results from the computational analyses described in the following publication: The effects of PDZ domain extensions on energies, energetic couplings and allostery (Hidalgo-Carcedo C & Faure AJ et al., 2023). See Required Data for instructions on how to obtain all required data and miscellaneous files before running the pipeline.
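The entry point can also be called non-interactively from the shell. The sketch below assumes that pdzextms() accepts the 'startStage' and 'base_dir' options named in this README as arguments; check `?pdzextms` for the actual signature. The wrapper name `run_pdzextms` and the `DRY_RUN` environment variable are hypothetical and exist only for this illustration.

```shell
# Run the pipeline from stage $1 using project directory $2.
# With DRY_RUN=1, print the command instead of executing it.
run_pdzextms() {
  start_stage="$1"
  base_dir="$2"
  # Assumption: pdzextms() takes startStage and base_dir as named arguments.
  r_expr="library(pdzextms); pdzextms(startStage = ${start_stage}, base_dir = \"${base_dir}\")"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf "Rscript -e '%s'\n" "$r_expr"
  else
    Rscript -e "$r_expr"
  fi
}
```

For example, `DRY_RUN=1 run_pdzextms 1 /path/to/project` prints the command that would be run, and omitting `DRY_RUN` executes it inside the activated pdzextms environment.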

Stage 0: Fit thermodynamic models with MoCHI

This stage ('pdzextms_fit_thermo_model') fits thermodynamic models to variant fitness data from DMS (ddPCA) experiments.

Stage 1: Evaluate thermodynamic model results

This stage ('pdzextms_thermo_model_results') evaluates thermodynamic model results and performance, including a comparison with in vitro measurements from the literature (related to Figure 2).

Stage 2: Add 3D structure metrics

This stage ('pdzextms_structure_metrics') annotates single mutant inferred free energies with PDB structure-derived metrics.

Stage 3: Fitness plots

This stage ('pdzextms_fitness_plots') plots fitness distributions and scatterplots.

Stage 4: Plot fitness heatmaps

This stage ('pdzextms_fitness_heatmaps') plots single mutant fitness heatmaps.

Stage 5: Plot free energy scatterplots

This stage ('pdzextms_free_energy_scatterplots') plots single mutant free energy scatterplots.

Stage 6: Plot free energy heatmaps

This stage ('pdzextms_free_energy_heatmaps') plots single mutant free energy heatmaps.
