DiMSum

Welcome to the GitHub repository for DiMSum: An error model and pipeline for analyzing deep mutational scanning (DMS) data and diagnosing common experimental pathologies.

Overview

The DiMSum pipeline processes raw sequencing reads (in FASTQ format) or variant counts from deep mutational scanning (DMS) experiments to calculate estimates of variant fitness (and associated error). These estimates are suitable for downstream analyses such as quantifying epistasis, fitting interpretable models and determining protein structures.

The DiMSum pipeline consists of five stages grouped into two modules that can be run independently:

WRAP (DiMSum stages 1-3) processes raw FASTQ files, generating a table of variant counts
STEAM (DiMSum stages 4-5) analyses variant counts, generating variant fitness and error estimates

Further details of individual DiMSum pipeline stages can be found here.

Installation

The easiest way to install DiMSum is by using the bioconda package to create a dedicated Conda environment:

conda create -n dimsum r-base=4.0 fastqc=0.11 r-dimsum

See the full Installation Instructions for further details and alternative installation options.
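
Once the environment has been created, it can be activated with `conda activate dimsum`. The sketch below is one way to confirm that the expected executables are then visible on the PATH. `check_tool` is a hypothetical helper, not part of DiMSum, and the executable names `DiMSum` and `fastqc` are assumptions inferred from the usage example and the package list in the conda command above.

```shell
# Run after activating the environment:
#   conda activate dimsum

# Hypothetical helper (not part of DiMSum): report whether a command is on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found"
  fi
}

# Executable names are assumed; adjust them if your installation differs.
check_tool DiMSum
check_tool fastqc
```

If either tool is reported as not found, the environment is probably not active or the installation did not complete.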

Usage

In the example below, DiMSum will obtain variant sequences by aligning paired-end reads in the directory "FASTQ_dir", count variant occurrences for all samples specified in the supplied Experimental Design File ("experimentDesign.txt") and calculate fitness (and error) for all variants relative to the indicated wild-type sequence.

DiMSum --fastqFileDir FASTQ_dir --experimentDesignPath experimentDesign.txt --wildtypeSequence AGCTAGCT

By default, output files are saved to the folder "DiMSum_Project" in the current working directory.

See instructions regarding Command-line Arguments, File Formats and the Demo mode for full details and usage options.
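
Before starting a long run, it can help to confirm that the inputs named on the command line actually exist. The following is a minimal sketch under stated assumptions: `dimsum_dry_run` is a hypothetical wrapper, not part of DiMSum, and it only prints the command from the example above rather than executing it. The three flags are the ones shown in that example; no others are assumed.

```shell
# Hypothetical pre-flight helper (not part of DiMSum).
# Checks that the FASTQ directory and design file exist, then prints the
# DiMSum command instead of running it. The function body runs in a subshell,
# so "exit" only leaves the function, not the calling shell.
dimsum_dry_run() (
  fastq_dir=$1
  design_file=$2
  wt_seq=$3

  if [ ! -d "$fastq_dir" ]; then
    echo "missing FASTQ directory: $fastq_dir" >&2
    exit 1
  fi
  if [ ! -f "$design_file" ]; then
    echo "missing experimental design file: $design_file" >&2
    exit 1
  fi

  echo "DiMSum --fastqFileDir $fastq_dir --experimentDesignPath $design_file --wildtypeSequence $wt_seq"
)

# Example call with the values used above; it reports a missing input
# unless FASTQ_dir and experimentDesign.txt exist in the current directory.
dimsum_dry_run FASTQ_dir experimentDesign.txt AGCTAGCT || true
```

When both inputs are present, the printed line can be copied and run as is. The sketch does not quote paths in its output, so it assumes paths without spaces.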

Bugs

Bug reports are highly appreciated. You can submit one as an issue here on GitHub or send an email to [email protected].

Citing DiMSum

Please cite the following publication if you use DiMSum:

Faure, A.J., Schmiedel, J.M., Baeza-Centurion, P., Lehner, B. DiMSum: an error model and pipeline for analyzing deep mutational scanning data and diagnosing common experimental pathologies. Genome Biol 21, 207 (2020). doi:10.1186/s13059-020-02091-3

(Vector illustration credit: Vecteezy!)
