RBP sequence specificity

(The current version does not contain RNA sequence specificities. Experimental data will be provided upon submission. Consequently, only scripts that work without RNA binding specificities can be executed.)

This directory contains the code for "Reconstructing sequence specificities of RNA binding proteins across eukaryotes".

We used a joint linear embedding approach to model the relationship between protein sequence and RNA sequence specificity.

Recommended: install Anaconda and create a virtual environment with Python 2.7, then add the dependencies listed in dependencies.txt.
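The recommended setup could be scripted roughly as follows. This is a minimal sketch, not the authors' procedure: the environment name `rbp27` is hypothetical, and dependencies.txt is assumed to be a pip-style requirements file (if it is a conda package list instead, the install line needs to change accordingly).

```shell
#!/usr/bin/env bash
# Sketch only: "rbp27" is a hypothetical environment name, and
# dependencies.txt is ASSUMED to be a pip-style requirements file.
ENV_NAME="${ENV_NAME:-rbp27}"

setup_env() {
    # Fail early if conda is not on the PATH.
    command -v conda >/dev/null 2>&1 || { echo "conda not found" >&2; return 1; }
    # Create a Python 2.7 environment.
    conda create -y -n "$ENV_NAME" python=2.7 || return 1
    # Install the listed dependencies inside that environment.
    conda run -n "$ENV_NAME" pip install -r dependencies.txt
}
```

Calling `setup_env` from the repository root creates the environment; nothing is installed until the function is invoked.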

Additional requirements:

python3 and scikit-learn==0.23.2 (only for agglomerative_clustering.py)
HMMER (http://hmmer.org/)
conservation_code (Capra JA and Singh M. Predicting functionally important residues from sequence conservation. Bioinformatics, 23(15):1875-82, 2007; https://compbio.cs.princeton.edu/conservation/)
PyMOL (https://pymol.org/2/) to visualize individual PDB structures

Note: Executing the "full" pipeline (i.e. every intermediate step) requires very long running times and large amounts of memory. Parallel execution is recommended.

To run every step and modify intermediate results, set $full=1 in the individual bash scripts.

RUN: Before reconstructing the figures with fig1.sh to fig5.sh, execute the data processing scripts (rncmpt_data.sh, performance_calc.sh, interface_importance.sh, jple_reconstruction.sh, cisbp-recstats.sh, arabidopsis.sh).

The following order is recommended/required:

rncmpt_data.sh
fig1.sh
performance_calc.sh
interface_importance.sh
fig2.sh
jple_reconstruction.sh
fig3.sh
cisbp-recstats.sh
arabidopsis.sh
fig4.sh
fig5.sh
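
The order above can be wrapped in a small driver. The sketch below is an illustration rather than part of the repository: the names `PIPELINE_STEPS`, `run_pipeline` and `DRY_RUN` are hypothetical, and it assumes every script is run from the repository root and returns a non-zero exit status on failure.

```shell
#!/usr/bin/env bash
# Recommended order, copied from the list above.
PIPELINE_STEPS="
rncmpt_data.sh
fig1.sh
performance_calc.sh
interface_importance.sh
fig2.sh
jple_reconstruction.sh
fig3.sh
cisbp-recstats.sh
arabidopsis.sh
fig4.sh
fig5.sh
"

run_pipeline() {
    # Unquoted expansion is intentional: split the list on whitespace.
    for step in $PIPELINE_STEPS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: print the command instead of executing it.
            echo "bash $step"
            continue
        fi
        echo ">>> running $step"
        # Stop at the first failing step, since later steps need its output.
        bash "$step" || { echo "$step failed; stopping" >&2; return 1; }
    done
}
```

Running `DRY_RUN=1 run_pipeline` prints the commands in order without executing anything; `run_pipeline` on its own runs them sequentially and stops at the first failure.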

cisbp_reconstruction.sh executes scripts that locally generate the data available at http://cisbp-rna.ccbr.utoronto.ca/, e.g. PWM JPGs for confidently reconstructed specificities.

LXsasse/RBPbinding
