
Contains code for paper that will be submitted soon


LXsasse/RBPbinding


RBP sequence specificity

(The current version does not contain RNA sequence specificities; the experimental data will be provided upon submission. Consequently, only scripts that do not require RNA binding specificities can be executed.)

This directory contains the code for "Reconstructing sequence specificities of RNA binding proteins across eukaryotes".

We used a joint linear embedding approach to model the relationship between protein sequence and RNA sequence specificity.

Recommended: install Anaconda and create a virtual environment with Python 2.7, adding the dependencies listed in dependencies.txt.
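The environment setup described above could look roughly like the sketch below. The environment name `rbpbinding`, the `setup_env` helper, and the `DRY_RUN` switch are illustrative and not part of the repository. The sketch also assumes that dependencies.txt is a pip-style requirements list, which should be checked against the actual file.

```shell
# Sketch: create a Python 2.7 conda environment and install the dependencies.
# ENV_NAME is a hypothetical name. dependencies.txt is ASSUMED to be a
# pip-style requirements list (one package per line); adjust if it is not.
ENV_NAME="rbpbinding"

setup_env() {
  # With DRY_RUN=1 the commands are printed instead of executed.
  if [ "${DRY_RUN:-0}" = "1" ]; then run="echo"; else run=""; fi
  $run conda create -y -n "$ENV_NAME" python=2.7 &&
  $run conda run -n "$ENV_NAME" pip install -r dependencies.txt
}
```

Calling `DRY_RUN=1 setup_env` prints the two commands for review; calling `setup_env` from the repository root runs them.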

Additional requirements:

Note: Executing the "full" pipeline (i.e. every intermediate step) requires very long running times and large amounts of memory. Parallel execution is recommended!

To run every step and modify intermediate results, set the variable full to 1 (full=1) in the individual bash scripts.

RUN: Before reconstructing the figures with fig1.sh to fig5.sh, execute the data processing scripts (rncmpt_data.sh, performance_calc.sh, interface_importance.sh, jple_reconstruction.sh, cisbp-recstats.sh, arabidopsis.sh).

The following order is recommended/required:

  1. rncmpt_data.sh
  2. fig1.sh
  3. performance_calc.sh
  4. interface_importance.sh
  5. fig2.sh
  6. jple_reconstruction.sh
  7. fig3.sh
  8. cisbp-recstats.sh
  9. arabidopsis.sh
  10. fig4.sh
  11. fig5.sh
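
The order above can be driven by a small wrapper such as the sketch below. The `run_pipeline` function and the `PIPELINE_STEPS` variable are hypothetical helpers, not part of the repository. The sketch assumes that all scripts sit in one directory and can be started with `bash` from that directory.

```shell
# Sketch: run the pipeline scripts in the recommended order and stop at the
# first failure. run_pipeline and PIPELINE_STEPS are hypothetical names.
PIPELINE_STEPS="rncmpt_data.sh fig1.sh performance_calc.sh interface_importance.sh fig2.sh jple_reconstruction.sh fig3.sh cisbp-recstats.sh arabidopsis.sh fig4.sh fig5.sh"

run_pipeline() {
  dir="${1:-.}"   # directory holding the scripts (assumed: repository root)
  for step in $PIPELINE_STEPS; do
    if [ ! -f "$dir/$step" ]; then
      echo "missing script: $step" >&2
      return 1
    fi
    echo "running $step"
    # Run each script in a subshell from inside $dir so relative paths resolve.
    ( cd "$dir" && bash "./$step" ) || { echo "failed: $step" >&2; return 1; }
  done
}
```

For example, `run_pipeline /path/to/RBPbinding` runs all eleven steps. Stopping at the first failure keeps a later figure script from running on missing or incomplete intermediate data.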

cisbp_reconstruction.sh executes scripts that locally generate the data available at http://cisbp-rna.ccbr.utoronto.ca/, e.g. JPG images of PWMs for confidently reconstructed specificities.
