==============================================================
DEPENDENCIES:
OS: Linux
-
Install Torch and its basic package dependencies following the instructions at
http://torch.ch/docs/getting-started.html
You may need to install cmake if you do not already have it. Linking against OpenBLAS or another optimized BLAS library when building Torch is highly recommended.
-
npy4th: a package to load/save numpy files for torch7
Requirements: torch7, xlua
Installation:
git clone https://github.com/htwaijry/npy4th.git
cd npy4th
luarocks make
-
Python 2.7.x, numpy, pandas, scipy, scikit-learn
-
Install Theano and its basic package dependencies following the instructions at http://deeplearning.net/software/theano/install.html
-
Install Lasagne and its basic package dependencies following the instructions at https://github.com/Lasagne/Lasagne or http://lasagne.readthedocs.org/en/latest/user/installation.html
-
bedtools (>= 2.25.0). Make sure the bedtools executable is on your PATH.
-
bigWigAverageOverBed
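Since the command-line tools above need to be reachable through PATH, a quick check before running the pipeline can save a failed run. The following is a minimal sketch, not part of the distributed scripts; the helper names `find_on_path` and `missing_tools` are hypothetical, and it assumes bigWigAverageOverBed is also expected on PATH (the usage section lists it among the step1 files instead, so adjust as needed).

```python
import os


def find_on_path(program, path=None):
    """Return the full path of an executable named `program`, or None.

    `path` defaults to the PATH environment variable; it can be overridden,
    which is mainly useful for testing.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            # Skip empty PATH entries rather than searching the current directory.
            continue
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(tools=("bedtools", "bigWigAverageOverBed"), path=None):
    """Return the names in `tools` that could not be found on the search path."""
    return [tool for tool in tools if find_on_path(tool, path) is None]


# Example use:
# for tool in missing_tools():
#     print("not found on PATH: " + tool)
```

This only checks that the executables exist; it does not verify the bedtools version.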
===============================================================
USAGE:
STEP 1: Data preprocessing
Example run:
python dataEncode.py example genome.fa EvolutionaryConservation.bw
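The regions to predict are supplied as a BED file (example.bed in this example), and each region should be larger than 200 bp. A small pre-check along the following lines can flag regions that do not meet that requirement. This is a sketch only: the function name `short_regions` is hypothetical, and it assumes standard BED coordinates (0-based start, exclusive end, so length = end - start).

```python
def short_regions(bed_lines, min_length=200):
    """Yield (line_number, chrom, start, end) for regions not longer than min_length.

    `bed_lines` is any iterable of text lines, e.g. an open file handle.
    """
    for number, line in enumerate(bed_lines, 1):
        line = line.strip()
        # Skip blank lines, comments, and UCSC track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if end - start <= min_length:
            yield number, chrom, start, end


# Example use:
# with open("example.bed") as handle:
#     for number, chrom, start, end in short_regions(handle):
#         print("line %d: %s:%d-%d is too short" % (number, chrom, start, end))
```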
step1 must contain the following files:
dataEncode.py
transformTo1Kformat.py
genome.fa
OneHotTransform.py
DeepSEA.lua
deepsea.cpu
ExtractCoreRegion.py
bigWigAverageOverBed
EvolutionaryConservation.bw
SortConsBed.py
AddConsToDeepSeaRes.py
RNNDataFormat.py
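The file list above can be checked programmatically before starting a run. The sketch below assumes that step1 is a directory holding all of these files; the names `REQUIRED_STEP1_FILES` and `missing_files` are hypothetical and not part of the distributed scripts.

```python
import os

# File names copied from the list above.
REQUIRED_STEP1_FILES = [
    "dataEncode.py",
    "transformTo1Kformat.py",
    "genome.fa",
    "OneHotTransform.py",
    "DeepSEA.lua",
    "deepsea.cpu",
    "ExtractCoreRegion.py",
    "bigWigAverageOverBed",
    "EvolutionaryConservation.bw",
    "SortConsBed.py",
    "AddConsToDeepSeaRes.py",
    "RNNDataFormat.py",
]


def missing_files(directory, required=REQUIRED_STEP1_FILES):
    """Return the required file names that are absent from `directory`."""
    return [name for name in required
            if not os.path.exists(os.path.join(directory, name))]


# Example use (assuming the directory is called "step1"):
# for name in missing_files("step1"):
#     print("missing: " + name)
```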
In this step, the prediction target should be in BED format (example.bed), and each region should be larger than 200 bp.
#############################update#######################################################
We are sorry that the FTP server is temporarily down. Because of the network issue, we can only upload the data to Baidu Netdisk.
URL: https://pan.baidu.com/s/1kOBwR8iMsbcbm5x4Kk6NzA
PASSWORD: 5spg
We will try to set up another server as soon as possible.
###########################################################################################
genome.fa is the human (hg19) whole-genome sequence, which can be obtained from the UCSC Genome Browser. Please merge all chromosomes into one file and name it "genome.fa", or download it from:
ftp://123.56.134.57/data/genome.fa
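If you build genome.fa yourself from per-chromosome FASTA files, the merge step could be sketched as follows. The layout is an assumption: the files are taken to sit in one directory and to be named like chr1.fa, chr2.fa, and so on. The function name `merge_fasta` is hypothetical. Files are concatenated in sorted file-name order, which is not necessarily the order any particular tool expects.

```python
import glob
import os


def merge_fasta(chrom_dir, output_path, pattern="chr*.fa"):
    """Concatenate the FASTA files matching `pattern` in `chrom_dir` into `output_path`.

    Returns the list of input paths in the order they were written.
    """
    paths = sorted(glob.glob(os.path.join(chrom_dir, pattern)))
    with open(output_path, "wb") as merged:
        for path in paths:
            with open(path, "rb") as source:
                last = b"\n"
                # Copy in 1 MiB chunks so whole chromosomes are never held in memory.
                for chunk in iter(lambda: source.read(1 << 20), b""):
                    merged.write(chunk)
                    last = chunk[-1:]
                # Make sure the next header starts on its own line.
                if last != b"\n":
                    merged.write(b"\n")
    return paths


# Example use (directory name is a placeholder):
# merge_fasta("hg19_chromosomes", "genome.fa")
```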
EvolutionaryConservation.bw was taken from the vertebrate phastCons44way track of the UCSC Genome Browser, or you can download it from:
ftp://123.56.134.57/data/EvolutionaryConservation.bw
deepsea.cpu is the trained DeepSEA model, which you can download from:
ftp://123.56.134.57/data/deepsea.cpu
###############################################################################################
The output files will be under the "example" folder:
example_data.npy
example_mask.npy
STEP 2: Prediction procedure
step2 must contain the following files:
BiRen.py
BiRen_trained_model.npy
Example run:
python BiRen.py example
The output file will be under ./example: example.out
Here is a minimal example:
chr20 2719208 2719789 0.220106445764
chr5 2112055 2113430 0.859481014407
The four columns are chromosome, start position, end position and enhancer probability score.
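Given the four-column layout described above, the prediction file can be read back with a few lines of Python. The sketch below assumes whitespace-separated columns, as in the example lines. The function names and the 0.5 cutoff are illustrative choices, not values recommended by the authors.

```python
def read_predictions(lines):
    """Yield (chrom, start, end, score) tuples from lines of the output file."""
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            # Skip blank or incomplete lines.
            continue
        yield fields[0], int(fields[1]), int(fields[2]), float(fields[3])


def likely_enhancers(lines, threshold=0.5):
    """Return the regions whose score is at least `threshold` (hypothetical cutoff)."""
    return [record for record in read_predictions(lines) if record[3] >= threshold]


# Example use:
# with open("example/example.out") as handle:
#     for chrom, start, end, score in likely_enhancers(handle):
#         print("%s:%d-%d\t%.3f" % (chrom, start, end, score))
```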