- Make sure you have already downloaded the FASTA files from UniProt
- Make sure you have already set up the BLAST tool and the NR database
- Run these commands to generate PSSM files from the FASTA files using BLAST
python pssmbyblast.py <fastadir>
python generateposlist.py
- Make sure you have already run the blastclust tool to generate the similarity result
- Run this script to filter the PSSM files
python filterpssm.py <resultsimilarityfile>
- Run this script to generate libsvm files from the PSSM files
python pssm_to_libsvm.py <folderofpssm> <windowssize>
- Run this script to generate the CSV files that form our final dataset
python libsvm_to_csv.py <folderoflibsvm>
- Run this script to split the dataset into training, validation and independent sets with a custom ratio
python generatedataset.py <folderofcsv> <ratio>
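
The steps above can be chained in a small driver script. This is only a sketch: the script names come from the commands listed here, but every directory name, the window size `17` and the ratio string `0.8` are hypothetical placeholders, and the accepted format of `<ratio>` is not documented here. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys


def build_pipeline_commands(fasta_dir, similarity_file, pssm_dir,
                            window_size, libsvm_dir, csv_dir, ratio):
    """Return the preprocessing commands, in README order, as argv lists."""
    py = sys.executable  # use the interpreter that runs this driver
    return [
        [py, "pssmbyblast.py", fasta_dir],
        [py, "generateposlist.py"],
        [py, "filterpssm.py", similarity_file],
        [py, "pssm_to_libsvm.py", pssm_dir, str(window_size)],
        [py, "libsvm_to_csv.py", libsvm_dir],
        [py, "generatedataset.py", csv_dir, str(ratio)],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the pipeline at the first failing step
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # All argument values below are hypothetical placeholders.
    commands = build_pipeline_commands(
        fasta_dir="fasta",
        similarity_file="similarity.txt",
        pssm_dir="pssm",
        window_size=17,
        libsvm_dir="libsvm",
        csv_dir="csv",
        ratio="0.8",
    )
    run_pipeline(commands, dry_run=True)
```

Switch to `dry_run=False` once the printed commands match your local paths.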
We provide several algorithms to try, such as CNN, ResNet, VGG and DenseNet.
The current results are:
Algo | ACC | TN | FN | TP | FP | Specificity | Sensitivity | MCC | AUC |
---|---|---|---|---|---|---|---|---|---|
CNN | 100 % | 4177 | 1 | 48 | 1 | 100 % | 98 % | 0.979 | |
ResNet 18 | 99 % | 4176 | 1 | 48 | 2 | 100 % | 98 % | 0.969 | |
ResNet 50 | 99 % | 4176 | 1 | 48 | 2 | 100 % | 98 % | 0.969 | |
VGG 16 | | | | | | | | | |
VGG 19 | | | | | | | | | |
DenseNet | | | | | | | | | |
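
The derived columns in the table can be recomputed from the four confusion-matrix counts. The sketch below uses the standard definitions of accuracy, specificity, sensitivity and the Matthews correlation coefficient (MCC); the function name `confusion_metrics` is a hypothetical helper, not part of the scripts above, and AUC is omitted because it needs per-sample scores rather than counts.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Compute summary metrics from binary confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal sum is zero; report 0.0 in that case
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Counts taken from the CNN row of the table
    cnn = confusion_metrics(tp=48, tn=4177, fp=1, fn=1)
    for name, value in cnn.items():
        print(f"{name}: {value:.3f}")
```

With the CNN counts, the MCC rounds to 0.979 and the sensitivity to 98 %, matching the table.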