OCPA

This repository contains the implementation of "Accelerating Convolutional Neural Network by Exploiting Sparsity on GPUs".

Requirements

Running ECR for the convolution layer on VGG

cd OCPA/ECR/ECR/time_vgg
nvcc batchedECR.cu -o batchedECR.out
./batchedECR.out 32

Running PECR for the convolution+pooling layer on VGG

cd OCPA/PECR/pecr/time_vgg
nvcc batchedPECR.cu -o batchedPECR.out
./batchedPECR.out 32

Running cuDNN (using Tensor Cores) for the convolution layer on VGG

cd OCPA/ECR/cudnn/time_vgg
make
./cudnn 32

Running cuDNN (using Tensor Cores) for the convolution+pooling layer on VGG

cd OCPA/PECR/cudnn/time_vgg
make
./cudnn 32

Running ECR for the convolution layer on ResNet

cd OCPA/ECR/ECR/time_resnet
nvcc batchedECR.cu -o batchedECR.out
./batchedECR.out 32

Running PECR for the convolution+pooling layer on ResNet

cd OCPA/PECR/pecr/time_resnet
nvcc batchedPECR.cu -o batchedPECR.out
./batchedPECR.out 32

Running cuDNN (using Tensor Cores) for the convolution layer on ResNet

cd OCPA/ECR/cudnn/time_resnet
make
./cudnn 32

Running cuDNN (using Tensor Cores) for the convolution+pooling layer on ResNet

cd OCPA/PECR/cudnn/time_resnet
make
./cudnn 32
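
The four nvcc-based ECR and PECR runs above follow the same build-and-run pattern, so they can be scripted. The following is a minimal sketch, not part of the repository: the function names `benchmarks` and `run_all` are hypothetical, and it assumes the script is started from the directory that contains OCPA/ and that `nvcc` is on the PATH.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of the repository.

# One line per benchmark: directory and source-file stem,
# copied from the commands listed above.
benchmarks() {
  printf '%s\n' \
    "OCPA/ECR/ECR/time_vgg batchedECR" \
    "OCPA/PECR/pecr/time_vgg batchedPECR" \
    "OCPA/ECR/ECR/time_resnet batchedECR" \
    "OCPA/PECR/pecr/time_resnet batchedPECR"
}

# Build and run every benchmark with the same argument
# (defaults to 32, the value used in the commands above).
run_all() {
  arg="${1:-32}"
  benchmarks | while read -r dir stem; do
    echo "== $dir =="
    # The subshell keeps the cd local to one benchmark;
    # </dev/null stops the program from consuming the list on stdin.
    ( cd "$dir" && nvcc "$stem.cu" -o "$stem.out" && "./$stem.out" "$arg" ) </dev/null \
      || echo "failed: $dir" >&2
  done
}
```

Calling `run_all 32` after sourcing the script builds and runs the four programs in turn and reports any directory where the build or the run fails.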

The running time of the other cuDNN algorithms can be measured in the same way as the cuDNN runs above.

The VGG-16 and ResNet-50 speedups can be obtained by running the programs in the OCPA/speedup folder, as shown in the following figure:

[Figure: VGG-16 and ResNet-50 speedups]

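
To compare two measured running times by hand, speedup is conventionally computed as the baseline time divided by the optimized time. The helper below is a small sketch under that assumption; `speedup` is a hypothetical name, the input values are made up, and the programs under OCPA/speedup may compute or report their results differently.

```shell
# speedup BASELINE OPTIMIZED: prints BASELINE / OPTIMIZED with two decimals.
# Both arguments are running times in the same unit (hypothetical values here).
speedup() {
  awk -v base="$1" -v opt="$2" 'BEGIN {
    if (opt <= 0) { print "optimized time must be positive" > "/dev/stderr"; exit 1 }
    printf "%.2f\n", base / opt
  }'
}

speedup 10.0 4.0   # prints 2.50
```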