chesiy/cuda_resnet


Group20 Resnet18

This repo reimplements ResNet18 inference in CUDA and optimizes it for speed. In our experiments at batch size 1, our implementation takes 2.26 ms per inference, compared with 2.67 ms for the PyTorch (cuDNN) baseline.

We use im2col and Winograd (4x4) to speed up convolution.
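
As a rough illustration of the im2col idea, the sketch below unrolls every input patch into a column so that the whole convolution becomes a single matrix multiply (the part a GEMM kernel then handles). It is a plain-Python CPU reference, not the code in conv_im2col.cu; the function names, the nested-list layout, and the stride-1, no-padding setup are assumptions made for the example.

```python
def im2col(x, kh, kw):
    """Unroll x ([C][H][W] nested lists) into a matrix with C*kh*kw rows
    and one column per output pixel (stride 1, no padding)."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = []
    for ch in range(c):
        for i in range(kh):
            for j in range(kw):
                # One row: the (ch, i, j) element of every sliding window.
                cols.append([x[ch][oy + i][ox + j]
                             for oy in range(out_h)
                             for ox in range(out_w)])
    return cols, out_h, out_w


def matmul(a, b):
    """Naive matrix multiply; stands in for the GEMM kernel."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def conv2d_im2col(x, weights):
    """weights: [K][C][kh][kw]. Returns the output as [K][out_h][out_w]."""
    kh, kw = len(weights[0][0]), len(weights[0][0][0])
    cols, out_h, out_w = im2col(x, kh, kw)
    # Flatten each filter in the same (channel, row, col) order as im2col.
    w_mat = [[v for ch in f for row in ch for v in row] for f in weights]
    flat = matmul(w_mat, cols)
    # Reshape each flat output row back into out_h x out_w.
    return [[r[oy * out_w:(oy + 1) * out_w] for oy in range(out_h)]
            for r in flat]
```

The trade-off is extra memory for the unrolled matrix in exchange for a regular, GEMM-shaped workload.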

ONNX model download link: https://pan.baidu.com/s/1eVvb2OedbnYR7m6PG-U_Cw (extraction code: ksm8)

Parsing the ONNX model and extracting parameters

1. Run get_onnx_weight.py to generate weight.json

cd /home/group20/cuda_onnx_python/
conda activate onnx_env
python get_onnx_weight.py
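
For readers without access to get_onnx_weight.py, the following is a minimal sketch of how such an export could work, assuming the `onnx` Python package is installed. The JSON layout (a `shape` list plus flat `data` per tensor) and both function names are hypothetical; the actual script and the real weight.json format may differ.

```python
import json


def tensor_record(shape, values):
    """Build one JSON-serialisable entry: shape plus flat row-major values.
    (Hypothetical layout, not necessarily that of the repo's weight.json.)"""
    expected = 1
    for dim in shape:
        expected *= dim
    if expected != len(values):
        raise ValueError("shape and value count disagree")
    return {"shape": list(shape), "data": [float(v) for v in values]}


def export_weights(onnx_path, json_path):
    """Dump every initializer (weights and biases) of an ONNX model to JSON."""
    # Imported lazily so tensor_record stays usable without onnx installed.
    import onnx
    from onnx import numpy_helper

    model = onnx.load(onnx_path)
    weights = {}
    for init in model.graph.initializer:
        array = numpy_helper.to_array(init)
        weights[init.name] = tensor_record(array.shape,
                                           array.flatten().tolist())
    with open(json_path, "w") as f:
        json.dump(weights, f)
    return weights
```

Under these assumptions, `export_weights("resnet18.onnx", "weight.json")` would write one entry per initializer, keyed by its ONNX name.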

2. Load the parameters into the model with Jsoncpp; /json and /json_lib contain the files Jsoncpp requires

Baseline

Build ResNet18 in PyTorch with the backend set to cuDNN as the baseline

cd /home/group20/git/resnet_python/
conda activate onnx_env
python try_resnet_format.py
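
Timing GPU inference fairly usually needs warm-up runs and an explicit synchronization before reading the clock. The helper below is a generic sketch of that pattern, not the contents of try_resnet_format.py; the name `benchmark` and its parameters are assumptions for illustration.

```python
import time


def benchmark(fn, warmup=10, iters=100, sync=None):
    """Return the average wall-clock time of fn() in milliseconds.

    sync is an optional callable that blocks until queued GPU work has
    finished; with PyTorch one would typically pass torch.cuda.synchronize.
    """
    for _ in range(warmup):
        fn()  # warm-up runs are excluded from the measurement
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()  # make sure all timed work has actually completed
    return (time.perf_counter() - start) * 1000.0 / iters
```

Without the synchronization step, asynchronous kernel launches can make a GPU model look faster than it is.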

CUDA implementation

kernels.cu: MaxPooling AvgPooling Relu Add

GEMM: matmul.cu

winograd: conv_winograd_4x4_3x3.cu conv_winograd_gpu.cu

im2col: conv_im2col.cu

resnet_extern.cu: resnet

resnet18_main.cc: main

cd /home/group20/resnet_cuda/tmp/final_version/
make
./hello
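
To give a feel for what the Winograd kernels exploit, here is a sketch of the smallest 1D variant, F(2,3): it produces two outputs of a 3-tap filter with 4 multiplications instead of 6. This is a CPU illustration only and not the algorithm as implemented in conv_winograd_4x4_3x3.cu, which works on 2D tiles with 3x3 filters; the function names are made up for the example.

```python
def winograd_f2_3(d, g):
    """Two outputs of a 3-tap 1D correlation over a 4-element input tile."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (can be precomputed once per filter).
    u = [g0, (g0 + g1 + g2) / 2, (g0 - g1 + g2) / 2, g2]
    # Input transform (additions and subtractions only).
    v = [d0 - d2, d1 + d2, d2 - d1, d1 - d3]
    # Element-wise products: 4 multiplications instead of 6.
    m = [u[i] * v[i] for i in range(4)]
    # Output transform.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


def direct_f2_3(d, g):
    """Reference: the same two outputs computed the straightforward way."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]
```

Nesting the same transforms in two dimensions gives F(2x2, 3x3), which needs 16 multiplications per 2x2 output tile instead of 36.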

Input/output files and model

resnet18Input.txt

resnet18Output.txt

resnet18.onnx

weight.json

Experimental results

Methods               Time (ms)
Baseline (PyTorch)    2.67
Our model             2.26
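
The two timings translate into a speedup factor and a relative latency reduction; the short calculation below uses only the numbers from the table (the variable names are illustrative).

```python
baseline_ms = 2.67  # PyTorch baseline, from the table
ours_ms = 2.26      # CUDA implementation, from the table

speedup = baseline_ms / ours_ms
reduction = (baseline_ms - ours_ms) / baseline_ms

print(f"speedup: {speedup:.2f}x, latency reduction: {reduction:.1%}")
# prints "speedup: 1.18x, latency reduction: 15.4%"
```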
