In this repo, we reimplement ResNet18 with CUDA and optimize it to speed up inference. Experiments show that our implementation is faster than the PyTorch (cuDNN) version at batch size 1.
We use im2col and Winograd F(4×4, 3×3) to speed up convolution.
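As a sketch of the im2col idea (pure Python, illustrative only; the repo's actual kernel is `conv_im2col.cu`): each K×K input patch is unrolled into a row, so the convolution reduces to a matrix multiplication.

```python
def im2col(x, k):
    """Unroll every k x k patch of a 2-D input (list of lists) into a row."""
    h, w = len(x), len(x[0])
    out_h, out_w = h - k + 1, w - k + 1
    cols = []
    for i in range(out_h):
        for j in range(out_w):
            patch = [x[i + di][j + dj] for di in range(k) for dj in range(k)]
            cols.append(patch)
    return cols  # shape: (out_h * out_w, k * k)

def conv2d_via_im2col(x, kernel):
    """Convolve by flattening the kernel and dotting it with each row."""
    k = len(kernel)
    flat_k = [kernel[i][j] for i in range(k) for j in range(k)]
    return [sum(a * b for a, b in zip(row, flat_k)) for row in im2col(x, k)]

# 4x4 input, 3x3 all-ones kernel -> each output is the sum of a 3x3 patch
x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
ones = [[1] * 3 for _ in range(3)]
print(conv2d_via_im2col(x, ones))  # → [54, 63, 90, 99]
```

In the CUDA version the resulting matrix product is what `matmul.cu` computes; the unrolling trades extra memory for a single dense GEMM.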
ONNX model download link: https://pan.baidu.com/s/1eVvb2OedbnYR7m6PG-U_Cw (extraction code: ksm8)
1. Generate `weight.json` with `get_onnx_weight.py`:
```bash
cd /home/group20/cuda_onnx_python/
conda activate onnx_env
python get_onnx_weight.py
```
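We have not pinned down the exact schema of `weight.json`; as an illustration (hypothetical layout, the actual file may differ), a common choice is one entry per ONNX initializer, storing the shape alongside the flattened values, which the standard `json` module handles directly:

```python
import json

# Hypothetical layout for weight.json: one entry per ONNX initializer,
# keeping the tensor shape next to the flattened data.
weights = {
    "conv1.weight": {"shape": [2, 1, 3, 3], "data": [0.1] * 18},
    "fc.bias": {"shape": [10], "data": [0.0] * 10},
}

text = json.dumps(weights)
restored = json.loads(text)
assert restored["conv1.weight"]["shape"] == [2, 1, 3, 3]
print(len(restored["fc.bias"]["data"]))  # → 10
```

Keeping the shape explicit lets the C++ side (JsonCpp) rebuild each tensor without consulting the ONNX file again.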
2. Load the parameters into the model with JsonCpp; the `/json` and `/json_lib` directories contain the files JsonCpp needs.
We build ResNet18 with PyTorch and use the cuDNN backend as the baseline:
```bash
cd /home/group20/git/resnet_python/
conda activate onnx_env
python try_resnet_format.py
```
- `kernels.cu`: MaxPooling, AvgPooling, ReLU, Add kernels
- `matmul.cu`: GEMM
- `conv_winograd_4x4_3x3.cu`, `conv_winograd_gpu.cu`: Winograd convolution
- `conv_im2col.cu`: im2col convolution
- `resnet_extern.cu`: the ResNet18 network
- `resnet18_main.cc`: main entry point
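The repo uses the 2-D F(4×4, 3×3) Winograd variant; as a sketch of the underlying idea, the 1-D F(2, 3) transform below (pure Python, illustrative only) computes two outputs of a 3-tap filter with 4 multiplies instead of 6:

```python
def winograd_f23(d, g):
    """F(2,3): two outputs of a 3-tap correlation using 4 multiplies."""
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]

def direct(d, g):
    """Reference: naive sliding-window correlation."""
    return [sum(d[i + j] * g[j] for j in range(3)) for i in range(2)]

d, g = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0]
print(winograd_f23(d, g), direct(d, g))  # both [4.5, 6.0]
```

The filter transforms (`m2`/`m3` factors) can be precomputed once per weight tensor, which is why the saving in multiplies pays off for small 3×3 kernels; the 2-D F(4×4, 3×3) case nests the same trick in both dimensions.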
```bash
cd /home/group20/resnet_cuda/tmp/final_version/
make
./hello
```
- `resnet18Input.txt`: input data
- `resnet18Output.txt`: output data
- `resnet18.onnx`: the ONNX model
- `weight.json`: weights extracted by `get_onnx_weight.py`
Method | Time (ms)
---|---
Baseline (PyTorch, cuDNN) | 2.67
Our model (CUDA) | 2.26