cudaCheckError() failed : invalid device function #19

hugobordigoni · 2016-11-09T14:02:34Z

I spent so much time debugging this issue that I am posting the answer here.
When running demo.py as described in the README, I got the error cudaCheckError() failed : invalid device function with no traceback. It happened when this line was executed: https://github.com/smallcorgi/Faster-RCNN_TF/blob/master/lib/fast_rcnn/test.py#L169

I have never seen this error in any of my other TensorFlow projects.

This issue is similar to this one in the Python Faster R-CNN: rbgirshick/py-faster-rcnn#2
I solved it by updating the arch code in https://github.com/smallcorgi/Faster-RCNN_TF/blob/master/lib/make.sh#L9 and https://github.com/smallcorgi/Faster-RCNN_TF/blob/master/lib/setup.py#L137
I don't know how to find the arch code for an arbitrary GPU, but for a Tesla K80, sm_37 seems to work.

I don't know whether something can be changed so that it works on any GPU. If not, maybe a note could be added to the README?

I hope this helps people who hit the same issue.


ahmedammar · 2016-11-18T17:43:16Z

Just adding to this since it was useful to me.

I hit this same problem when testing on AWS EC2 instances with a GPU. I had to use sm_20 in the two places mentioned above:
lib/make.sh
lib/setup.py

and force the rebuild of the python modules:
cd lib
python setup.py build_ext --inplace

ywpkwon · 2016-12-31T22:40:55Z

When I ran $ python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt,
I had exactly the same error cudaCheckError() failed : invalid device function

I tried setting sm_37 in lib/make.sh and lib/setup.py as suggested. I think the setup is almost there. What can I do? I am using an AWS EC2 g2.2xlarge instance. The messages below say I am using an NVIDIA GRID K520.

I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GRID K520
major: 3 minor: 0 memoryClockRate (GHz) 0.797
pciBusID 0000:00:03.0
Total memory: 3.94GiB
Free memory: 3.91GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GRID K520, pci bus id: 0000:00:03.0)
Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_cls_score/rpn_cls_score:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
Tensor("rpn_cls_prob_reshape:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_bbox_pred/rpn_bbox_pred:0", shape=(?, ?, ?, 36), dtype=float32)
Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rois:0", shape=(?, 5), dtype=float32)
[<tf.Tensor 'conv5_3/conv5_3:0' shape=(?, ?, ?, 512) dtype=float32>, <tf.Tensor 'rois:0' shape=(?, 5) dtype=float32>]
Tensor("fc7/fc7:0", shape=(?, 4096), dtype=float32)

Loaded network ./VGGnet_fast_rcnn_iter_70000.ckpt
cudaCheckError() failed : invalid device function
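
The `major: 3 minor: 0` line in the log above is the GPU's compute capability, which points to sm_30 rather than sm_37 for the GRID K520. Below is a minimal Python sketch for pulling that value out of a saved log, assuming the log uses the device line format shown above. `arch_from_log` is a hypothetical helper name, not something provided by the repository or by TensorFlow.

```python
import re

# Matches the device line TensorFlow prints at startup in the log above,
# e.g. "major: 3 minor: 0 memoryClockRate (GHz) 0.797".
_CAPABILITY_RE = re.compile(r"major:\s*(\d+)\s+minor:\s*(\d+)")


def arch_from_log(log_text):
    """Return an nvcc arch string such as 'sm_30', or None if no device line is found.

    Hypothetical helper: it only reads the first device reported in the log.
    """
    match = _CAPABILITY_RE.search(log_text)
    if match is None:
        return None
    major, minor = match.groups()
    return "sm_{}{}".format(major, minor)


if __name__ == "__main__":
    sample = "name: GRID K520\nmajor: 3 minor: 0 memoryClockRate (GHz) 0.797\n"
    print(arch_from_log(sample))  # prints sm_30
```

The returned string is what would go after -arch= in lib/make.sh and lib/setup.py.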

eakbas · 2017-01-11T10:13:34Z

I got the same error (i.e., cudaCheckError() failed : invalid device function) with my Tesla K40. When I changed the -arch parameter in lib/make.sh to sm_35 and reran make.sh, it worked.

VitaliKaiser · 2017-01-13T18:29:40Z

@ahmedammar You could even use sm_30 on AWS.

guotong1988 · 2017-02-04T08:25:35Z

@eakbas Thanks!

lancejchen · 2017-03-27T15:39:21Z

For other GPUs:

# Which CUDA capabilities do we want to pre-build for?
# https://developer.nvidia.com/cuda-gpus
#   Compute/shader model   Cards
#   6.1                    P4, P40, Titan X so CUDA_MODEL = 61
#   6.0                    P100 so CUDA_MODEL = 60
#   5.2                    M40
#   3.7                    K80
#   3.5                    K40, K20
#   3.0                    K10, Grid K520 (AWS G2)
#   Other Nvidia shader models should work, but they will require extra startup
#   time as the code is pre-optimized for them.
CUDA_MODELS=30 35 37 52 60 61

Credit to https://github.com/mldbai/mldb/blob/master/ext/tensorflow.mk
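
One way to act on a list like CUDA_MODELS is to pass nvcc one `-gencode` pair per architecture instead of a single `-arch` value, so the same binary carries code for several GPUs. A minimal Python sketch of building those flags follows. `gencode_flags` is a hypothetical helper, and wiring its output into lib/setup.py or lib/make.sh is an assumption about how one might use it, not something the repository does. Which architectures a given nvcc accepts depends on the CUDA toolkit version.

```python
def gencode_flags(cuda_models, ptx_for_newest=True):
    """Build nvcc -gencode flags for a list of compute capabilities.

    cuda_models: values such as [30, 35, 37] or ["30", "35", "37"].
    ptx_for_newest: also embed PTX for the newest listed model so that
    GPUs newer than any listed one can JIT-compile the kernels at startup.
    """
    models = list(cuda_models)
    flags = []
    for model in models:
        # Native machine code (SASS) for this architecture.
        flags += ["-gencode", "arch=compute_{0},code=sm_{0}".format(model)]
    if ptx_for_newest and models:
        newest = max(models, key=int)
        # PTX fallback for architectures not listed explicitly.
        flags += ["-gencode", "arch=compute_{0},code=compute_{0}".format(newest)]
    return flags


if __name__ == "__main__":
    models = "30 35 37 52 60 61".split()
    print(" ".join(gencode_flags(models)))
```

Building for several architectures makes the binary larger and the build slower, so listing only the GPUs you actually deploy on is a reasonable trade-off.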

fangyan93 · 2017-08-08T22:17:57Z

@VitaliKaiser Hello, I am using an AWS EC2 GPU instance to run demo.py and am getting 'cudaCheckError() failed : invalid device function'.
Since the GPU on my instance is a GRID K520, I followed your post, changed the -arch parameter in setup.py and make.sh to sm_30, and reran make.sh, but the error is still there when I run './tools/demo.py --model ./VGG.....ckpt'. Could you please give me some help?

VitaliKaiser · 2017-08-08T22:27:54Z

@fangyan93 It's been quite a while since I last looked into this, but I lost a lot of time before figuring out that things were not being rebuilt!
You have to delete every binary that was built by the make.sh script, and then build everything once more.

fangyan93 · 2017-08-08T23:22:33Z

@VitaliKaiser Thanks for the reply. Yes, after I removed the previous build files and rebuilt from the very beginning, it works!
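
Since stale binaries were the culprit here, a small script can list and remove them before rebuilding. The following is a minimal Python sketch, assuming the build leaves `.o` and `.so` files under lib/. The suffix list and both function names are assumptions for illustration, so check the dry-run output before deleting anything.

```python
import os

# Assumed artifact suffixes: object files and shared libraries.
STALE_SUFFIXES = (".o", ".so")


def find_stale_binaries(root, suffixes=STALE_SUFFIXES):
    """Return a sorted list of files under root whose names end with one of suffixes."""
    suffixes = tuple(suffixes)
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                stale.append(os.path.join(dirpath, name))
    return sorted(stale)


def remove_stale_binaries(root, suffixes=STALE_SUFFIXES, dry_run=True):
    """List stale build artifacts and, unless dry_run is set, delete them."""
    stale = find_stale_binaries(root, suffixes)
    if not dry_run:
        for path in stale:
            os.remove(path)
    return stale


if __name__ == "__main__":
    # Dry run by default: only prints what would be deleted under ./lib.
    for path in remove_stale_binaries("lib"):
        print("would remove", path)
```

After a real run with dry_run=False, rerun make.sh (and python setup.py build_ext --inplace) so that everything is compiled with the new arch value.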

Kinantb · 2019-04-26T17:40:16Z

For those who are still lost, here are a few clean steps to resolve the issue (you need to recompile your CUDA code):

1. Go to https://developer.nvidia.com/cuda-gpus and find your GPU.
2. Find the number (the "Compute Capability") next to your GPU name, e.g. for the 680 it is 3.0.
3. Remove the dot from it so it becomes 30.
4. In the make_cuda.sh file used for compiling, change the number after the "arch" flag in the nvcc command to the one you found above. Example, before:
nvcc -c -o corr_cuda_kernel.cu.o corr_cuda_kernel.cu -x cu -Xcompiler -fPIC -arch=sm_52
and after:
nvcc -c -o corr_cuda_kernel.cu.o corr_cuda_kernel.cu -x cu -Xcompiler -fPIC -arch=sm_30
5. Delete the folder of built files if you have already compiled before.
6. Run make_cuda.sh and continue as usual.
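
Steps 2 to 4 above can be sketched in Python as follows. This is a minimal illustration that assumes the build script contains flags of the form `-arch=sm_NN`. `capability_to_arch` and `set_arch` are hypothetical helper names, and the sketch only rewrites text in memory rather than editing any file.

```python
import re


def capability_to_arch(capability):
    """Turn a compute capability such as '3.0' into an nvcc arch name such as 'sm_30'."""
    major, minor = capability.strip().split(".")
    return "sm_{}{}".format(int(major), int(minor))


def set_arch(script_text, capability):
    """Replace every -arch=sm_NN flag in script_text with the arch for capability."""
    arch = capability_to_arch(capability)
    return re.sub(r"-arch=sm_\d+", "-arch=" + arch, script_text)


if __name__ == "__main__":
    line = ("nvcc -c -o corr_cuda_kernel.cu.o corr_cuda_kernel.cu "
            "-x cu -Xcompiler -fPIC -arch=sm_52")
    print(set_arch(line, "3.0"))
```

To apply it to a real script, read the file, pass its contents through set_arch, and write the result back before deleting the old build output.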

longcw mentioned this issue Feb 20, 2017

cudaCheckError() failed : invalid device function longcw/faster_rcnn_pytorch#2

Closed

This was referenced Mar 31, 2017

ImportError on _reorg_layer.so longcw/yolo2-pytorch#10

Closed

cudaCheckError failed longcw/faster_rcnn_pytorch#6

Closed

YutingLin mentioned this issue May 3, 2017

Cannot save the trained ckpt file. #139

Closed

Zardinality mentioned this issue Jul 28, 2017

A question when training Zardinality/TF_Deformable_Net#2

Open
