Check failed: error == cudaSuccess (8 vs. 0) invalid device function #2
Comments
I'm running the code using K520 with 4G GPU memory. Is it because the code cannot support this GPU? CPU mode works fine. |
You might find some solutions here. |
@rbgirshick I got the same error, but I can run fast-rcnn on GPU using the same Makefile.config to compile caffe-fast-rcnn |
I got the same error too. I have run many tests and tried editing Makefile.config, but the same invalid device function error remains; sometimes it is raised from a .cu file other than roi_pooling_layer.cu. So I suspect the version of caffe-fast-rcnn that faster-rcnn uses has a compatibility problem. If I want to use another version of caffe, e.g. the caffe from fast-rcnn, which files should I copy into faster-rcnn's caffe-fast-rcnn? @rbgirshick |
I got the same error too. I carefully read the solutions pointed out by @rbgirshick, but the error persists. In the end, I had to go back to the MATLAB version. |
I have done many tests, and I found that this error may be triggered by some function in faster-rcnn that fast-rcnn doesn't have. With the imagenet model the error does not occur and everything runs well. After RPN training, when generating proposals, the invalid device function error occurs. Strangely, the first call to net.forward in im_proposals() runs, but the second call fails. When I comment out the last layer: |
I have fixed the problem and after some modifications, now it runs well |
@PierreHao Could you share your modifications? |
OK, it works for me, but you should test it on your own problem. I found that it runs in CPU mode, so the problem is on the GPU side. The NMS code calls the nms_gpu version by default, so using caffe GPU mode together with nms_gpu may trigger the error on our type of GPU (I'm not sure, this is my guess). You can change nms_wrapper.py to use CPU mode, or comment out the nms call and related code in the forward() function of proposal_layer.py. nms_cpu is slow; commenting out nms entirely is faster. You can try it yourself. Good luck! |
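The workaround above amounts to forcing the CPU path wherever NMS is dispatched. A minimal sketch of that idea, assuming a module-level switch named USE_GPU_NMS; `gpu_nms` and `cpu_nms` are hypothetical stand-ins here, not the repository's actual functions:

```python
# Hypothetical stand-ins for the compiled GPU kernel wrapper and the CPU
# implementation; they only simulate the behaviour reported in this thread.
def gpu_nms(dets, thresh):
    raise RuntimeError("invalid device function")  # simulates the failing GPU build


def cpu_nms(dets, thresh):
    return list(range(len(dets)))  # placeholder: keeps every detection


USE_GPU_NMS = True  # assumed name of the config switch


def nms(dets, thresh, force_cpu=False):
    """Dispatch to GPU or CPU NMS; force_cpu bypasses the GPU kernel."""
    if len(dets) == 0:
        return []
    if USE_GPU_NMS and not force_cpu:
        return gpu_nms(dets, thresh)
    return cpu_nms(dets, thresh)
```

Calling `nms(dets, thresh, force_cpu=True)` avoids the GPU kernel without touching the rest of the pipeline, at the cost of speed.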
@twtygqyy @PierreHao I've pushed a small change to demo.py that I hope will fix the underlying problem. Let me know if you have a chance to check the patch. Thanks. |
@PierreHao Thank you for the information. I tried commenting out nms, but it did not get past the error. |
@rbgirshick I think the problem is caused by the GPU: some GPU versions apparently can't call a GPU program from within another GPU program. On a Titan it works; on 2 different Teslas I get the error invalid device function (but the error goes away if I use the CPU mode of nms). |
@PierreHao For me, changing this line: |
@sunshineatnoon 0.975s means that you are using NMS in CPU mode, which is why it runs slowly. |
@PierreHao If you delete all the code related to nms, will multiple bboxes appear in an image? |
@sunshineatnoon when you delete nms in the training process, there may be an error. NMS is not necessary; without nms, multiple bboxes appear. You can try it. |
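To illustrate why removing NMS leaves several boxes on one object, here is a simplified pure-Python greedy NMS. It is a sketch, not the repository's cpu_nms (which works on arrays and may compute box areas slightly differently); the function names and sample boxes are made up for the example:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(dets, thresh):
    """dets: list of (x1, y1, x2, y2, score). Returns indices of kept boxes."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(dets)), key=lambda i: dets[i][4], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(dets[i][:4], dets[k][:4]) <= thresh for k in keep):
            keep.append(i)
    return keep


dets = [
    (10, 10, 50, 50, 0.9),
    (12, 12, 52, 52, 0.8),      # near-duplicate of the first box
    (100, 100, 140, 140, 0.7),  # separate object
]
print(greedy_nms(dets, 0.3))  # prints [0, 2]
```

Without the suppression step all three boxes would be reported, including the near-duplicate.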
Finally found the solution. You need to change the architecture to match yours in here: @rbgirshick any chance we can support multiple architectures in there? |
@alantrrs Can you specify how to change the architecture? My GPU is Quadro K4000. |
@sunshineatnoon I believe your GPU has a Kepler architecture, so you can change |
@alantrrs I changed my setup.py file like this, but I still got the error:
|
@PierreHao thanks Pierre for your solution! |
@alantrrs It works, finally. Thank you so much |
@twtygqyy what did you change? Your GPU is old; I have tested a GPU with compute capability 5.0 and everything runs well. |
@PierreHao I changed setting from sm_35 to sm_30. I'm using AWS g2.8xlarge instance. |
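Retargeting the build as described means swapping the sm_XX value passed to nvcc. A small sketch of that substitution follows. The flag list is illustrative rather than copied from the repository's setup.py, the helper names are hypothetical, and the compute capabilities in the table come from NVIDIA's published lists rather than from this thread, so verify them for your own card:

```python
import re

# Compute capability (major, minor) for two GPUs mentioned in this thread.
# Assumption: values are taken from NVIDIA's published tables.
COMPUTE_CAPABILITY = {
    "GRID K520": (3, 0),
    "GeForce GTX 1060": (6, 1),
}


def sm_flag(major, minor):
    """Format a compute capability as an nvcc architecture name, e.g. sm_30."""
    return "sm_{}{}".format(major, minor)


def retarget(flags, major, minor):
    """Replace every sm_XX occurrence in a list of nvcc flags."""
    target = sm_flag(major, minor)
    return [re.sub(r"\bsm_\d+\b", target, flag) for flag in flags]


flags = ["-arch=sm_35", "--ptxas-options=-v"]  # illustrative flag list
print(retarget(flags, *COMPUTE_CAPABILITY["GRID K520"]))
```

After changing the flag, the extension modules have to be rebuilt for the change to take effect.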
@twtygqyy Hi, I got the same error too when I set __C.USE_GPU_NMS = True in $FCN_ROOT/lib/fast_rcnn/config.py. I'm using an AWS g2.0xlarge instance. How can I change the architecture to solve the problem? Thanks a lot. |
@zimenglan-sysu-512 if you're using the GPU instance on AWS, then please change the architecture setting to:
# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
             -gencode arch=compute_50,code=sm_50 \
             -gencode arch=compute_50,code=compute_50
Because the GPU in AWS does not support compute_35 |
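The CUDA_ARCH value above follows a regular pattern: one native-code entry per architecture, plus a PTX entry for the newest one. A sketch that generates such a list, assuming architectures are given as strings like "30"; the helper name `gencode_flags` is hypothetical:

```python
def gencode_flags(archs, ptx_for_newest=True):
    """Build nvcc -gencode entries for the given architectures.

    archs: list of strings such as ["30", "50"].
    ptx_for_newest: also embed PTX for the newest architecture so that
    later GPUs can JIT-compile the kernels.
    """
    flags = ["-gencode arch=compute_{0},code=sm_{0}".format(a) for a in archs]
    if ptx_for_newest and archs:
        newest = max(archs, key=int)
        flags.append("-gencode arch=compute_{0},code=compute_{0}".format(newest))
    return flags


# Print in the continuation-line layout used by Makefile.config.
print("CUDA_ARCH := " + " \\\n\t\t".join(gencode_flags(["30", "50"])))
```

Listing only the architectures you actually deploy on keeps build time and binary size down.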
@twtygqyy I changed the settings from sm_35 to sm_30 and removed the *_50 lines, but it did not work. What other settings should be changed? Thanks. |
@twtygqyy I have solved the problem. In my case, I use the K520 on AWS. Thanks for your help. My solution (three steps) is below: |
@zimenglan-sysu-512 Sorry for the late reply, I'm glad to hear that your problem has been solved. |
@alantrrs thanks for the pointer ! That fixed the problem. @sunshineatnoon did you remove the *.so files and recompile the $FRCN_ROOT/lib ? |
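Stale compiled modules are easy to miss, since a rebuild may skip files that already exist. A sketch for listing (and optionally deleting) *.so files under a library directory before recompiling; the function names are hypothetical and the directory you pass in is up to you:

```python
from pathlib import Path


def find_stale_extensions(lib_root):
    """Return all compiled extension modules (*.so) under lib_root, sorted."""
    return sorted(Path(lib_root).rglob("*.so"))


def remove_stale_extensions(lib_root, dry_run=True):
    """List stale *.so files; delete them only when dry_run is False."""
    stale = find_stale_extensions(lib_root)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

Running with `dry_run=True` first shows what would be deleted; after removal, rebuild the library so the modules are regenerated for the new architecture.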
@rodrigob I tried to remove *.so in |
@sunshineatnoon , I use GeForce GTX 760, and come across the problem too, for the solution:
|
@xiaohujecky Note that if you set In any case, I still face this error. It is pretty simple to reproduce. Run Faster-RCNN training and alongside it run a simple CUDA program that tries to |
I'm running this error with
and
and
|
changed sm_35 into sm_30 in lib/setup.py file, and |
Still getting this on a GTX 1060. Tried Update: Adding
Not sure why this issue is closed when it still seems to be a constant problem. |
That worked for me. |
Hi ... |
I can run the demo.py of fast-rcnn without problems. However, I get the following error when I try to run the demo.py of py-faster-rcnn after successfully running make -j8 and make pycaffe:
Loaded network /home/ubuntu/py-faster-rcnn/data/faster_rcnn_models/ZF_faster_rcnn_final.caffemodel
F1008 04:30:16.139123 5360 roi_pooling_layer.cu:91] Check failed: error == cudaSuccess (8 vs. 0) invalid device function
*** Check failure stack trace: ***
Does anyone have the same problem?
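When comparing reports like the one above, it helps to pull out which .cu file and which CUDA error code failed, since the failing file varies between setups. A sketch that parses the check-failure line quoted above; the regular expression and the function name `parse_cuda_check` are assumptions tailored to this one log format:

```python
import re

# Matches lines of the form:
#   <file>.cu:<line>] Check failed: error == cudaSuccess (<code> vs. 0) <message>
CHECK_RE = re.compile(
    r"(?P<file>[\w./-]+\.cu):(?P<line>\d+)\] Check failed: "
    r"error == cudaSuccess \((?P<code>\d+) vs\. 0\)\s+(?P<msg>.+)"
)


def parse_cuda_check(log_line):
    """Extract file, line, error code and message, or None if no match."""
    m = CHECK_RE.search(log_line)
    if m is None:
        return None
    return {
        "file": m.group("file"),
        "line": int(m.group("line")),
        "code": int(m.group("code")),
        "message": m.group("msg").strip(),
    }


log_line = (
    "F1008 04:30:16.139123 5360 roi_pooling_layer.cu:91] "
    "Check failed: error == cudaSuccess (8 vs. 0) invalid device function"
)
print(parse_cuda_check(log_line))
```

An "invalid device function" message generally means the binary contains no kernel image for the running GPU's architecture, which is consistent with the architecture fixes discussed in the comments.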