cudaCheckError() failed : invalid device function #2

Closed
iFighting opened this issue Feb 20, 2017 · 12 comments

Comments

@iFighting

When I run demo.py, I get this error:
"load model successfully!
cudaCheckError() failed : invalid device function"

Do you know why?
I can use PyTorch to train other models, so the installation is correct.
Thanks.

@longcw
Owner

longcw commented Feb 20, 2017

Is there any other error message, such as a line number, that would help me find the cause?

@iFighting
Author

You're Chinese, so I'll write in Chinese. Those two lines are the only output. I don't get any errors when I run other models.

@longcw
Owner

longcw commented Feb 20, 2017

It may be a problem with RoIPool. Check whether the Python implementation of RoIPool runs; in faster_rcnn/faster_rcnn.py:

from roi_pooling.modules.roi_pool_py import RoIPool
# from roi_pooling.modules.roi_pool import RoIPool

@iFighting
Author

iFighting commented Feb 20, 2017

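That was indeed the cause. It works now, but it is very slow: 4 seconds per image.
I haven't read the code closely yet. What causes the error?
One last thing:
You're at T University, impressive. Have you been learning PyTorch recently as well? Would you mind sharing contact details so we can exchange ideas?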

@longcw
Owner

longcw commented Feb 20, 2017

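Yes, the Python implementation of RoIPool takes 2 to 3 seconds.
I don't know why the GPU implementation fails for you, though. Try recompiling. What is your GPU model? It works without problems on my GTX 1080.
Alternatively, force the CPU version (only the forward pass is implemented); in faster_rcnn/roi_pooling/functions/roi_pool.py: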

if not features.is_cuda:
    _features = features.permute(0, 2, 3, 1)
    roi_pooling.roi_pooling_forward(self.pooled_height, self.pooled_width, self.spatial_scale,
                                    _features, rois, output)
    # output = output.cuda()

Change it to:

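if True:  # always take the CPU path, even when the input is a CUDA tensor
    is_cuda = features.is_cuda

    _features = features.permute(0, 2, 3, 1)
    if is_cuda:
        # the CPU forward pass needs its inputs in host memory
        _features = _features.cpu()
        rois = rois.cpu()

    roi_pooling.roi_pooling_forward(self.pooled_height, self.pooled_width, self.spatial_scale,
                                    _features, rois, output)
    if is_cuda:
        # move the pooled result back to the GPU
        output = output.cuda()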
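
The round trip in the snippet above (copy to host, run the CPU op, copy back) can be sketched as a small generic helper. This is only an illustration: `run_on_cpu` is a hypothetical name that does not exist in the repository, and it assumes nothing beyond the `is_cuda` attribute and the `cpu()`/`cuda()` methods that PyTorch tensors provide.

```python
def run_on_cpu(cpu_op, *tensors):
    """Run a CPU-only op on tensors that may live on the GPU.

    Hypothetical helper, not part of the repository. It relies only on
    the is_cuda attribute and the cpu()/cuda() tensor methods.
    """
    was_cuda = any(t.is_cuda for t in tensors)
    # Copy GPU inputs to host memory; leave CPU inputs untouched.
    cpu_inputs = [t.cpu() if t.is_cuda else t for t in tensors]
    result = cpu_op(*cpu_inputs)
    # Move the result back only if the caller started on the GPU.
    return result.cuda() if was_cuda else result
```

Each call pays for two host/device copies, so this is a workaround for a broken GPU kernel rather than a replacement for it.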

@iFighting
Author

The GPU is a Tesla K40m and the CUDA version is 7.5.
With the CPU forward pass it is also very slow, sigh.
I recompiled and the problem is still there.

@longcw
Owner

longcw commented Feb 20, 2017

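On my machine, running continuously over the VOC07 test set takes about 0.12 s per frame, and the CPU RoIPool is only slightly slower than the GPU one.
See this: smallcorgi/Faster-RCNN_TF#19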

I got the same error (i.e., cudaCheckError() failed : invalid device function) with my Tesla K40. When I changed the -arch parameter in lib/make.sh to sm_35, and rerun make.sh, it worked.

You can try what he describes and set -arch=sm_35 in faster_rcnn/make.sh.
Compute capability of different GPUs: https://developer.nvidia.com/cuda-gpus
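
Since PyTorch already works on the affected machine, the right value for -arch can also be read from the device instead of looked up in the table. A minimal sketch, assuming PyTorch is installed with CUDA support: `arch_flag` and `detect_arch_flag` are hypothetical helper names, while `torch.cuda.get_device_capability` is an existing PyTorch call that returns a `(major, minor)` tuple.

```python
def arch_flag(major, minor):
    """Format a compute capability such as (3, 5) as an nvcc -arch flag."""
    return f"-arch=sm_{major}{minor}"


def detect_arch_flag(device_index=0):
    """Return the -arch flag for the given GPU, or None if it cannot be queried."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(device_index)
    return arch_flag(major, minor)


if __name__ == "__main__":
    print(arch_flag(3, 5))       # -arch=sm_35, the value for a Tesla K40
    print(detect_arch_flag())    # depends on the local GPU
```

The printed flag is what would go into faster_rcnn/make.sh before rebuilding.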

@longcw
Owner

longcw commented Feb 20, 2017

I've only just started using PyTorch as well, so let's exchange ideas~
My email is [email protected]; I can send you my WeChat, QQ, and so on by email.

@iFighting
Author

OK.

longcw closed this as completed Feb 20, 2017

@lancejchen

For other GPUs:

# Which CUDA capabilities do we want to pre-build for?
# https://developer.nvidia.com/cuda-gpus
#   Compute/shader model   Cards
#   6.1                    P4, P40, Titan X so CUDA_MODEL = 61
#   6.0                    P100 so CUDA_MODEL = 60
#   5.2                    M40
#   3.7                    K80
#   3.5                    K40, K20
#   3.0                    K10, Grid K520 (AWS G2)
#   Other Nvidia shader models should work, but they will require extra startup
#   time as the code is pre-optimized for them.
CUDA_MODELS=30 35 37 52 60 61

Credit to https://github.com/mldbai/mldb/blob/master/ext/tensorflow.mk
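
To build for several of the compute capabilities listed above at once, nvcc accepts one `-gencode arch=compute_XY,code=sm_XY` pair per target in place of a single -arch value. The sketch below only assembles that flag string; `gencode_flags` is a hypothetical helper, and where the flags go in faster_rcnn/make.sh is left to the reader. The installed CUDA toolkit must support every architecture that is listed.

```python
def gencode_flags(models):
    """Build nvcc -gencode flags from a space-separated list such as "35 61"."""
    flags = []
    for model in models.split():
        # One compute/sm pair per target architecture.
        flags.append(f"-gencode arch=compute_{model},code=sm_{model}")
    return " ".join(flags)


if __name__ == "__main__":
    # Targets a K40 (3.5) and a GTX 1080 (6.1), the two cards in this thread.
    print(gencode_flags("35 61"))
    # -gencode arch=compute_35,code=sm_35 -gencode arch=compute_61,code=sm_61
```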

@nlqq

nlqq commented Oct 12, 2018

Hi, could you include me as well? I'm new to PyTorch, and I've run MXNet, TF, and others.

@nlqq

nlqq commented Oct 12, 2018

[email protected]
