A question when training #3

Open
xiaowenhe opened this issue Jul 31, 2017 · 10 comments

@xiaowenhe

@Zardinality, thanks for your answer. But the error still occurs, even after I changed the flag to -arch=sm_37 (K80) in make.sh and setup.py and reran make.

@paulcx

paulcx commented Jul 31, 2017

Another error when training:

kittivoc_train
kittivoc_val
kittivoc_trainval
kittivoc_test
kittivoc_train
kittivoc_val
kittivoc_trainval
kittivoc_test
kittivoc_train
kittivoc_val
kittivoc_trainval
kittivoc_test
nthu_71
nthu_370
Traceback (most recent call last):
  File "./faster_rcnn/train_net.py", line 30, in <module>
    from lib.networks.factory import get_network
  File "/root/tf_deformable_frcnn/lib/networks/__init__.py", line 8, in <module>
    from .VGGnet_train import VGGnet_train
  File "/root/tf_deformable_frcnn/lib/networks/VGGnet_train.py", line 2, in <module>
    from .network import Network
  File "/root/tf_deformable_frcnn/lib/networks/network.py", line 13, in <module>
    from ..deform_conv_layer import deform_conv_op as deform_conv_op
  File "/root/tf_deformable_frcnn/lib/deform_conv_layer/deform_conv_op.py", line 8, in <module>
    _deform_conv_module = tf.load_op_library(filename)
  File "/usr/local/lib/python/dist-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library
    None, None, error_msg, error_code)
tensorflow.python.framework.errors_impl.NotFoundError: /root/tf_deformable_frcnn/lib/deform_conv_layer/deform_conv.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringB5cxx11Ev

Any thoughts?
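
The `B5cxx11` fragment in the undefined symbol is the libstdc++ ABI tag that GCC 5 and later attach to functions using the new `std::string`. As a minimal sketch, assuming only the simple nested-name shape seen in this traceback, the mangled name can be split into its parts so the tag is easy to spot. The helper `parse_nested_name` is hypothetical and is not part of this repository or of TensorFlow.

```python
def parse_nested_name(symbol):
    """Split a simple Itanium-mangled nested name into (name parts, ABI tags).

    Handles only the `_ZN<len><id>...[B<len><tag>]E...` shape seen in the
    traceback; templates, substitutions, and qualifiers are not supported.
    """
    prefix = "_ZN"
    if not symbol.startswith(prefix):
        raise ValueError("not a nested mangled name")
    i = len(prefix)
    parts, tags = [], []
    while i < len(symbol) and symbol[i] != "E":
        # A leading 'B' marks an ABI tag instead of a namespace/function name.
        is_tag = symbol[i] == "B"
        if is_tag:
            i += 1
        # Read the decimal length prefix of the next identifier.
        j = i
        while j < len(symbol) and symbol[j].isdigit():
            j += 1
        if j == i:
            raise ValueError("unsupported mangling at offset %d" % i)
        length = int(symbol[i:j])
        ident = symbol[j:j + length]
        (tags if is_tag else parts).append(ident)
        i = j + length
    return parts, tags


symbol = "_ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringB5cxx11Ev"
parts, tags = parse_nested_name(symbol)
print("::".join(parts), tags)
```

A `cxx11` tag in the requested symbol suggests that the `.so` was compiled against the new ABI while the TensorFlow binary it loads against exports the untagged variant. That mismatch is what `-D_GLIBCXX_USE_CXX11_ABI=0` is meant to address.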

@Zardinality
Owner

Zardinality commented Jul 31, 2017

@xiaowenhe I am aware of your problem; please don't open two issues for the same thing.

@Zardinality
Owner

@paulcx It might be related to the gcc version and certain flags. I will add some lines to make.sh and the FAQ. For now, if you want to fix it right away, check this issue.

@paulcx

paulcx commented Jul 31, 2017

@Zardinality I have tried both solutions, but neither works so far; I still get the same error.

g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc \
roi_pooling_op.cu.o -I $TF_INC -fPIC -D GOOGLE_CUDA -lcudart -L $CUDA_HOME/lib64

or

g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=1 -o roi_pooling.so roi_pooling_op.cc \
roi_pooling_op.cu.o -I $TF_INC -fPIC -D GOOGLE_CUDA -lcudart -L $CUDA_HOME/lib64

Am I right about the solution?
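
The two commands above differ only in the value of `_GLIBCXX_USE_CXX11_ABI`. As a rough sketch, the command can be assembled in Python so the one varying argument is explicit. The helper `build_link_command` and its parameters are hypothetical; the `$TF_INC` and `$CUDA_HOME` placeholders are kept as literal strings rather than expanded.

```python
def build_link_command(abi, tf_inc="$TF_INC", cuda_home="$CUDA_HOME",
                       name="roi_pooling"):
    """Return the g++ argument list for one op, with the ABI macro set to `abi`."""
    if abi not in (0, 1):
        raise ValueError("abi must be 0 or 1")
    return [
        "g++", "-std=c++11", "-shared",
        "-D_GLIBCXX_USE_CXX11_ABI=%d" % abi,
        "-o", name + ".so",
        name + "_op.cc", name + "_op.cu.o",
        "-I", tf_inc, "-fPIC", "-D", "GOOGLE_CUDA",
        "-lcudart", "-L", cuda_home + "/lib64",
    ]


old_abi = build_link_command(0)
new_abi = build_link_command(1)
# Collect the positions where the two variants disagree.
diff = [(a, b) for a, b in zip(old_abi, new_abi) if a != b]
print(" ".join(old_abi))
print(diff)
```

Two things may be worth checking, though neither is confirmed in this thread. The macro only affects sources compiled by this command, so the pre-built `roi_pooling_op.cu.o` keeps whatever setting it was compiled with. Also, the traceback names `deform_conv.so`, while these commands build `roi_pooling.so`.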

@Zardinality
Owner

@paulcx Make sure you use the recompiled version, or maybe try removing the related flag. Which version of g++ do you use?

@paulcx

paulcx commented Jul 31, 2017

@Zardinality What do you mean by using the recompiled version?
g++ is 5.4.0

@Zardinality
Owner

@paulcx I mean manually remove the previously generated files, such as the .o and .so files, then recompile. Also, since you use g++ 5 (which I haven't had a chance to test), you should compile with -D_GLIBCXX_USE_CXX11_ABI=0.
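
The clean-up step described above can be sketched in Python as follows. The helper `remove_build_artifacts` is hypothetical and not part of the repository; it assumes the stale objects live somewhere under a single directory such as `lib`, and it deletes every matching file there, so review the path before running it.

```python
import os


def remove_build_artifacts(root, suffixes=(".o", ".so")):
    """Delete files under `root` whose names end in one of `suffixes`.

    Returns the sorted list of removed paths so the caller can review them.
    """
    suffixes = tuple(suffixes)  # str.endswith needs a tuple, not a list
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(suffixes):
                path = os.path.join(dirpath, filename)
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Calling `remove_build_artifacts("lib")` from the repository root and then rerunning make.sh should force every object, including the `.cu.o` files, to be rebuilt with the current flags.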

@Zardinality
Owner

@paulcx Hi, have you worked out where the problem is? I have updated the README to include a workaround that others gave in another issue, which solves the same problem.

@paulcx

paulcx commented Aug 7, 2017

@Zardinality Not yet. The solution does not work, at least for g++ 5.4.

@selkerdawy

@paulcx check out this; it solved a similar undefined symbol problem for me.
