Failed to run demo.py with undefined symbol, how can I solve this problem? #232

soldatjiang · 2017-11-06T16:23:22Z

    soldat@soldat:~/Program/Faster-RCNN_TF$ python ./tools/demo.py --model ./data/faster_rcnn_models/VGG16_faster_rcnn_final.caffemodel
    Traceback (most recent call last):
      File "./tools/demo.py", line 11, in <module>
        from networks.factory import get_network
      File "/home/soldat/Program/Faster-RCNN_TF/tools/../lib/networks/__init__.py", line 8, in <module>
        from .VGGnet_train import VGGnet_train
      File "/home/soldat/Program/Faster-RCNN_TF/tools/../lib/networks/VGGnet_train.py", line 2, in <module>
        from networks.network import Network
      File "/home/soldat/Program/Faster-RCNN_TF/tools/../lib/networks/network.py", line 3, in <module>
        import roi_pooling_layer.roi_pooling_op as roi_pool_op
      File "/home/soldat/Program/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling_op.py", line 5, in <module>
        _roi_pooling_module = tf.load_op_library(filename)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
        lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
        c_api.TF_GetCode(self.status.status))
    tensorflow.python.framework.errors_impl.NotFoundError: /home/soldat/Program/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

awilliamson · 2017-11-07T12:41:17Z

I am getting the same issue. Perhaps tensorflow/tensorflow#13607 is related?

apennisi · 2017-11-10T22:47:03Z

I was not able to solve the problem. Were you?

awilliamson · 2017-11-10T22:54:45Z

@apennisi It was the culmination of a few days' worth of bashing my head against a wall and collating from many sources on the fly.
I have my fork with the 2to3 conversion (which is what I presume caused your issue). Most of the changes were Makefile changes (here):

1. Ensure the CUDA_PATH at the top is changed to your path, or alternatively replace it in-line. This way the CUDA section gets executed.
2. Define TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())') and modify the g++ call to include -L$TF_LIB -ltensorflow_framework.
3. The default arch set by this repository does not account for 10-series cards. I was running on a few GTX Titan XP (and 1080) cards, so I set -arch=sm_61.

I'm not claiming this will fix your issues. You may then encounter issues when running the demo, due to encoding problems caused by the 2to3 conversion. The solution to these was a combination of eragonruan/text-detection-ctpn and CharlesShang/TFFRCNN.

This may be a little beyond the scope of your original error, but I believe the cause was your attempt at 2to3 conversion, alongside the Makefile issues on your system. If you could give feedback on any of the above steps, this would be very useful; additionally, this may provide a single location for others who, like us, were struggling with these errors.
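
The TF_LIB linking change described above can be sketched in Python as a helper that assembles the g++ arguments. This is only an illustration under stated assumptions, not the repository's make.sh: `build_link_command` and its parameters are hypothetical names, and the paths in the usage line are placeholders. On a real machine the include and library directories would come from `tf.sysconfig.get_include()` and `tf.sysconfig.get_lib()`.

```python
import os


def build_link_command(tf_inc, tf_lib, cuda_path, link_framework=True):
    """Assemble the g++ call for roi_pooling.so as a list of arguments.

    Hypothetical helper for illustration; it mirrors the flags quoted in
    this thread and does not run the compiler.
    """
    cmd = [
        "g++", "-std=c++11", "-shared",
        "-o", "roi_pooling.so",
        "roi_pooling_op.cc", "roi_pooling_op.cu.o",
        "-I", tf_inc,
        "-D", "GOOGLE_CUDA=1",
        "-fPIC",
        "-lcudart",
        "-L", os.path.join(cuda_path, "lib64"),
    ]
    if link_framework:
        # Link against libtensorflow_framework so that TensorFlow symbols
        # such as tensorflow::OpKernel can be resolved at load time.
        cmd += ["-L", tf_lib, "-ltensorflow_framework"]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; substitute the values from tf.sysconfig and your CUDA install.
    print(" ".join(build_link_command("/path/to/tf/include", "/path/to/tf", "/usr/local/cuda")))
```

Passing `link_framework=False` reproduces the original command without the extra flags, which may be what an install that ships no libtensorflow_framework library needs.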

apennisi · 2017-11-10T23:03:31Z

@awilliamson I already tried all these fixes without success. I always get that error. I already converted from Python 2 to Python 3, and it works on my MacBook (CPU). I am trying on a server with a Tesla K80 and I get this error. Do you have any other suggestions?

awilliamson · 2017-11-10T23:12:55Z

@apennisi Not quite sure without more information regarding your environment etc. It does sound odd, as the fix for your specific undefined symbol is TF_LIB linking in step 2. You shouldn't be getting that error on a CPU only implementation to my knowledge (ensure you pass the cpu only flag to Faster-RCNN). Additionally for a K80, it is a different architecture. This article shows some of the sm_XX codes for various cards and their respective CUDA variants.
I admit, it is a hard problem to solve, and took me a day or two to collate enough information to solve it for my specific platform. Feel free to e-mail me on my institutional e-mail address ( shouldn't be hard to find / figure out ;) ) if you want to discuss this further. If we can figure out your problem, then it might be suitable to respond here once found.
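
As a rough illustration of the architecture point above, the lookup below maps the cards named in this thread to an nvcc `-arch` value. The compute capabilities used here (3.7 for the Tesla K80; 6.1 for the GTX 1080, GTX 1080 Ti and Titan Xp) are assumptions to verify against NVIDIA's compute capability list for your card, and `ARCH_BY_GPU` and `arch_flag` are hypothetical names.

```python
# Compute capabilities for the GPUs mentioned in this thread (assumed values;
# verify against NVIDIA's compute capability list before relying on them).
ARCH_BY_GPU = {
    "tesla k80": "sm_37",
    "gtx 1080": "sm_61",
    "gtx 1080 ti": "sm_61",
    "titan xp": "sm_61",
}


def arch_flag(gpu_name):
    """Return the nvcc -arch flag for a known GPU name, or raise KeyError."""
    # Normalize case and whitespace so "  GTX  1080 Ti" matches "gtx 1080 ti".
    key = " ".join(gpu_name.lower().split())
    return "-arch=" + ARCH_BY_GPU[key]


if __name__ == "__main__":
    for name in ("Tesla K80", "GTX 1080 Ti"):
        print(name, "->", arch_flag(name))
```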

apennisi · 2017-11-10T23:24:55Z

Of course, I changed the architecture! Did you change anything else?

ambr89 · 2017-12-15T17:05:05Z

I solved it.

I downgraded TensorFlow to 1.3.

I have a GTX 1080 Ti. I changed demo.py at line 114:

    config = tf.ConfigProto(allow_soft_placement=True)
    config.gpu_options.allow_growth = True
    sess = tf.Session(config=config)

But your second step doesn't work for me. In make.sh:

    g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
        roi_pooling_op.cu.o -I $TF_INC -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS \
        -D_GLIBCXX_USE_CXX11_ABI=0 -lcudart -L $CUDA_PATH/lib64 -L $TF_LIB -ltensorflow_framework

fails with:

    /usr/bin/ld: cannot find -ltensorflow_framework
    collect2: error: ld returned 1 exit status

trikim · 2017-12-31T15:26:01Z

I think the problem is that your TensorFlow version is too new.
My CUDA version is 8.0 and my cuDNN version is 6.0.
The first time, I installed TensorFlow with "pip install --user tensorflow-gpu", which gave me version 1.4.1, and I hit the same problem described above.
The second time, I downloaded the "Linux GPU: Python 2" package from https://github.com/tensorflow/tensorflow and installed it with "pip install tf_nightly_gpu-1.head-cp27-none-linux_x86_64.whl". This changed the TensorFlow version to 1.4.0-dev20170920.
In Faster-RCNN_TF/lib, before running "make", I edited ~/.local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h as described in #245.
Finally, I was able to run demo.py:
python ./tools/demo.py --model ./models/VGGnet_fast_rcnn_iter_70000.ckpt

xtanitfy · 2018-02-08T02:33:47Z

awilliamson is right! I used his approach and it solved the problem.
Add this compile flag:
LIBS_FLGAS=-L/usr/local/lib/python2.7/dist-packages/tensorflow -ltensorflow_framework

wtliao · 2018-03-01T14:33:00Z

@awilliamson Hi, thanks for your solution, but it does not work for me. I now get a new error:

tensorflow.python.framework.errors_impl.NotFoundError: /home/wtliao/work_space/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumES3_
It is only a little different. Could you help me? Thanks.
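
One possible reading of this second symbol: the `B5cxx11` fragment is how GCC mangles the `[abi:cxx11]` tag, which suggests roi_pooling.so was compiled with the newer libstdc++ ABI while the installed TensorFlow was built with the old one. If so, adding -D_GLIBCXX_USE_CXX11_ABI=0 to the g++ call, as in ambr89's command above, may help. The check below is a heuristic sketch that only classifies a symbol string; `has_cxx11_abi_tag` and `suggest_fix` are hypothetical helpers, not part of TensorFlow.

```python
def has_cxx11_abi_tag(symbol):
    """True if a mangled C++ symbol carries GCC's [abi:cxx11] tag."""
    return "B5cxx11" in symbol


def suggest_fix(symbol):
    """Return a hint for an 'undefined symbol' message (heuristic only)."""
    if has_cxx11_abi_tag(symbol):
        return "possible ABI mismatch: try -D_GLIBCXX_USE_CXX11_ABI=0"
    if "10tensorflow" in symbol:
        # A plain TensorFlow symbol, e.g. typeinfo for tensorflow::OpKernel.
        return "TensorFlow symbol missing: try -L $TF_LIB -ltensorflow_framework"
    return "unrecognized symbol: no suggestion"


if __name__ == "__main__":
    for sym in (
        "_ZTIN10tensorflow8OpKernelE",
        "_ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumES3_",
    ):
        print(sym, "->", suggest_fix(sym))
```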

wtliao · 2018-03-02T06:48:51Z

@awilliamson The only way I could fix this problem was to use TF 1.3 + CUDA 8.0 + cuDNN 6.0... so sad.

ChanChiChoi · 2018-07-09T11:06:40Z

My environment is CUDA 9.0, TensorFlow 1.8.0, Python 3.6.
This is my solution; just change:

    g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
        roi_pooling_op.cu.o -I $TF_INC  -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS \
        -lcudart -L $CUDA_PATH/lib64

to

    TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
    g++ -std=c++11 -shared  -o roi_pooling.so  roi_pooling_op.cc  \
         roi_pooling_op.cu.o -I $TF_INC  -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS  \
         -lcudart -L $CUDA_PATH/lib64  -L $TF_LIB -ltensorflow_framework

cfh3c · 2018-07-10T12:46:33Z

You can use both the include and lib paths to solve it:

    TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
    TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

    nvcc -std=c++11 -c -o roi_pooling_op_gpu.cu.o roi_pooling_op_gpu.cu.cc \
        -I $TF_INC -L $TF_LIB -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CXXFLAGS

    g++ -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -shared -o ./build/roi_pooling.so roi_pooling_op.cc \
        roi_pooling_op_gpu.cu.o -I $TF_INC -fPIC $CXXFLAGS -D_GLIBCXX_USE_CXX11_ABI=0 -lcudart -L $CUDA_HOME/lib64 -L $TF_LIB -ltensorflow_framework

    rm -rf roi_pooling_op_gpu.cu.o

chenyanyin · 2019-07-09T03:11:12Z

@ChanChiChoi wrote:
My environment is CUDA 9.0, TensorFlow 1.8.0, Python 3.6.
This is my solution; just change:

    g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
        roi_pooling_op.cu.o -I $TF_INC  -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS \
        -lcudart -L $CUDA_PATH/lib64

to

    TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
    g++ -std=c++11 -shared  -o roi_pooling.so  roi_pooling_op.cc  \
         roi_pooling_op.cu.o -I $TF_INC  -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS  \
         -lcudart -L $CUDA_PATH/lib64  -L $TF_LIB -ltensorflow_framework

Hello, my environment is the same as yours (CUDA 9.0 too), but I got an error with the change you described. The error is:

    ImportError: Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
        from tensorflow.python.pywrap_tensorflow_internal import *
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
        _pywrap_tensorflow_internal = swig_import_helper()
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
        _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
    ImportError: libcusolver.so.8.0: cannot open shared object file: No such file or directory

emilyfy · 2019-08-02T03:42:07Z

@ambr89 I got the same error as you, compiling with -ltensorflow_framework didn't work. I tried to look for libtensorflow_framework.so and couldn't find it but found libtensorflow_framework.so.1 instead inside /usr/local/lib/python2.7/dist-packages/tensorflow. So I made a copy called libtensorflow_framework.so and that fixed it. Hope that helps!
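
The workaround above can be sketched as follows. Note that this sketch swaps the copy for a symlink, which is a substitution rather than what was described; `ensure_unversioned_lib` is a hypothetical helper, and it assumes a platform where `os.symlink` is permitted.

```python
import glob
import os


def ensure_unversioned_lib(lib_dir, name="libtensorflow_framework.so"):
    """Make sure lib_dir/name exists so `-ltensorflow_framework` can be resolved.

    If only versioned files such as name + ".1" are present, create a symlink
    called `name` pointing at the first one found (sorted by file name).
    Returns the path of the unversioned file, or None if nothing matched.
    """
    target = os.path.join(lib_dir, name)
    if os.path.exists(target):
        return target
    versioned = sorted(glob.glob(target + ".*"))
    if not versioned:
        return None
    # Relative link, so it stays valid if the directory is moved.
    os.symlink(os.path.basename(versioned[0]), target)
    return target
```

Calling it on a system directory such as /usr/local/lib/python2.7/dist-packages/tensorflow will probably need write permission there.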

lijf138 · 2020-06-22T18:04:37Z

@soldatjiang I get the same error:

tensorflow.python.framework.errors_impl.NotFoundError: /home/ii/app/Faster-RCNN_TF-master/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

@awilliamson I hope you can help!
My environment is CUDA 9.0, cuDNN 7.1.2, TensorFlow 1.10.0, Python 3.5.5.
