
Protobuf version mismatch (3.5.0 required, 3.5.0 installed but 3.4.0 detected) #15

Open
albornet opened this issue Jun 13, 2018 · 2 comments

albornet commented Jun 13, 2018

Hi!
When I run the CDP4_experiment on the Neurorobotics Platform, I hit a protobuf version issue. At some point during the launch, the backend returns the following error, and the launch hangs forever at the step "Loading transfer function: image_to_saliency":

[libprotobuf FATAL external/protobuf_archive/src/google/protobuf/stubs/common.cc:68] This program requires version 3.5.0 of the Protocol Buffer runtime library, but the installed version is 3.4.0. Please update your library. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "google/protobuf/descriptor.pb.cc".)

The whole backend output is available here, if needed.

When I comment out the image_to_saliency TF in the .bibi file, the experiment can be launched and played (although nothing happens when I press the play button).

I looked for google/protobuf both in my system packages (/usr/local/lib/python2.7/dist-packages) and in my platform_venv ($HOME/.opt/platform_venv/lib/python2.7/site-packages). Both versions were 3.5.x, but to be sure I uninstalled and reinstalled them with pip, so that both are now at "3.5.0.post1". However, the same error still arises, and I cannot figure out where a 3.4.0 version of protobuf is installed.
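For reference, this is the kind of check I ran in each interpreter to see which protobuf actually gets imported and from where (purely a diagnostic snippet, not part of the platform code):

```python
# Diagnostic only: report the protobuf version and location seen by this interpreter.
import google.protobuf
print(google.protobuf.__version__)  # runtime version actually imported
print(google.protobuf.__file__)     # path of the package that was picked up

# Directories searched before site-packages; a stale protobuf earlier in this
# list (e.g. pulled in via PYTHONPATH) would shadow the pip-installed 3.5.0.
import sys
print(sys.path)
```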

Have you ever encountered this error? I saw the same issue reported for a very similar experiment on the forum (here), but I could not understand how it was solved.

Thank you for your help!
Best,
Alban

jackokaiser (Member) commented

Hey Alban,

That's a tricky one: both TensorFlow and Gazebo rely on protobuf, but of course they want different versions. This is why, in image_to_saliency.py, we add a folder to the PYTHONPATH programmatically.
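Roughly, the idea is something like the sketch below; the directory is only a placeholder for illustration, see image_to_saliency.py for what is actually done:

```python
import sys

# Prepend the directory holding the protobuf version that TensorFlow was built
# against, so it shadows the older protobuf that Gazebo/ROS puts on the path.
# NOTE: this path is a placeholder; image_to_saliency.py sets the real one.
TF_PROTOBUF_DIR = '/path/to/tensorflow_venv/lib/python2.7/site-packages'
if TF_PROTOBUF_DIR not in sys.path:
    sys.path.insert(0, TF_PROTOBUF_DIR)

import tensorflow as tf  # protobuf now resolves from the prepended directory
```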


albornet commented Jul 2, 2018

Hi!
Indeed, there was a small irregularity in my PYTHONPATH. After correcting it, the same error still occurred. What finally fixed it was uninstalling all my CUDA / cuDNN versions, reinstalling CUDA 9.0 (plus the corresponding cuDNN), and then tensorflow-1.6.0. Now it works better (the libprotobuf version mismatch no longer arises!).

However, I now get an "out_of_memory" error when I start the experiment (the experiment launches, but the saliency network is never initialized). I noticed in the output of the nrp-backend terminal that the saliency network is mapped onto my GPU twice. I find this strange, because it never happens when I use the attention package on its own (outside the NRP).

Here is the relevant part of the terminal output (or full output here):

"2018-07-02 11:34:09 GMT+0200 [REQUEST from ::ffff:127.0.0.1] GET /storage/CDP4_experiment_0/experiment_configuration.exc?byname=true
Now using node v0.10.48 (npm v2.15.1)
2018-07-02 11:34:11.014513: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-07-02 11:34:11.108949: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-07-02 11:34:11.109649: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:01:00.0
totalMemory: 7.93GiB freeMemory: 7.59GiB
2018-07-02 11:34:11.109683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-07-02 11:34:11.324549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7331 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
Found /home/alban/Documents/NRP/gzweb/gzbridge/ws_server.js
Starting node: OK.

/home/alban/Documents/NRP/CLE/hbp_nrp_cle/hbp_nrp_cle/robotsim/GazeboHelper.py:119: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
2018-07-02 11:34:11,843 [Thread-7 ] [hbp_nrp_cles] [INFO] RobotAbs: /tmp/nrp-simulation-dir/hollie.sdf
Pose message filter parameters:
minimum seconds between successive messages: 0.02
minimum XYZ distance squared between successive messages: 0.00001
minimum Quaternion distance squared between successive messages: 0.0001
Mon Jul 02 2018 11:34:11 GMT+0200 (CEST) Server is listening on port 7681
[ INFO] [1530524052.282033853]: Camera Plugin (robotNamespace = ), Info: the 'robotNamespace' param did not exit
[ INFO] [1530524052.285858891]: Camera Plugin (ns = ) <tf_prefix_>, set to ""
[ INFO] [1530524052.286094616]: Camera Plugin (robotNamespace = ), Info: the 'robotNamespace' param did not exit
[ INFO] [1530524052.288751442]: Camera Plugin (ns = ) <tf_prefix_>, set to ""
2018-07-02 11:34:12,327 [Thread-7 ] [hbp_nrp_cles] [INFO] Preparing CLE Server
2018-07-02 11:34:12,337 [Thread-7 ] [hbp_nrp_cle.] [INFO] Robot control adapter initialized
2018-07-02 11:34:12,359 [Thread-7 ] [hbp_nrp_cle.] [INFO] neuronal simulator initialized
2018-07-02 11:34:12,359 [Thread-7 ] [BrainLoader ] [INFO] Loading brain model from python: /tmp/nrp-simulation-dir/idle_brain.py
2018-07-02 11:34:12,378 [Thread-7 ] [hbp_nrp_cle.] [INFO] Saving brain source
2018-07-02 11:34:12,378 [Thread-7 ] [hbp_nrp_cle.] [INFO] Initialize transfer functions node tfnode
2018-07-02 11:34:12,378 [Thread-7 ] [hbp_nrp_cle.] [INFO] PyNN communication adapter initialized
2018-07-02 11:34:12,379 [Thread-7 ] [hbp_nrp_cle.] [WARNING] ROS node already initialized with another name
2018-07-02 11:34:12,384 [Thread-7 ] [hbp_nrp_cles] [INFO] Registering ROS Service handlers
2018-07-02 11:34:12,386 [Thread-7 ] [hbp_nrp_cles] [INFO] Registering ROS Service handlers
2018-07-02 11:34:12,963 [Thread-3 ] [rospy.intern] [INFO] topic[/ros_cle_simulation/0/lifecycle] adding connection to [/nrp_backend], count 0
2018-07-02 11:34:13.280555: I tensorflow/core/platform/cpu_feature_guard.cc:140]
2018-07-02 11:34:13,282 [Thread-18 ] [rospy.intern] [INFO] topic[/clock] adding connection to [http://127.0.0.1:37719/], count 0
2018-07-02 11:34:13,282 [Thread-17 ] [rospy.intern] [INFO] topic[/ros_cle_simulation/0/lifecycle] adding connection to [http://127.0.0.1:35463/], count 0
2018-07-02 11:34:13,282 [Thread-15 ] [rospy.intern] [INFO] topic[/ros_cle_simulation/0/lifecycle] adding connection to [http://127.0.0.1:36595/], count 0
2018-07-02 11:34:13,284 [Thread-3 ] [rospy.intern] [INFO] topic[/ros_cle_simulation/0/lifecycle] adding connection to [/ros_cle_simulation_14404_1530524010422], count 1
2018-07-02 11:34:13.358846: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-07-02 11:34:13.359468: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:01:00.0
totalMemory: 7.93GiB freeMemory: 275.06MiB
2018-07-02 11:34:13.359506: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-07-02 11:34:13.621509: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 219 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-07-02 11:34:13.622443: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 219.31M (229965824 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY"
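If the double allocation turns out to be the culprit, something like the sketch below might at least stop the first session from reserving nearly all GPU memory. This assumes the saliency network lets me pass a TensorFlow session config, which I am not sure is the case:

```python
import tensorflow as tf

# By default TF 1.x reserves almost all free GPU memory for the first session,
# which would explain why the second instantiation only sees ~275 MiB.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
# config.gpu_options.per_process_gpu_memory_fraction = 0.4  # or hard-cap the share

sess = tf.Session(config=config)
```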

Do you have any idea why this happens? I don't think it should, should it?

Thanks for the help!!
Alban
