Building gym_tensorflow #10

Open
Nostrademous opened this issue May 11, 2018 · 19 comments

Comments

@Nostrademous

Nostrademous commented May 11, 2018

I'm getting the following errors from a fresh git clone after following the README instructions.
Side note: I'm installing this on Ubuntu under Windows Subsystem for Linux.

(env) nostrademous@DESKTOP-J9431IB:~/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow$ make clean
rm -rf gym_tensorflow.so
(env) nostrademous@DESKTOP-J9431IB:~/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow$ make
g++ -std=c++11 -shared -fPIC -I/home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include -I/home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/external/nsync/public -L/home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/core -D_GLIBCXX_USE_CXX11_ABI=0 -O2 -DGOOGLE_CUDA=1 -Wl,-rpath=/build .//*.cpp .//ops/*.cpp -ltensorflow_framework -o gym_tensorflow.so
In file included from .//tf_env.cpp:22:0:
.//tf_env.cpp: In member function ‘virtual void EnvironmentMakeOp::Compute(tensorflow::OpKernelContext*)’:
.//tf_env.cpp:102:69: error: ‘MakeResourceHandleToOutput’ was not declared in this scope
                                     MakeTypeIndex<BaseEnvironment>()));
                                                                     ^
/home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:1309:29: note: in definition of macro ‘OP_REQUIRES_OK’
     ::tensorflow::Status _s(STATUS);    \
                             ^
In file included from /home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/tensorflow/stream_executor/platform/mutex.h:25:0,
                 from /home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/tensorflow/stream_executor/dso_loader.h:29,
                 from /home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/platform/default/stream_executor.h:25,
                 from /home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/platform/stream_executor.h:24,
                 from .//ops/indexedmatmul.cpp:20:
/home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/tensorflow/stream_executor/platform/default/mutex.h:32:19: error: ‘tensorflow::tf_shared_lock’ has not been declared
 using tensorflow::tf_shared_lock;
                   ^
In file included from /home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/platform/default/stream_executor.h:31:0,
                 from /home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/platform/stream_executor.h:24,
                 from .//ops/indexedmatmul.cpp:20:
/home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/tensorflow/stream_executor/stream.h: In member function ‘bool perftools::gputools::Stream::InErrorState() const’:
/home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/tensorflow/stream_executor/stream.h:2005:5: error: ‘tf_shared_lock’ was not declared in this scope
     tf_shared_lock lock{mu_};
     ^
Makefile:45: recipe for target 'gym_tensorflow.so' failed
make: *** [gym_tensorflow.so] Error 1

NOTE: I do have slightly newer versions of some of the Python packages, but I don't think that's the cause of the errors I'm hitting. Here is the pip list anyway:

(env) nostrademous@DESKTOP-J9431IB:~/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow$ pip list
Package        Version
-------------- -----------
absl-py        0.2.0
appdirs        1.4.3
astor          0.6.2
bleach         1.5.0
click          6.7
gast           0.2.0
grpcio         1.11.0
gym            0.9.4
h5py           2.7.0
html5lib       0.9999999
Markdown       2.6.11
mujoco-py      0.5.7
numpy          1.14.3
packaging      16.8
pip            10.0.1
protobuf       3.5.2.post1
pyglet         1.2.4
PyOpenGL       3.1.0
pyparsing      2.2.0
redis          2.10.5
requests       2.14.2
setuptools     28.8.0
six            1.11.0
tensorboard    1.8.0
tensorflow     0.12.1
tensorflow-gpu 1.8.0
termcolor      1.1.0
Werkzeug       0.14.1
wheel          0.31.0
(env) nostrademous@DESKTOP-J9431IB:~/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow$
@fps7806
Contributor

fps7806 commented May 11, 2018

I wonder if it has anything to do with the two different tensorflow versions you have installed. Try pip uninstall tensorflow and keep tensorflow-gpu as is.
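
The pip list above shows tensorflow 0.12.1 next to tensorflow-gpu 1.8.0. Both distributions provide the same `tensorflow` import package, so mismatched versions can leave a mixed set of headers and libraries in site-packages. A minimal sketch of spotting that from `pip list` output follows; the helper names are made up for illustration, and the sample text is trimmed from the listing above.

```python
def parse_pip_list(text):
    """Return {package_name: version} from `pip list` column output."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and the dashed separator.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        packages[parts[0].lower()] = parts[1]
    return packages


def tensorflow_variants(packages):
    """Return the installed distributions that provide the `tensorflow` module."""
    names = ("tensorflow", "tensorflow-gpu")
    return {name: packages[name] for name in names if name in packages}


# Trimmed copy of the pip list posted in this issue.
PIP_LIST = """\
Package        Version
-------------- -----------
numpy          1.14.3
tensorboard    1.8.0
tensorflow     0.12.1
tensorflow-gpu 1.8.0
"""

variants = tensorflow_variants(parse_pip_list(PIP_LIST))
conflict = len(set(variants.values())) > 1
print(variants, "conflict:", conflict)
```

If `conflict` is true, aligning the two versions (or keeping only one of the distributions) is a reasonable first thing to try.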

@Nostrademous
Author

Tried that, no luck.

(env) nostrademous@DESKTOP-J9431IB:~/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow$ make
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'tensorflow' has no attribute 'sysconfig'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'tensorflow' has no attribute 'sysconfig'
g++ -std=c++11 -shared -fPIC -I -I/external/nsync/public -L -D_GLIBCXX_USE_CXX11_ABI=0 -O2 -DGOOGLE_CUDA=1 -Wl,-rpath=/build .//*.cpp .//ops/*.cpp -ltensorflow_framework -o gym_tensorflow.so
.//tf_env.cpp:22:49: fatal error: tensorflow/core/framework/op_kernel.h: No such file or directory
compilation terminated.
.//ops/indexedmatmul.cpp:7:42: fatal error: tensorflow/core/framework/op.h: No such file or directory
compilation terminated.
Makefile:45: recipe for target 'gym_tensorflow.so' failed
make: *** [gym_tensorflow.so] Error 1
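
The two `AttributeError` tracebacks followed by an empty `-I` and `-L` in the g++ command suggest that the Makefile asks Python for TensorFlow's include and library directories through `tf.sysconfig`, and that the lookup failed. That is an inference from the output above, not something confirmed against the Makefile. The sketch below shows the same probe with a guard; `tf.sysconfig.get_include()` and `tf.sysconfig.get_lib()` are real TensorFlow functions, while `tf_build_paths` and the stand-in modules are hypothetical.

```python
from types import SimpleNamespace


def tf_build_paths(tf_module):
    """Return include/lib directories, or None if `sysconfig` is unavailable."""
    sysconfig = getattr(tf_module, "sysconfig", None)
    if sysconfig is None:
        return None
    return {"include": sysconfig.get_include(), "lib": sysconfig.get_lib()}


# Stand-ins for demonstration only; a real check would pass the result of
# `import tensorflow` instead of these fake modules.
broken_tf = SimpleNamespace()  # mimics a module with no `sysconfig` attribute
working_tf = SimpleNamespace(
    sysconfig=SimpleNamespace(
        get_include=lambda: "/path/to/tensorflow/include",  # placeholder path
        get_lib=lambda: "/path/to/tensorflow",              # placeholder path
    )
)

print(tf_build_paths(broken_tf))
print(tf_build_paths(working_tf))
```

A `None` result here corresponds to the situation in the log: the compiler flags end up empty and the TensorFlow headers cannot be found.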

@Nostrademous
Author

Nostrademous commented May 12, 2018

I did get the gym to compile after reinstalling the latest version of tensorflow (1.8.0), matching my tensorflow-gpu version. The previous 0.12 version came from the top-level pip requirements file in the repo, which I guess is obsolete.

However, now that it compiles, I can't get ga.py or es.py to work, because even though I compiled the gym without ALE support, those scripts apparently require it.

(env) nostrademous@DESKTOP-J9431IB:~/ML/deep-neuroevolution/gpu_implementation$ python es.py configurations/es_atari_config.json
/home/nostrademous/ML/env/lib/python3.6/site-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
05/12/2018 09:29:42 AM {
    "episode_cutoff_mode": 5000,
    "game": "frostbite",
    "l2coeff": 0.005,
    "model": "ModelVirtualBN",
    "mutation_power": 0.02,
    "num_test_episodes": 200,
    "num_validation_episodes": 30,
    "optimizer": {
        "args": {
            "stepsize": 0.01
        },
        "type": "adam"
    },
    "population_size": 5000,
    "return_proc_mode": "centered_rank",
    "timesteps": 250000000.0
}
05/12/2018 09:29:42 AM Logging to: /tmp/tmp9g4m8tav
Traceback (most recent call last):
  File "es.py", line 293, in <module>
    main(**exp)
  File "es.py", line 148, in main
    worker = ConcurrentWorkers(make_env, Model, batch_size=64)
  File "/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/neuroevolution/concurrent_worker.py", line 135, in __init__
    ref_batch = gym_tensorflow.get_ref_batch(make_env_f, sess, 128)
  File "/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow/__init__.py", line 18, in get_ref_batch
    env = make_env_f(1)
  File "es.py", line 147, in make_env
    return gym_tensorflow.make(game=exp["game"], batch_size=b)
  File "/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow/__init__.py", line 11, in make
    return StackFramesWrapper(atari.AtariEnv(game, batch_size, *args, **kwargs))
  File "/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow/atari/__init__.py", line 8, in __init__
    raise NotImplementedError("gym_tensorflow was not compiled with ALE support.")
NotImplementedError: gym_tensorflow was not compiled with ALE support.
(env) nostrademous@DESKTOP-J9431IB:~/ML/deep-neuroevolution/gpu_implementation$

However, when I try to enable ALE in the Makefile and make the gym I get the following errors:

(env) nostrademous@DESKTOP-J9431IB:~/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow$ make
/home/nostrademous/ML/env/lib/python3.6/site-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
/home/nostrademous/ML/env/lib/python3.6/site-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
g++ -std=c++11 -shared -fPIC -I/home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include -I/home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow/include/external/nsync/public -L/home/nostrademous/ML/env/lib/python3.6/site-packages/tensorflow -D_GLIBCXX_USE_CXX11_ABI=0 -O2 -DGOOGLE_CUDA=1 -I/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow/atari-py/atari_py/ale_interface/src -I/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow/atari-py/atari_py/ale_interface/src/controllers -I/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow/atari-py/atari_py/ale_interface/src/os_dependent -I/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow/atari-py/atari_py/ale_interface/src/environment -I/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow/atari-py/atari_py/ale_interface/src/external -L/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow/atari-py/atari_py/ale_interface/build -Wl,-rpath=/home/nostrademous/ML/deep-neuroevolution/gpu_implementation/gym_tensorflow/atari-py/atari_py/ale_interface/build .//*.cpp .//ops/*.cpp .//atari/*.cpp -ltensorflow_framework -lale -o gym_tensorflow.so
.//atari/tf_atari.cpp:3:29: fatal error: ale_interface.hpp: No such file or directory
compilation terminated.
Makefile:45: recipe for target 'gym_tensorflow.so' failed
make: *** [gym_tensorflow.so] Error 1
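
The fatal error means none of the `-I` directories on the command line contains `ale_interface.hpp`, which would be expected if the atari-py checkout is missing or has not been built in the location the Makefile points at. A small sketch for checking this from a compiler command line follows; the helper names and the temporary directory layout are illustrative assumptions, not part of the repository.

```python
import os
import shlex
import tempfile


def include_dirs(command):
    """Extract the directories passed as -I<dir> on a compiler command line."""
    return [tok[2:] for tok in shlex.split(command)
            if tok.startswith("-I") and len(tok) > 2]


def find_header(dirs, header):
    """Return the first directory that contains `header`, or None."""
    for directory in dirs:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None


# Demonstration in a throwaway directory; this layout is hypothetical.
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "ale_interface", "src")
    os.makedirs(src)
    cmd = shlex.join(["g++", "-std=c++11", "-I" + src,
                      "-I" + os.path.join(root, "missing"), "tf_atari.cpp"])
    before = find_header(include_dirs(cmd), "ale_interface.hpp")
    # Simulate the header appearing once the ALE sources are in place.
    open(os.path.join(src, "ale_interface.hpp"), "w").close()
    after = find_header(include_dirs(cmd), "ale_interface.hpp")

print(before, after is not None)
```

Running the same lookup against the real g++ line from the log would show whether any of the listed ALE directories exist on disk.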

@Alro10

Alro10 commented May 14, 2018

Hi everyone! I have the same issue. I think it depends on which version of gcc is used to build gpu_implementation.

I found the following references:

Zardinality/TF-deformable-conv#1

tensorflow/tensorflow#15002

@fps7806
Contributor

fps7806 commented May 14, 2018

The experiments we included are for the Atari games, which require ALE support; you can follow these instructions to compile. We are in the process of adding MuJoCo support, but without ALE the only environment available is the hard maze.

@ylddd

ylddd commented Jul 26, 2018

Hello, has your problem been solved? I'm having the same trouble.

@zhan0903

Hi, everyone, I met an issue: "g++: error: unrecognized command line option ‘-Wl’", any help?

@benjamin22-314

Hi, everyone, I met an issue: "g++: error: unrecognized command line option ‘-Wl’", any help?

I'm having the same issue. Did you work it out @zhan0903 ?

@benjamin22-314

Hi, everyone, I met an issue: "g++: error: unrecognized command line option ‘-Wl’", any help?

Hi @zhan0903 , I think that issue is from a typo in the 'deep-neuroevolution/gpu_implementation/gym_tensorflow/Makefile'.

line 30 is missing a ","
I think it should be
FLAGS+= -Wl,-rpath=$(ALE)/build
instead of
FLAGS+= -Wl -rpath=$(ALE)/build

@krisx0101

krisx0101 commented Jan 25, 2019

Hi, the FLAGS+= -Wl,-rpath=$(ALE)/build change does not work for me. I still encounter the same error. Have you solved this issue? @Nostrademous @fps7806 @ylddd

@matthewzar

A slight adaptation of the changes suggested by @BenjaminPhillips22 fixed it on my Linux Mint instance:
FLAGS+= -Wl,-rpath,$(ALE)/build

Notice there are no spaces, and 1 extra comma.
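
The comma matters because GCC treats `-Wl,option` as "pass option to the linker" and splits whatever follows `-Wl,` at commas. With a space instead, `-Wl` stands alone (hence the "unrecognized command line option" error) and `-rpath=...` is parsed by g++ itself rather than forwarded. The sketch below mimics that splitting rule to show what the linker receives; `linker_args` is an illustrative helper and `/opt/ale/build` is a placeholder for `$(ALE)/build`.

```python
def linker_args(option):
    """Split a GCC `-Wl,...` option into the arguments handed to the linker."""
    prefix = "-Wl,"
    if not option.startswith(prefix) or len(option) == len(prefix):
        raise ValueError("not a valid -Wl option: %r" % option)
    return option[len(prefix):].split(",")


# The `=` form reaches the linker as a single argument.
print(linker_args("-Wl,-rpath=/opt/ale/build"))
# The comma form reaches the linker as two arguments: -rpath and the directory.
print(linker_args("-Wl,-rpath,/opt/ale/build"))

# A bare -Wl (what results from writing `-Wl -rpath=...`) is rejected.
try:
    linker_args("-Wl")
except ValueError as err:
    print(err)
```

GNU ld accepts both `-rpath=dir` and `-rpath dir`, which is consistent with different posters in this thread reporting success with either spelling.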

@lisun-ai

lisun-ai commented Mar 9, 2019

Hi, everyone, I met an issue: "g++: error: unrecognized command line option ‘-Wl’", any help?

Hi @zhan0903 , I think that issue is from a typo in the 'deep-neuroevolution/gpu_implementation/gym_tensorflow/Makefile'.

line 30 is missing a ","
I think it should be
FLAGS+= -Wl,-rpath=$(ALE)/build
instead of
FLAGS+= -Wl -rpath=$(ALE)/build

Compiled successfully on Ubuntu 16.04, thanks!

@krisx0101

@Nostrademous I have changed it to "FLAGS+= -Wl,-rpath=$(ALE)/build" and gym_tensorflow now builds successfully. But have you solved the "gym_tensorflow was not compiled with ALE support" error? I have been stuck on it for a long time.

Error log:

Traceback (most recent call last):
  File "es.py", line 293, in <module>
    main(**exp)
  File "es.py", line 148, in main
    worker = ConcurrentWorkers(make_env, Model, batch_size=64)
  File "/home/shawn/workspace/test/deep-neuroevolution/gpu_implementation/neuroevolution/concurrent_worker.py", line 135, in __init__
    ref_batch = gym_tensorflow.get_ref_batch(make_env_f, sess, 128)
  File "/home/shawn/workspace/test/deep-neuroevolution/gpu_implementation/gym_tensorflow/__init__.py", line 18, in get_ref_batch
    env = make_env_f(1)
  File "es.py", line 147, in make_env
    return gym_tensorflow.make(game=exp["game"], batch_size=b)
  File "/home/shawn/workspace/test/deep-neuroevolution/gpu_implementation/gym_tensorflow/__init__.py", line 11, in make
    return StackFramesWrapper(atari.AtariEnv(game, batch_size, *args, **kwargs))
  File "/home/shawn/workspace/test/deep-neuroevolution/gpu_implementation/gym_tensorflow/atari/__init__.py", line 8, in __init__
    raise NotImplementedError("gym_tensorflow was not compiled with ALE support.")
NotImplementedError: gym_tensorflow was not compiled with ALE support.

@denis-xiao

@Nostrademous @youshaox I got the same problem "gym_tensorflow was not compiled with ALE support" error. Have you ever solved this problem?

@fps7806
Contributor

fps7806 commented Apr 26, 2019

@Nostrademous @youshaox I got the same problem "gym_tensorflow was not compiled with ALE support" error. Have you ever solved this problem?

That can be solved by enabling the USE_ALE option: https://github.com/uber-research/deep-neuroevolution/blob/master/gpu_implementation/gym_tensorflow/Makefile#L2

Instructions to use ALE are here: https://github.com/uber-research/deep-neuroevolution/tree/master/gpu_implementation/gym_tensorflow/atari

@krisx0101

krisx0101 commented Apr 26, 2019

I have already set USE_ALE=1 in the file "deep-neuroevolution/gpu_implementation/gym_tensorflow/Makefile".
USE_SDL := 0
USE_ALE := 1
USE_GPU := 1

Still, I get the above error.

Following the instructions in https://github.com/uber-research/deep-neuroevolution/tree/master/gpu_implementation/gym_tensorflow/atari:

  1. git clone https://github.com/fps7806/atari-py.git into the directory "deep-neuroevolution/gpu_implementation/gym_tensorflow".
  2. cd ./atari-py && make
  3. set USE_ALE := 1 in the file "deep-neuroevolution/gpu_implementation/gym_tensorflow/Makefile".
  4. cd ./gym_tensorflow && make
  5. python es.py configurations/es_atari_config.json

I still get the above error.

Error log:

2019-04-27 08:02:27.225223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8790 MB memory) -> physical GPU (device: 0, name: Tesla K40c, pci bus id: 0000:03:00.0, compute capability: 3.5)
Traceback (most recent call last):
  File "es.py", line 293, in <module>
    main(**exp)
  File "es.py", line 148, in main
    worker = ConcurrentWorkers(make_env, Model, batch_size=64)
  File "/home/shawn/workspace/research/deep-neuroevolution/gpu_implementation/neuroevolution/concurrent_worker.py", line 135, in __init__
    ref_batch = gym_tensorflow.get_ref_batch(make_env_f, sess, 128)
  File "/home/shawn/workspace/research/deep-neuroevolution/gpu_implementation/gym_tensorflow/__init__.py", line 18, in get_ref_batch
    env = make_env_f(1)
  File "es.py", line 147, in make_env
    return gym_tensorflow.make(game=exp["game"], batch_size=b)
  File "/home/shawn/workspace/research/deep-neuroevolution/gpu_implementation/gym_tensorflow/__init__.py", line 11, in make
    return StackFramesWrapper(atari.AtariEnv(game, batch_size, *args, **kwargs))
  File "/home/shawn/workspace/research/deep-neuroevolution/gpu_implementation/gym_tensorflow/atari/__init__.py", line 8, in __init__
    raise NotImplementedError("gym_tensorflow was not compiled with ALE support.")
NotImplementedError: gym_tensorflow was not compiled with ALE support.

@fps7806
Contributor

fps7806 commented Apr 26, 2019

I have already set USE_ALE=1 in the file "deep-neuroevolution/gpu_implementation/gym_tensorflow/Makefile".
USE_SDL := 0
USE_ALE := 1
USE_GPU := 1

Still get the above error.

Interesting. Can you try running cd ./gym_tensorflow && make clean && make?
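
One possible reason a clean rebuild helps: make only rebuilds a target when one of its listed prerequisites is newer, so if the Makefile itself is not a prerequisite of gym_tensorflow.so, flipping USE_ALE would leave the old library (built without ALE) in place. Whether that applies to this Makefile is an assumption. The sketch below shows a timestamp check that would reveal such a stale library; `is_stale` and the temporary files are illustrative only.

```python
import os
import tempfile


def is_stale(target, sources):
    """True if `target` is missing or older than any file in `sources`."""
    if not os.path.exists(target):
        return True
    built = os.path.getmtime(target)
    return any(os.path.getmtime(src) > built for src in sources)


# Demonstration with empty files and hand-set modification times.
with tempfile.TemporaryDirectory() as root:
    makefile = os.path.join(root, "Makefile")
    library = os.path.join(root, "gym_tensorflow.so")
    for path in (makefile, library):
        open(path, "w").close()
    os.utime(library, (1000, 1000))   # library built first
    os.utime(makefile, (2000, 2000))  # Makefile edited afterwards
    stale = is_stale(library, [makefile])
    os.utime(library, (3000, 3000))   # library rebuilt after the edit
    fresh = not is_stale(library, [makefile])

print(stale, fresh)
```

If the real gym_tensorflow.so turns out to be older than the edited Makefile, that would support the make clean suggestion above.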

@vijnasu

vijnasu commented Feb 20, 2020

Running python ga.py -c configurations/ga_atari_config.json -o out gives the following error. I have tried most of the suggestions discussed above.

tensorflow.python.framework.errors_impl.NotFoundError: /home/administrator/Hands-on-Neuroevolution-with-Python/Chapter10/gym_tensorflow/gym_tensorflow.so: undefined symbol: _ZN10tensorflow11ResourceMgr8DoDeleteERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt10type_indexS8_
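
The `__cxx11` fragment in the undefined symbol indicates that gym_tensorflow.so references the newer libstdc++ string type (`_GLIBCXX_USE_CXX11_ABI=1`), whereas the g++ commands earlier in this thread pass `-D_GLIBCXX_USE_CXX11_ABI=0`. A plausible cause, offered as an assumption rather than a diagnosis, is that the extension and the installed TensorFlow binary were built with different ABI settings. The sketch below flags that from a symbol name; both helpers are hypothetical. In TensorFlow versions that provide it, `tf.sysconfig.get_compile_flags()` reports the define the installed binary expects.

```python
def uses_cxx11_abi(mangled_symbol):
    """True if a mangled C++ symbol refers to libstdc++'s `std::__cxx11` types."""
    return "__cxx11" in mangled_symbol


def abi_define(use_cxx11_abi):
    """Return the libstdc++ dual-ABI preprocessor define for the given setting."""
    return "-D_GLIBCXX_USE_CXX11_ABI=%d" % (1 if use_cxx11_abi else 0)


# Undefined symbol reported in the error above.
SYMBOL = ("_ZN10tensorflow11ResourceMgr8DoDeleteERKNSt7__cxx1112basic_string"
          "IcSt11char_traitsIcESaIcEEESt10type_indexS8_")

extension_abi = uses_cxx11_abi(SYMBOL)
print("extension was compiled with", abi_define(extension_abi))
# If TensorFlow itself was built with the other setting, the symbol lookup fails.
print("mismatch if TensorFlow expects", abi_define(not extension_abi))
```

Rebuilding the extension with the define that matches the installed TensorFlow would be the natural next step to test this hypothesis.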

@thisisjasleen

Hi, everyone, I met an issue: "g++: error: unrecognized command line option ‘-Wl’", any help?

Hi @zhan0903 , I think that issue is from a typo in the 'deep-neuroevolution/gpu_implementation/gym_tensorflow/Makefile'.

line 30 is missing a ","
I think it should be
FLAGS+= -Wl,-rpath=$(ALE)/build
instead of
FLAGS+= -Wl -rpath=$(ALE)/build

I am still having some trouble with the same error. Does somebody know how to resolve it?
