Cannot launch TensorBoard from source due to debugger plugin #431

wchargin · 2017-08-28T02:25:10Z

TensorBoard master, with TensorFlow 1.3.0 from pip, cannot run: it fails to import a Python library related to gRPC.

The error is:

Traceback (most recent call last):
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/main.py", line 38, in <module>
    from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_plugin.py", line 35, in <module>
    from tensorboard.plugins.debugger import debugger_server_lib
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_server_lib.py", line 33, in <module>
    from tensorflow.python.debug.lib import grpc_debug_server
ImportError: cannot import name grpc_debug_server

The first bad commit is (unsurprisingly) a856e61, which I identified by using git bisect with the following script:

#!/bin/bash
# "!" inverts grep's status: exit 0 (good) unless the import error appears in the output.
! bazel run tensorboard 2>&1 | grep -F 'cannot import name grpc_debug_server'

Steps to reproduce:

$ virtualenv /tmp/tensorflow-1.3.0-fresh
$ source /tmp/tensorflow-1.3.0-fresh/bin/activate
$ pip install tensorflow==1.3.0
$ git checkout b1a4d2586a0eae1ce7f3a18b4db188b62c4daaee  # current origin/master
$ bazel run tensorboard -- --logdir /tmp/data

The following patch fixes the problem:

diff --git a/tensorboard/main.py b/tensorboard/main.py
index ec84e25..fb5d2cd 100644
--- a/tensorboard/main.py
+++ b/tensorboard/main.py
@@ -35,7 +35,7 @@ from tensorboard.backend import application
 from tensorboard.backend.event_processing import event_file_inspector as efi
 from tensorboard.plugins.audio import audio_plugin
 from tensorboard.plugins.core import core_plugin
-from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
+#from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
 from tensorboard.plugins.distribution import distributions_plugin
 from tensorboard.plugins.graph import graphs_plugin
 from tensorboard.plugins.histogram import histograms_plugin
@@ -240,11 +240,12 @@ def main(unused_argv=None):
     efi.inspect(FLAGS.logdir, event_file, FLAGS.tag)
     return 0
   else:
-    def ConstructDebuggerPluginWithGrpcPort(context):
-      debugger_plugin = debugger_plugin_lib.DebuggerPlugin(context)
-      if FLAGS.debugger_data_server_grpc_port is not None:
-        debugger_plugin.listen(FLAGS.debugger_data_server_grpc_port)
-      return debugger_plugin
+    pass
+    #def ConstructDebuggerPluginWithGrpcPort(context):
+    #  debugger_plugin = debugger_plugin_lib.DebuggerPlugin(context)
+    #  if FLAGS.debugger_data_server_grpc_port is not None:
+    #    debugger_plugin.listen(FLAGS.debugger_data_server_grpc_port)
+    #  return debugger_plugin
 
     plugins = [
         core_plugin.CorePlugin,
@@ -258,7 +259,7 @@ def main(unused_argv=None):
         projector_plugin.ProjectorPlugin,
         text_plugin.TextPlugin,
         profile_plugin.ProfilePlugin,
-        ConstructDebuggerPluginWithGrpcPort,
+        #ConstructDebuggerPluginWithGrpcPort,
     ]
 
     tb = create_tb_app(plugins)

Versions:

$ bazel version
Build label: 0.5.4
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Aug 25 10:00:00 2017 (1503655200)
Build timestamp: 1503655200
Build timestamp as int: 1503655200
$ pip --version
pip 9.0.1 from /tmp/tensorflow-1.3.0-fresh/local/lib/python2.7/site-packages (python 2.7)
$ lsb_release -a
No LSB modules are available.
Distributor ID:	LinuxMint
Description:	Linux Mint 18.2 Sonya
Release:	18.2
Codename:	sonya


wchargin · 2017-08-28T02:25:38Z

@chihuahua

chihuahua · 2017-08-28T09:33:18Z

Hmm, I'm trying to repro. I ran

pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.3.0rc2-cp27-none-linux_x86_64.whl

and TensorBoard at master HEAD seems to run fine.

chihuahua · 2017-08-28T09:36:55Z

One thing to note: TensorBoard used to fail to start for me, but I fixed it by pip installing grpcio. However, the error I got from that looked different.

INFO: Running command line: bazel-bin/tensorboard/tensorboard '--logdir=~/Desktop/pr_curve_demo'
Traceback (most recent call last):
  File "/private/var/tmp/_bazel_chizeng/1b1399fef0aaaae96df4708880f141bb/execroot/org_tensorflow_tensorboard/bazel-out/darwin_x86_64-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/main.py", line 38, in <module>
    from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
  File "/private/var/tmp/_bazel_chizeng/1b1399fef0aaaae96df4708880f141bb/execroot/org_tensorflow_tensorboard/bazel-out/darwin_x86_64-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_plugin.py", line 35, in <module>
    from tensorboard.plugins.debugger import debugger_server_lib
  File "/private/var/tmp/_bazel_chizeng/1b1399fef0aaaae96df4708880f141bb/execroot/org_tensorflow_tensorboard/bazel-out/darwin_x86_64-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_server_lib.py", line 33, in <module>
    from tensorflow.python.debug.lib import grpc_debug_server
  File "/Users/chizeng/anaconda/lib/python3.6/site-packages/tensorflow/python/debug/lib/grpc_debug_server.py", line 27, in <module>
    import grpc
ModuleNotFoundError: No module named 'grpc'

The error you noted seems to instead indicate that the grpc_debug_server module is unavailable.

wchargin · 2017-08-28T12:36:24Z

Using 1.3.0rc2 instead of 1.3.0, with the link that you provided, does not fix the problem.

Additionally installing grpcio does not fix the problem.

In my site packages, the tensorflow.python.debug.lib package contains no file grpc_debug_server.py, so it is no wonder that the import fails. You don't seem to have this problem: could you please post your output for

from tensorflow.python.debug.lib import grpc_debug_server
print(grpc_debug_server.__file__)

Note that this file does exist in nightly TensorFlow. However, (a) I'd thought that we no longer wanted to depend on nightly since the 1.3 release (correct me if wrong?), and (b) the import still fails because a transitive dependency is missing: if I write

$ virtualenv /tmp/tensorflow-nightly-20170828
$ source /tmp/tensorflow-nightly-20170828/bin/activate
$ pip install 'https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-1.3.0-cp27-none-linux_x86_64.whl'
$ bazel run tensorboard

then the error is

Traceback (most recent call last):
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/main.py", line 38, in <module>
    from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_plugin.py", line 35, in <module>
    from tensorboard.plugins.debugger import debugger_server_lib
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_server_lib.py", line 33, in <module>
    from tensorflow.python.debug.lib import grpc_debug_server
  File "/tmp/tensorflow-nightly-20170828/local/lib/python2.7/site-packages/tensorflow/python/debug/lib/grpc_debug_server.py", line 26, in <module>
    from concurrent import futures
ImportError: No module named concurrent

wchargin · 2017-08-28T12:39:19Z

To summarize, the only configuration that I have found to work is to install both TensorFlow nightly and the separate grpcio package, which provides the concurrent package. The former might be acceptable, but the latter isn't and should be fixed.
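
For anyone who wants to check their own environment, here is a minimal diagnostic sketch. It is not part of TensorBoard: the helper name `find_missing_modules` is made up for illustration, and the module list is simply read off the tracebacks in this thread rather than taken from any official dependency manifest. It tries each import in turn and reports the ones that fail, so it should be run inside the same virtualenv that `bazel run` picks up.

```python
import importlib

# Modules that the tracebacks in this thread show being imported on the way
# to the debugger plugin (an assumption based on those tracebacks only).
MODULES_TO_CHECK = [
    "concurrent.futures",
    "grpc",
    "tensorflow.python.debug.lib.grpc_debug_server",
]


def find_missing_modules(module_names):
    """Return a list of (module_name, error_description) for failed imports."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as e:  # report any failure raised during import
            missing.append((name, "%s: %s" % (type(e).__name__, e)))
    return missing


if __name__ == "__main__":
    for name, error in find_missing_modules(MODULES_TO_CHECK):
        print("MISSING %s (%s)" % (name, error))
```

An empty result means all three imports succeeded; it says nothing about whether the versions are mutually compatible.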

ioeric · 2017-08-28T14:59:21Z

FYI, I ran into the same problem, and I did pip install grpc which seemed to fix the problem.

caisq · 2017-08-28T15:40:59Z

I think this may have to do with the recent update in the tensorboard version that tensorflow 1.3.0 depends on. The new version includes the PR that open-sourced plugin/debugger: #310.

But plugin/debugger depends on grpc_debug_server, which is not available in tensorflow 1.3.0. It is available in tensorflow HEAD, though.

So we have a few options:

- Put out a patch release of tensorboard with the PR reverted.
- Put out a patch release of tensorflow with the grpc_debug_server cherry picked.

@jart

caisq · 2017-08-28T20:09:51Z

@wchargin, I may have misunderstood the issue in my previous comment. Now I realize that the issue happens only for developers working at tensorboard master HEAD. For this developer workflow, the way to resolve this issue is to install the nightly tensorflow, instead of tensorflow 1.3.0. tensorflow 1.3.0 doesn't have the grpc_debug_server. The nightly install instructions can be found here:
https://github.com/tensorflow/tensorflow#installation

Note that the Travis testing we have is performed against nightly tensorflow, not latest-release tensorflow.

luchensk · 2017-08-29T04:53:12Z

I also hit this issue before and fixed it by using the master branch of tensorflow, as @caisq said above.

luchensk · 2017-08-29T05:02:26Z

BTW, if you work on macOS, please refer to tensorflow/tensorflow#12123, which includes a workaround for compiling tensorflow on macOS: replace -Werror with -Wno-excessive-errors in add_boringssl_s390x.patch.

RenatoUtsch · 2017-09-01T15:03:23Z

Just update to Bazel 0.5.4; the -Werror hack is no longer needed.

wchargin · 2017-10-03T01:23:02Z

Bump: this issue continues to occur on a fresh clone (repro below), and using TF nightly does not fix the issue. @caisq

Here is a revised repro script:

#!/bin/sh
set -eux
tmpdir="$(mktemp -d --suffix _tensorflow)"
virtualenv "${tmpdir}"
. "${tmpdir}/bin/activate"
pip install 'https://ci.tensorflow.org/view/tf-nightly/job/tf-nightly-linux/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/52/artifact/pip_test/whl/tf_nightly-1.head-cp27-none-linux_x86_64.whl'
# pip install futures
# pip install grpc
bazel build //tensorboard
./bazel-bin/tensorboard/tensorboard --logdir ~/data/

This yields:

Traceback (most recent call last):
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/main.py", line 38, in <module>
    from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_plugin.py", line 35, in <module>
    from tensorboard.plugins.debugger import debugger_server_lib
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_server_lib.py", line 33, in <module>
    from tensorflow.python.debug.lib import grpc_debug_server
  File "/tmp/tmp.xP0p6ZLUpx_tensorflow/local/lib/python2.7/site-packages/tensorflow/python/debug/lib/grpc_debug_server.py", line 26, in <module>
    from concurrent import futures
ImportError: No module named concurrent

Uncommenting the first commented line yields:

Traceback (most recent call last):
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/main.py", line 38, in <module>
    from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_plugin.py", line 35, in <module>
    from tensorboard.plugins.debugger import debugger_server_lib
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_server_lib.py", line 33, in <module>
    from tensorflow.python.debug.lib import grpc_debug_server
  File "/tmp/tmp.2juuukpm8w_tensorflow/local/lib/python2.7/site-packages/tensorflow/python/debug/lib/grpc_debug_server.py", line 27, in <module>
    import grpc
ImportError: No module named grpc

Uncommenting the second commented line as well works, although there is still a spurious log entry:

Import grpc:No module named gevent.socket

Note that I've had to go back to TensorFlow build 52 because of a regression introduced recently (#595 (comment)).

Surely this must be fixed. We have dependencies that we are failing to express; I just don't know what the right place to put them is. cc @jart

jart · 2017-10-03T01:38:02Z

It's assumed that, when working from source, you'll pip install futures and grpcio manually into your virtualenv, because it's nontrivial to express them in our Bazel build.

It's hard to integrate futures because, in the pip world, installing that package on Python3 is treated as a no-op. I'm not quite certain how to express that in a Bazel build. Integrating grpcio would require a lot of BUILD configuration and a lot of time spent compiling on Travis. It's not a beautiful thing.
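
On the pip side, at least, the Python 2-only nature of futures can be spelled out as an explicit condition. The following is a minimal sketch under assumptions: `runtime_requirements` is a hypothetical helper, not TensorBoard's actual setup.py, and version pins are deliberately left out. It does nothing for the Bazel question.

```python
import sys


def runtime_requirements(python_major=sys.version_info[0]):
    """Return pip requirement names for the given Python major version."""
    requirements = ["grpcio"]
    if python_major < 3:
        # concurrent.futures is in the standard library from Python 3.2 on,
        # so the 'futures' backport is only needed on Python 2.
        requirements.append("futures")
    return requirements


print(runtime_requirements(2))  # ['grpcio', 'futures']
print(runtime_requirements(3))  # ['grpcio']
```

The same condition can also be written declaratively with an environment marker such as `futures; python_version < "3"`, if the packaging tools in use support markers.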

I will however note that I've encountered some other strange errors relating to the debugger plugin and grpc. Please see this comment. It seems like our Travis build might be broken and I'm not sure why.

wchargin · 2017-10-03T01:48:52Z

@jart: Thanks for the summary. That's quite unfortunate. I'll add that to DEVELOPMENT.md, but I propose that this issue remain open: if we have some opportunity to fix it (a fixit day, or someone just feels like it some time), then that will be nice.

I linked to that comment of yours near the end of my comment; I can reproduce the issues when using TensorFlow nightly, and I have not found a resolution (though I have not looked too deeply, either).

caisq · 2017-10-03T01:51:16Z

@wchargin, @jart: futures and grpcio are listed as dependencies of the tensorboard pip package in setup.py. Of course, setup.py does not affect bazel runs, which is why @wchargin sees the ImportErrors. The ImportErrors do not occur when the pip package is built and installed in a virtualenv.

As for the weird issue that @jart mentioned, I just ran bazel test tensorboard/... on my machine in a virtualenv with futures and grpcio installed. I saw some breakage related to SummaryMetadata, but not the one that @jart pasted:

AttributeError: 'SymbolDatabase' object has no attribute 'RegisterServiceDescriptor'

@jart, can you let me know which test shows this particular error?

wchargin · 2017-10-03T01:53:48Z

@caisq I can reproduce @jart's exact error with the script in #431 (comment), by changing the build number from 52 to 56. Moreover, 56 is the earliest bad build. Simply running bazel run tensorboard triggers the error.

That is, the following script reproduces:

#!/bin/sh
set -eux
tmpdir="$(mktemp -d --suffix _tensorflow)"
virtualenv "${tmpdir}"
. "${tmpdir}/bin/activate"
pip install 'https://ci.tensorflow.org/view/tf-nightly/job/tf-nightly-linux/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/56/artifact/pip_test/whl/tf_nightly-1.head-cp27-none-linux_x86_64.whl'
pip install futures
pip install grpc
bazel build //tensorboard
./bazel-bin/tensorboard/tensorboard --logdir ~/data/

wchargin · 2017-10-03T01:54:30Z

Here's the commit diff from 54→56 (there is no build 55); one of these changes causes the regression: tensorflow/tensorflow@e3ceea3...64f0ebd

chihuahua · 2017-10-03T01:54:54Z

Have we tried changing the version of protobuf?
googleapis/google-cloud-python#3967

I think I've seen that AttributeError before while using TensorFlow, and I resolved it by installing protobuf 3.1.0. https://www.tensorflow.org/versions/r0.12/get_started/os_setup#protobuf_library_related_issues

wchargin · 2017-10-03T01:56:37Z

@chihuahua downgrading protobuf from 3.4.0 to 3.1.0 does not fix the issue.

wchargin · 2017-10-03T02:00:15Z

I observe the following commit in the list: "Update protobuf to 3.4.1" (tensorflow/tensorflow@d16262d). It seems probable that this is related.
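
One quick way to see what the installed protobuf runtime offers is to look for the attribute named in the AttributeError. This is a small sketch, not a fix: `check_symbol_database` is a made-up helper name, and it only reports what is installed in the current environment.

```python
def check_symbol_database():
    """Report the protobuf version and whether the default SymbolDatabase
    has the RegisterServiceDescriptor attribute from the AttributeError.

    Returns None if protobuf is not importable, else (version, has_attr).
    """
    try:
        import google.protobuf
        from google.protobuf import symbol_database
    except ImportError:
        return None
    database = symbol_database.Default()
    return (google.protobuf.__version__,
            hasattr(database, "RegisterServiceDescriptor"))


print(check_symbol_database())
```

If the second element is False in the virtualenv that triggers the error, the protobuf runtime there is likely older than what the generated gRPC code expects.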

caisq · 2017-10-03T02:45:52Z

I have some rough ideas of what might be the cause and how to fix it from the tensorflow side. Will give it a shot tomorrow.

jart · 2017-10-03T02:58:54Z

Upgrading grpc and protobuf doesn't fix the issue either. How stable is grpc? I'm concerned that issues like these could cause problems for TensorBoard and TensorFlow users if we make it a dependency. Should we rework the debugger code so that it can survive if importing grpc fails? Then have an "inactive plugin" page that tells the user to pip install grpc if he/she wants to use it?
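
Roughly, the "survive a failed import" idea could look like the following. This is only a sketch: `try_import` and `plugin_status` are hypothetical names, not existing TensorBoard functions, and the real plugin wiring would certainly differ.

```python
import importlib


def try_import(module_name):
    """Return (module, None) on success or (None, error_message) on failure."""
    try:
        return importlib.import_module(module_name), None
    except ImportError as e:
        return None, str(e)


def plugin_status(module_name, install_hint):
    """Describe whether a plugin's optional dependency is available."""
    module, error = try_import(module_name)
    if module is None:
        # This message stands in for the proposed "inactive plugin" page.
        return "inactive: %s (try: %s)" % (error, install_hint)
    return "active"


print(plugin_status("grpc", "pip install grpcio"))
```

The point of the design is that the import failure is caught once, at plugin construction time, so the rest of TensorBoard starts normally and only the debugger tab reports that it is inactive.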

caisq · 2017-10-03T03:04:56Z

@jart That sounds good to me, too. I will look into that rework.

caisq · 2018-05-24T16:11:24Z

This issue is obsolete now. Closing it.

wchargin added the core:backend label Aug 28, 2017

This was referenced Sep 1, 2017

ImportError: cannot import name 'grpc_debug_server' tensorflow/tensorboard-plugin-example#10

Closed

Note how we develop on TF nightly #486

Merged

caisq closed this as completed May 24, 2018
