Projector doesn't work if I'm storing scalars too #3686

Closed
isaacgg opened this issue May 30, 2020 · 1 comment · Fixed by #3694

Comments

isaacgg commented May 30, 2020

Environment information (required)

Windows 10.
Python 3.8.3
tensorboard 2.2.1

Please run diagnose_tensorboard.py (link below) in the same
environment from which you normally run TensorFlow/TensorBoard, and
paste the output here:

https://raw.githubusercontent.com/tensorflow/tensorboard/master/tensorboard/tools/diagnose_tensorboard.py

Diagnostics

Diagnostics output
--- check: autoidentify
INFO: diagnose_tensorboard.py version 724b56cee52e7d8eb89bbeec1f0d5ce3e38c9682

--- check: general
INFO: sys.version_info: sys.version_info(major=3, minor=8, micro=3, releaselevel='final', serial=0)
INFO: os.name: nt
INFO: os.uname(): N/A
INFO: sys.getwindowsversion(): sys.getwindowsversion(major=10, minor=0, build=18363, platform=2, service_pack='')

--- check: package_management
INFO: has conda-meta: True
INFO: $VIRTUAL_ENV: None

--- check: installed_packages
INFO: installed: tensorboard==2.2.1
INFO: installed: tensorflow==2.2.0
INFO: installed: tensorflow-estimator==2.2.0

--- check: tensorboard_python_version
INFO: tensorboard.version.VERSION: '2.2.1'

--- check: tensorflow_python_version
2020-05-30 16:34:40.538779: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-05-30 16:34:40.542551: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
INFO: tensorflow.__version__: '2.2.0'
INFO: tensorflow.__git_version__: 'v2.2.0-rc4-8-g2b96f3662b'

--- check: tensorboard_binary_path
INFO: which tensorboard: b'C:\\Users\\Isaac\\Anaconda3\\envs\\soccer_stats\\Scripts\\tensorboard.exe\r\n'

--- check: addrinfos
socket.has_ipv6 = True
socket.AF_UNSPEC = <AddressFamily.AF_UNSPEC: 0>
socket.SOCK_STREAM = <SocketKind.SOCK_STREAM: 1>
socket.AI_ADDRCONFIG = <AddressInfo.AI_ADDRCONFIG: 1024>
socket.AI_PASSIVE = <AddressInfo.AI_PASSIVE: 1>
Loopback flags: <AddressInfo.AI_ADDRCONFIG: 1024>
Loopback infos: [(<AddressFamily.AF_INET6: 23>, <SocketKind.SOCK_STREAM: 1>, 0, '', ('::1', 0, 0, 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 0, '', ('127.0.0.1', 0))]
Wildcard flags: <AddressInfo.AI_PASSIVE: 1>
Wildcard infos: [(<AddressFamily.AF_INET6: 23>, <SocketKind.SOCK_STREAM: 1>, 0, '', ('::', 0, 0, 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 0, '', ('0.0.0.0', 0))]

--- check: readable_fqdn
INFO: socket.getfqdn(): 'Ainara-Corral-XPS.home'

--- check: stat_tensorboardinfo
INFO: directory: C:\Users\Isaac\AppData\Local\Temp\.tensorboard-info
INFO: os.stat(...): os.stat_result(st_mode=16895, st_ino=2814749767408054, st_dev=3128547874, st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1590849016, st_mtime=1590849016, st_ctime=1590338411)
INFO: mode: 0o40777

--- check: source_trees_without_genfiles
INFO: tensorboard_roots (1): ['C:\\Users\\Isaac\\Anaconda3\\envs\\soccer_stats\\lib\\site-packages']; bad_roots (0): []

--- check: full_pip_freeze
INFO: pip freeze --all:
absl-py==0.9.0
alabaster==0.7.12
argh==0.26.2
asgiref==3.2.7
astroid==2.4.0
astunparse==1.6.3
atomicwrites==1.4.0
attrs==19.3.0
autopep8==1.5.1
Babel==2.8.0
backcall==0.1.0
bcrypt==3.1.7
beautifulsoup4==4.9.0
bleach==3.1.4
blis==0.4.1
cachetools==4.1.0
catalogue==1.0.0
certifi==2020.4.5.1
cffi==1.14.0
chardet==3.0.4
click==7.1.2
cloudpickle==1.4.1
colorama==0.4.3
cryptography==2.9.2
cycler==0.10.0
cymem==2.0.3
decorator==4.4.2
defusedxml==0.6.0
diff-match-patch==20181111
dill==0.3.1.1
Django==3.0.5
docutils==0.16
en-core-web-sm==2.2.5
entrypoints==0.3
flake8==3.7.9
future==0.18.2
gast==0.3.3
google-auth==1.15.0
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
googleapis-common-protos==1.51.0
grpcio==1.29.0
h5py==2.10.0
idna==2.9
imagesize==1.2.0
importlib-metadata==1.5.0
intervaltree==3.0.2
ipykernel==5.1.4
ipython==7.13.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
isort==4.3.21
jedi==0.15.2
Jinja2==2.11.2
joblib==0.15.1
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==6.1.3
jupyter-console==6.1.0
jupyter-core==4.6.3
Keras-Preprocessing==1.1.2
keyring==21.1.1
kiwisolver==1.2.0
lazy-object-proxy==1.4.3
lxml==4.5.0
Markdown==3.2.2
MarkupSafe==1.1.1
matplotlib==3.2.1
mccabe==0.6.1
mistune==0.8.4
murmurhash==1.0.2
nbconvert==5.6.1
nbformat==5.0.6
nltk==3.5
notebook==6.0.3
numpy==1.18.3
numpydoc==0.9.2
oauthlib==3.1.0
opt-einsum==3.2.1
packaging==20.3
pandas==1.0.3
pandocfilters==1.4.2
paramiko==2.7.1
parso==0.5.2
pathtools==0.1.2
pexpect==4.8.0
pickleshare==0.7.5
pip==20.0.2
plac==1.1.3
pluggy==0.13.1
preshed==3.0.2
prometheus-client==0.7.1
promise==2.3
prompt-toolkit==3.0.4
protobuf==3.11.3
psutil==5.7.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycodestyle==2.5.0
pycparser==2.20
pydocstyle==4.0.1
pyflakes==2.1.1
Pygments==2.6.1
pylint==2.5.0
pymongo==3.10.1
PyNaCl==1.3.0
pyOpenSSL==19.1.0
pyparsing==2.4.7
pyrsistent==0.16.0
PySocks==1.7.1
python-dateutil==2.8.1
python-jsonrpc-server==0.3.4
python-language-server==0.31.10
pytz==2020.1
pywin32==227
pywin32-ctypes==0.2.0
pywinpty==0.5.7
PyYAML==5.3.1
pyzmq==18.1.1
QDarkStyle==2.8.1
QtAwesome==0.7.0
qtconsole==4.7.4
QtPy==1.9.0
regex==2020.5.14
requests==2.23.0
requests-oauthlib==1.3.0
rope==0.17.0
rsa==4.0
Rtree==0.9.4
scipy==1.4.1
selenium==3.141.0
Send2Trash==1.5.0
setuptools==46.4.0.post20200518
sip==4.19.13
six==1.14.0
snowballstemmer==2.0.0
sortedcontainers==2.1.0
soupsieve==2.0
spacy==2.2.4
Sphinx==3.0.3
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==1.0.3
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.4
spyder==4.1.2
spyder-kernels==1.9.1
sqlparse==0.3.1
srsly==1.0.2
tensorboard==2.2.1
tensorboard-plugin-wit==1.6.0.post3
tensorflow==2.2.0
tensorflow-datasets==3.1.0
tensorflow-estimator==2.2.0
tensorflow-metadata==0.22.1
termcolor==1.1.0
terminado==0.8.3
testpath==0.4.4
thinc==7.4.0
toml==0.10.0
tornado==6.0.4
tqdm==4.46.0
traitlets==4.3.3
ujson==1.35
urllib3==1.25.9
wasabi==0.6.0
watchdog==0.10.2
wcwidth==0.1.9
webencodings==0.5.1
Werkzeug==1.0.1
wheel==0.34.2
widgetsnbextension==3.5.1
win-inet-pton==1.1.0
wincertstore==0.2
wrapt==1.11.2
yapf==0.28.0
zipp==3.1.0

Next steps

No action items identified. Please copy ALL of the above output,
including the lines containing only backticks, into your GitHub issue
or comment. Be sure to redact any sensitive information.

For browser-related issues, please additionally specify:

  • Browser type and version (e.g., Chrome 64.0.3282.140):
  • Screenshot, if it’s a visual issue:

Issue description

Please describe the bug as clearly as possible. How can we reproduce the
problem without additional resources (including external data files and
proprietary Python modules)?

Just as the title says, the projector only works if no scalars are being saved as well.

I'm running tensorboard with tensorboard --logdir=.

I've made a script to test this behaviour:


import os
import numpy as np
import tensorflow as tf
from tensorboard.plugins import projector

class MyClass:
    logs_dir = "./tmp/"
    metadata_file = 'metadata.tsv'

    def __init__(self):
        self.weights = tf.Variable(np.random.randn(100,10), name="embedding")
        labels = [str(i) for i in range(100)]
        self.register_embedding(self.weights, labels)
        self.register_scalar()

    def register_scalar(self):
        logdir = self.logs_dir + "/scalars/"

        writer = tf.summary.create_file_writer(logdir + "/metrics")
        with writer.as_default():
            """ variables is a list of dict like [{"loss": 0.5}, ...]"""
            tf.summary.scalar("loss", data=0.1, step=0)

    def register_embedding(self, weights, labels) -> None:
        """Saves a metadata file (labels) and a checkpoint (derived from weights)
        and configures the Embedding Projector to read from the appropriate locations.

        Args:
            weights: tf.Variable with the weights of the embedding layer to be displayed.
            labels: list of labels corresponding to the weights.
        """
        # projector only works in the root folder
        logs_dir = self.logs_dir  # + "/projector/"
        os.makedirs(logs_dir, exist_ok=True)
        embedding_fpath = os.path.join(logs_dir, "embedding.ckpt")
        # Create a checkpoint from the embedding; the filename and key are
        # the name of the tensor.
        checkpoint = tf.train.Checkpoint(embedding=weights)
        checkpoint.save(embedding_fpath)

        # Save the labels separately, one per line.
        with open(os.path.join(logs_dir, self.metadata_file), "w", encoding="utf-8") as f:
            for label in labels:
                # Write the str itself; formatting label.encode("utf-8") would
                # write the bytes repr (e.g. b'0') under Python 3.
                f.write("{}\n".format(label))

        # Set up config
        config = projector.ProjectorConfig()
        embedding = config.embeddings.add()
        # The name of the tensor will be suffixed by `/.ATTRIBUTES/VARIABLE_VALUE`
        embedding.tensor_name = "embedding/.ATTRIBUTES/VARIABLE_VALUE"
        embedding.metadata_path = self.metadata_file
        projector.visualize_embeddings(logs_dir, config)

if __name__ == "__main__":
    test = MyClass()

Just comment/uncomment the 14th line (self.register_scalar()) to test both cases.

wchargin commented Jun 2, 2020

Hi @isaacgg! Thanks so much for the clear repro script; that helps a
lot. I can reproduce this. Here’s what’s happening. To TensorBoard,
a run is a directory on disk that contains an events file: those files
with ...tfevents... in the name that contain scalar summary data.
Usually, the projector looks at all your runs to see if any of them
contain projector data. In your example, when you register_scalar(),
you create a run called tmp/scalars with the scalar data, and that
run doesn’t have projector data, because the projector data is written
just to tmp. The reason that it works when you don’t have scalar
data is that the projector plugin has a special case to also check the
root logdir if there are no runs:

# If there are no summary event files, the projector should still work,
# treating the `logdir` as the model checkpoint directory.
if not run_path_pairs:
    run_path_pairs.append((".", self.logdir))

…but as you point out this is pretty confusing and unexpected; adding
data clearly shouldn’t cause existing data to disappear.
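
To make the discontinuity concrete, here is a small standard-library model of the behaviour described above. It is only a sketch, not TensorBoard's actual run discovery: `find_runs`, `projector_search_paths`, `has_checkpoint` and `demo` are hypothetical names, the events file name is made up, and checking for a file named `checkpoint` is an assumed stand-in for the plugin's real checkpoint lookup.

```python
import os
import tempfile


def find_runs(logdir):
    """Return (run_name, path) pairs for directories that contain an events file."""
    runs = []
    for dirpath, _dirnames, filenames in os.walk(logdir):
        if any("tfevents" in name for name in filenames):
            runs.append((os.path.relpath(dirpath, logdir), dirpath))
    return sorted(runs)


def projector_search_paths(logdir):
    """Model of the special case: use the root logdir only if there are no runs."""
    run_path_pairs = find_runs(logdir)
    if not run_path_pairs:
        run_path_pairs.append((".", logdir))
    return run_path_pairs


def has_checkpoint(path):
    # Assumed stand-in for "this directory contains projector data".
    return os.path.isfile(os.path.join(path, "checkpoint"))


def visible_embeddings(logdir):
    """Names of the searched locations in which projector data is found."""
    return [name for name, path in projector_search_paths(logdir) if has_checkpoint(path)]


def demo():
    with tempfile.TemporaryDirectory() as logdir:
        # Projector data lives in the root logdir, as in the repro script.
        open(os.path.join(logdir, "checkpoint"), "w").close()
        before = visible_embeddings(logdir)

        # Writing scalars to a subdirectory creates an events file there.
        run_dir = os.path.join(logdir, "scalars", "metrics")
        os.makedirs(run_dir)
        open(os.path.join(run_dir, "events.out.tfevents.0.example"), "w").close()
        after = visible_embeddings(logdir)
    return before, after


if __name__ == "__main__":
    print(demo())  # (['.'], [])
```

In this model the root is searched only while no run exists, so adding the scalar run removes the one location that held the checkpoint.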

You can work around this by writing the projector data to the same
directory as one of your runs with metrics. (Usually, people tend to
have a run for each set of hyperparameters that they trained on, so it’s
natural to put the embeddings for a run in that run’s logdir.) But I’ll
make sure to also fix this discontinuity in the projector plugin.
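
A minimal sketch of that workaround, reusing the same calls as the repro script, might look like the following. The `write_run` helper and the `run1` directory name are hypothetical; the point is only that the events file, the checkpoint, the metadata file and the projector config all land in one run directory.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf
from tensorboard.plugins import projector


def write_run(run_dir, weights, labels):
    """Write scalar summaries and projector data into the same run directory."""
    os.makedirs(run_dir, exist_ok=True)

    # Scalars: the events file written here is what makes run_dir a run.
    writer = tf.summary.create_file_writer(run_dir)
    with writer.as_default():
        tf.summary.scalar("loss", data=0.1, step=0)
    writer.flush()

    # Projector data: the checkpoint and labels go next to the events file.
    checkpoint = tf.train.Checkpoint(embedding=weights)
    checkpoint.save(os.path.join(run_dir, "embedding.ckpt"))

    with open(os.path.join(run_dir, "metadata.tsv"), "w", encoding="utf-8") as f:
        for label in labels:
            f.write("{}\n".format(label))

    # The projector config is written into the run directory as well.
    config = projector.ProjectorConfig()
    embedding = config.embeddings.add()
    embedding.tensor_name = "embedding/.ATTRIBUTES/VARIABLE_VALUE"
    embedding.metadata_path = "metadata.tsv"
    projector.visualize_embeddings(run_dir, config)


if __name__ == "__main__":
    # Hypothetical layout: <logdir>/run1 holds both metrics and embeddings.
    logdir = tempfile.mkdtemp()
    weights = tf.Variable(np.random.randn(100, 10), name="embedding")
    write_run(os.path.join(logdir, "run1"), weights, [str(i) for i in range(100)])
    print("Now run: tensorboard --logdir=" + logdir)
```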

wchargin added a commit that referenced this issue Jun 2, 2020
Summary:
Previously, the projector plugin would look for checkpoints in the root
logdir only if there were no other logs. This led to a confusing
discontinuity: if you add logs to a subdirectory of a logdir with only
projector data, suddenly your projector data would disappear in the UI.
This patch changes the projector plugin to _always_ look in the root log
directory for checkpoints. Fixes #3686.

Test Plan:
Three cases to test:

  - checkpoints in root logdir, no other data
  - checkpoints and summary data in root logdir
  - checkpoints in root logdir, summary data in non-root run

Verify that each of these cases lets the projector plugin render and
that the data shows up only once (i.e., it’s not duplicated).

wchargin-branch: projector-always-check-root
wchargin-source: 7fa825034ad74f83d4aab2bea10d2d6cd6485663
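
The three cases in the test plan can be modelled with the standard library alone. This is a sketch of the intended behaviour, not the code of the patch: `find_runs`, `search_paths_always_root`, `touch` and `check_case` are hypothetical names, the events file name is made up, and a file named `checkpoint` is an assumed stand-in for projector data.

```python
import os
import tempfile


def find_runs(logdir):
    """Return (run_name, path) pairs for directories that contain an events file."""
    runs = []
    for dirpath, _dirnames, filenames in os.walk(logdir):
        if any("tfevents" in name for name in filenames):
            runs.append((os.path.relpath(dirpath, logdir), dirpath))
    return sorted(runs)


def search_paths_always_root(logdir):
    """Always include the root logdir, but never list it twice."""
    run_path_pairs = find_runs(logdir)
    if all(name != "." for name, _path in run_path_pairs):
        run_path_pairs.append((".", logdir))
    return run_path_pairs


def touch(path):
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()


def check_case(event_dirs):
    """Build a logdir with a root checkpoint plus events files in event_dirs."""
    with tempfile.TemporaryDirectory() as logdir:
        touch(os.path.join(logdir, "checkpoint"))
        for rel in event_dirs:
            touch(os.path.join(logdir, rel, "events.out.tfevents.0.example"))
        return [
            name
            for name, path in search_paths_always_root(logdir)
            if os.path.isfile(os.path.join(path, "checkpoint"))
        ]


if __name__ == "__main__":
    print(check_case([]))        # checkpoints in root logdir, no other data
    print(check_case(["."]))     # checkpoints and summary data in root logdir
    print(check_case(["run1"]))  # checkpoints in root, summary data in non-root run
```

Each case should report the root location exactly once, which matches the "shows up only once" requirement of the test plan.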
wchargin added a commit that referenced this issue Jun 10, 2020