bugfix: TF_CONFIG error when evaluator is enabled #2
Related issue: kubeflow#1139
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/server_lib.py", line 427, in task_address
    job = self._cluster_spec[job_name]
KeyError: 'evaluator'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "code/tensorflow-fashion-mnist-sample/fashion_mnist_tf_estimator.py", line 94, in <module>
    eval_spec)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/training.py", line 464, in train_and_evaluate
    estimator, train_spec, eval_spec, _TrainingExecutor)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/estimator_training.py", line 290, in train_and_evaluate
    session_config=run_config.session_config)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/distribute_coordinator.py", line 848, in run_distribute_coordinator
    environment=environment)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/distribute_coordinator.py", line 414, in _run_std_server
    target = cluster_spec.task_address(task_type, task_id)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/server_lib.py", line 429, in task_address
    raise ValueError("No such job in cluster: %r" % job_name)
ValueError: No such job in cluster: 'evaluator'
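
The traceback suggests that the evaluator process looks up its own task type in the `cluster` section of `TF_CONFIG` and fails because no `evaluator` entry exists there. The sketch below reproduces that failure mode without TensorFlow, as a rough illustration only. `task_address` here is a simplified stand-in and not TensorFlow's implementation, `check_tf_config` is a hypothetical helper, and the host names are made up. It is not the change made in this pull request.

```python
import json


def task_address(cluster, job_name, task_index):
    """Simplified stand-in for the lookup shown in the traceback."""
    try:
        job = cluster[job_name]
    except KeyError:
        # Same message format as the ValueError in the traceback.
        raise ValueError("No such job in cluster: %r" % job_name)
    return job[task_index]


def check_tf_config(raw):
    """Resolve the address of the task described by a TF_CONFIG JSON string."""
    config = json.loads(raw)
    task = config["task"]
    return task_address(config["cluster"], task["type"], task["index"])


# Hypothetical host names, for illustration only.
# The task is an evaluator, but the cluster has no "evaluator" job.
without_evaluator = json.dumps({
    "cluster": {"chief": ["chief-0:2222"], "worker": ["worker-0:2222"]},
    "task": {"type": "evaluator", "index": 0},
})

# Same task, with an "evaluator" job present in the cluster.
with_evaluator = json.dumps({
    "cluster": {
        "chief": ["chief-0:2222"],
        "worker": ["worker-0:2222"],
        "evaluator": ["evaluator-0:2222"],
    },
    "task": {"type": "evaluator", "index": 0},
})
```

Calling `check_tf_config(without_evaluator)` raises the same `ValueError` message as above, while `check_tf_config(with_evaluator)` resolves to the evaluator's address. Whether the evaluator should be listed in `cluster` in a real job depends on the TensorFlow API in use, so treat this only as a way to inspect a generated `TF_CONFIG`.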