bugfix: TF_CONFIG error when enable evaluator #2

Merged


@goodoid goodoid commented Oct 29, 2021

Related issue: kubeflow#1139

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/server_lib.py", line 427, in task_address
    job = self._cluster_spec[job_name]
KeyError: 'evaluator'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "code/tensorflow-fashion-mnist-sample/fashion_mnist_tf_estimator.py", line 94, in <module>
    eval_spec)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/training.py", line 464, in train_and_evaluate
    estimator, train_spec, eval_spec, _TrainingExecutor)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/estimator_training.py", line 290, in train_and_evaluate
    session_config=run_config.session_config)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/distribute_coordinator.py", line 848, in run_distribute_coordinator
    environment=environment)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/distribute_coordinator.py", line 414, in _run_std_server
    target = cluster_spec.task_address(task_type, task_id)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/server_lib.py", line 429, in task_address
    raise ValueError("No such job in cluster: %r" % job_name)
ValueError: No such job in cluster: 'evaluator'
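
For readers who want to see the failure mode without a cluster, here is a minimal sketch of the lookup that fails in the traceback above. It is a simplified stand-in written in plain Python, not TensorFlow's implementation. The `tf_config` contents, the host names, and the helpers `task_address` and `lookup_error` are hypothetical and exist only for illustration. It assumes the pod's task type is `evaluator` while the `cluster` section lists no `evaluator` job, which is what the `KeyError` suggests.

```python
import json

# Hypothetical TF_CONFIG (illustration only): the task is an evaluator,
# but the "cluster" section has no "evaluator" job.
tf_config = json.loads("""
{
  "cluster": {
    "chief": ["chief-0.example:2222"],
    "worker": ["worker-0.example:2222"],
    "ps": ["ps-0.example:2222"]
  },
  "task": {"type": "evaluator", "index": 0}
}
""")


def task_address(cluster, job_name, task_index):
    """Simplified stand-in for the job lookup shown in the traceback."""
    try:
        job = cluster[job_name]
    except KeyError:
        # Same message format as the ValueError in the traceback.
        raise ValueError("No such job in cluster: %r" % job_name)
    return job[task_index]


def lookup_error(config):
    """Return the error message for this task's lookup, or None if it succeeds."""
    task = config["task"]
    try:
        task_address(config["cluster"], task["type"], task["index"])
    except ValueError as exc:
        return str(exc)
    return None


print(lookup_error(tf_config))
```

Changing the task type to a job that is present in `cluster` (for example `worker`) makes the lookup succeed, so the error depends on how the task type and the cluster section of `TF_CONFIG` line up.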

@cheyang cheyang merged commit a8f586a into AliyunContainerService:v1.0-aliyun-branch Aug 18, 2022