v1alpha2 doesn't work with tf.estimator for TF <= 1.6; need to add environment: cloud to TF_CONFIG #761

Closed
jlewi opened this issue Jul 28, 2018 · 1 comment

jlewi commented Jul 28, 2018

See kubeflow/kubeflow#1283

On TF <= 1.6, if you try to use the Estimator API with a master, the job hangs. The error is the same as in the issue linked above:

"tensorflow:Waiting for model to be ready.  Ready_for_local_init_op:  Variables not initialized:"

The TF_CONFIG looks like this:

TF_CONFIG={"cluster":{"master":["trainer-tfjob2-master-0.default.svc.cluster.local:2222"]},"task":{"type":"master","index":0}}

It's missing the environment field, which TF 1.6 uses. See here:
https://github.com/tensorflow/tensorflow/blob/r1.6/tensorflow/contrib/learn/python/learn/estimators/run_config.py#L146
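
As a rough illustration of the change the title proposes, the sketch below adds "environment": "cloud" to the TF_CONFIG shown above. The helper name add_cloud_environment is hypothetical and not part of tf-operator or TensorFlow; it only shows the shape of the JSON that TF <= 1.6 would presumably need, using the standard library.

```python
import json


def add_cloud_environment(tf_config_json):
    """Return TF_CONFIG JSON with "environment": "cloud" set if it is absent.

    Hypothetical helper for illustration only; the real fix would live in the
    operator code that generates TF_CONFIG.
    """
    config = json.loads(tf_config_json)
    # Leave an existing value alone; only fill in the missing key.
    config.setdefault("environment", "cloud")
    return json.dumps(config, sort_keys=True)


# The TF_CONFIG value reported in this issue.
original = (
    '{"cluster":{"master":["trainer-tfjob2-master-0.default.svc.cluster.local:2222"]},'
    '"task":{"type":"master","index":0}}'
)

patched = add_cloud_environment(original)
print(patched)
```

The cluster and task entries are left untouched; only the top-level environment key is added.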

TF 1.8 doesn't use environment, but it looks like it expects the chief task type instead of master.
https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/estimator/run_config.py#L489
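
If TF 1.8 does expect a chief entry, as the comment above suggests, the equivalent TF_CONFIG would name the replica chief rather than master. The sketch below is a hypothetical helper (master_to_chief is not an existing API) that renames the key in a TF_CONFIG string. It is an assumption about what the config would need to look like, not a confirmed fix.

```python
import json


def master_to_chief(tf_config_json):
    """Rename the "master" replica to "chief" in a TF_CONFIG JSON string.

    Hypothetical helper for illustration; assumes the newer Estimator code
    wants a "chief" task type, which this issue only suspects.
    """
    config = json.loads(tf_config_json)
    cluster = config.get("cluster", {})
    if "master" in cluster and "chief" not in cluster:
        # Move the host list under the new key.
        cluster["chief"] = cluster.pop("master")
        task = config.get("task", {})
        if task.get("type") == "master":
            task["type"] = "chief"
    return json.dumps(config, sort_keys=True)


example = (
    '{"cluster":{"master":["trainer-tfjob2-master-0.default.svc.cluster.local:2222"]},'
    '"task":{"type":"master","index":0}}'
)

converted = master_to_chief(example)
print(converted)
```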

See the tf-operator code here:
https://github.com/kubeflow/tf-operator/blob/master/pkg/trainer/replicas.go#L72
