If I want to run distributed training with TFJob, what changes do I need to make in my code (say train.py)?
If I understand correctly, I still need to set up the ClusterSpec, right? Thanks.
The correct ClusterSpec is provided to each replica as part of the TF_CONFIG environment variable. The code changes you need to make depend on which TensorFlow APIs you are using.
Higher-level APIs like tf.estimator.Estimator, I believe, automatically check the TF_CONFIG environment variable for the cluster spec, so you might not need to do any work.
If you are constructing TensorFlow servers manually, then you might have to read the cluster spec from TF_CONFIG and pass it through yourself.
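For the manual case, here is a minimal sketch of reading the cluster spec out of TF_CONFIG. It assumes the usual JSON layout, with a "cluster" map from job name to host list and a "task" entry holding "type" and "index". The helper name parse_tf_config is made up for illustration and is not part of TFJob or TensorFlow.

```python
import json
import os


def parse_tf_config(environ=None):
    """Return (cluster, task_type, task_index) parsed from TF_CONFIG.

    Hypothetical helper. Assumes TF_CONFIG looks like:
      {"cluster": {"ps": [...], "worker": [...]},
       "task": {"type": "worker", "index": 0}}
    """
    environ = os.environ if environ is None else environ
    # An unset or empty TF_CONFIG is treated as a non-distributed run.
    tf_config = json.loads(environ.get("TF_CONFIG") or "{}")
    cluster = tf_config.get("cluster", {})
    task = tf_config.get("task", {})
    return cluster, task.get("type"), int(task.get("index", 0))


# In train.py, with the TF 1.x API, the values could be passed through
# roughly like this (not executed here, since it needs TensorFlow):
#
#   cluster, job_name, task_index = parse_tf_config()
#   server = tf.train.Server(tf.train.ClusterSpec(cluster),
#                            job_name=job_name, task_index=task_index)
```

When TF_CONFIG is missing, the helper returns an empty cluster, so the same train.py can still be run locally on a single machine.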
For one such example, look at the tf-cnn example. It uses one of the TensorFlow models published by the TensorFlow team, and that model uses low-level APIs that depend on a variety of flags being set to configure the job for distributed processing. So in this case we have a launcher script which parses TF_CONFIG to determine the values to set for the flags and then invokes the binary.
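A launcher along those lines could be sketched as below. This is not the actual tf-cnn launcher. The flag names (--job_name, --task_index, --ps_hosts, --worker_hosts) and the default binary train.py are assumptions for illustration, so substitute whatever flags your training script really accepts.

```python
import json


def build_launch_command(tf_config_json, binary="train.py"):
    """Translate a TF_CONFIG JSON string into command-line flags.

    Hypothetical sketch: the flag names below are assumed, not taken
    from the tf-cnn example.
    """
    tf_config = json.loads(tf_config_json or "{}")
    cluster = tf_config.get("cluster", {})
    task = tf_config.get("task", {})

    command = ["python", binary]
    if not cluster:
        # No cluster information: run the binary without distributed flags.
        return command

    command += [
        "--job_name=" + str(task.get("type", "")),
        "--task_index=" + str(task.get("index", 0)),
        # Host lists are joined with commas, one flag per job type.
        "--ps_hosts=" + ",".join(cluster.get("ps", [])),
        "--worker_hosts=" + ",".join(cluster.get("worker", [])),
    ]
    return command
```

The launcher's entry point would then call something like subprocess.check_call(build_launch_command(os.environ.get("TF_CONFIG"))), so train.py itself does not need to know about TF_CONFIG.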