[feature] python sdk should report errors in created TFJobs #1180
Comments
Issue-Label Bot is automatically applying the labels:
Please mark this comment with 👍 or 👎 to give our bot feedback! |
/assign @jinchihe
@yashjakhotiya Thanks. Do you think the user should get the error from the logs if training fails? If so, the SDK has an API
It turned out that there were a couple of bugs: 1. with
Got
This was also the reason the TFJob wouldn't stop running. Now that we have solved both, we don't need
When the training code raises an error, `tfjob_client.wait_for_job` keeps showing `Running` and then exits after some time. Instead of making the user look at Error Reporting in the Google Cloud Dashboard, the Python SDK should report these errors itself.
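To make the request concrete, here is a minimal sketch of a wait loop that surfaces a failure instead of silently timing out. It is not the SDK's implementation: `wait_and_report` and the `fetch_job` callable are hypothetical names, and the sketch assumes the job is returned as a dict whose `status.conditions` entries carry `type`, `status`, `reason`, and `message` fields.

```python
import time


def latest_condition(job):
    """Return the most recent condition whose status is "True", or None."""
    conditions = (job.get("status") or {}).get("conditions") or []
    active = [c for c in conditions if c.get("status") == "True"]
    return active[-1] if active else None


def wait_and_report(fetch_job, timeout_seconds=600, polling_interval=30,
                    sleep=time.sleep):
    """Poll a job until it succeeds; raise if it fails or the wait times out.

    fetch_job is a hypothetical zero-argument callable returning the job as
    a dict (assumed layout: job["status"]["conditions"]).
    """
    waited = 0
    while True:
        cond = latest_condition(fetch_job())
        state = cond["type"] if cond else "Unknown"
        if state == "Failed":
            # Report the failure to the caller instead of only printing state.
            raise RuntimeError(
                "TFJob failed: {} ({})".format(
                    cond.get("message", "no message"),
                    cond.get("reason", "no reason"),
                )
            )
        if state == "Succeeded":
            return cond
        if waited >= timeout_seconds:
            raise TimeoutError(
                "Timed out after {}s; last state: {}".format(waited, state)
            )
        sleep(polling_interval)
        waited += polling_interval
```

A caller would pass a function that fetches the current job object and wrap the call in `try`/`except RuntimeError` to print or log the failure message. Distinguishing a failed job from a timeout is the point of the sketch: the two cases raise different exceptions, so a job stuck in `Running` is no longer confused with one whose training code crashed.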