Fix Head #293
Comments
Here's the latest postsubmit result. There are two failures, plus pylint issues.
The most recent postsubmit had a single test failure: the GPU test timed out.
It looks like the TfJob status might not be updated correctly. A job is stuck in the "creating" state even though the master has exited and the job should be marked as completed.
With the refactor to use the informer and controller classes, does TrainingJob.Reconcile get called periodically, or only in response to some event? /cc @wackxu @ScorpioCPH
@jlewi It is not called periodically, I think it is
* Add an Update function to the controller.
* The informer periodically generates an Update event but we aren't processing these because we don't have an update function.
* As a result, TrainingJob.reconcile doesn't get called periodically and we aren't properly updating the status of the job.

ref #309
ref #293
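To illustrate the commit message above, here is a minimal, self-contained sketch of why a missing update handler stops periodic reconciliation. It does not use the real informer or controller code from this repo: `handlerFuncs`, `fakeInformer`, and `countReconciles` are hypothetical stand-ins, and the resync is simulated with a plain loop rather than a timer.

```go
package main

import "fmt"

// handlerFuncs is a hypothetical stand-in for an informer's event callbacks.
// A nil callback means that kind of event is silently dropped.
type handlerFuncs struct {
	AddFunc    func(key string)
	UpdateFunc func(key string)
}

// fakeInformer is a toy informer (an assumption for illustration): it reports
// one Add for the job, then one Update per resync, whether or not the object
// actually changed.
type fakeInformer struct {
	handlers handlerFuncs
}

// run delivers the Add event followed by `resyncs` Update events.
func (i *fakeInformer) run(key string, resyncs int) {
	if i.handlers.AddFunc != nil {
		i.handlers.AddFunc(key)
	}
	for n := 0; n < resyncs; n++ {
		if i.handlers.UpdateFunc != nil {
			i.handlers.UpdateFunc(key)
		}
	}
}

// countReconciles returns how many times the (hypothetical) reconcile step
// runs for a single job across the given number of resyncs.
func countReconciles(withUpdate bool, resyncs int) int {
	calls := 0
	reconcile := func(key string) { calls++ }

	h := handlerFuncs{AddFunc: reconcile}
	if withUpdate {
		// The fix described above: also react to Update events.
		h.UpdateFunc = reconcile
	}

	inf := &fakeInformer{handlers: h}
	inf.run("default/my-job", resyncs)
	return calls
}

func main() {
	fmt.Println("reconciles without Update handler:", countReconciles(false, 3))
	fmt.Println("reconciles with Update handler:", countReconciles(true, 3))
}
```

In this sketch, three resyncs produce a single reconcile when only the add handler is registered, and four once the update handler is wired in. That mirrors the symptom in this issue: a job whose master exits after creation never gets its status re-evaluated unless Update events are handled.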
Head is now fixed. Here's the latest passing postsubmit.
Uber bug for fixing head.
A lot of bugs crept in during the refactor because of #280, which meant that jobs which failed were actually reported as successful.