Set state to failed if there is a problem initializing job #219
* Clean up the code; there's no longer a reason to have a separate function triggerCreatePhase.
* Refactor setup to ensure we always set phase and state.
* We should always call updateTPRStatus after setup is called.
* Before, the update was called in two different places (e.g. triggerCreatePhase) depending on the state.
* Call reconcile before the loop and inside it.
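The second and third bullets can be illustrated with a minimal sketch. It is not the code from this PR: the `Job` and `Status` types, the phase and state values, and the `checkSpec`, `updateStatus`, and `initialize` helpers are hypothetical stand-ins (`updateStatus` plays the role of `updateTPRStatus`). The point is only that every return path of `setup` sets both phase and state, and that the status is written back in exactly one place afterwards.

```go
package main

import (
	"errors"
	"fmt"
)

// Phase and State are hypothetical stand-ins for the job's phase and state
// fields; the values are illustrative, not the project's constants.
type Phase string
type State string

const (
	PhaseCreating Phase = "Creating"
	PhaseFailed   Phase = "Failed"

	StateRunning State = "Running"
	StateFailed  State = "Failed"
)

// Status is the part of the job that gets written back to the API server.
type Status struct {
	Phase  Phase
	State  State
	Reason string
}

// Job is a simplified, hypothetical training job.
type Job struct {
	Name      string
	HasSpec   bool   // false simulates a job created without a spec
	Status    Status // in-memory status
	Persisted Status // what the (fake) API server has stored
}

// checkSpec is an assumed validation step that can fail during setup.
func (j *Job) checkSpec() error {
	if !j.HasSpec {
		return errors.New("job has no spec")
	}
	return nil
}

// setup initializes the job. Every return path sets both Phase and State,
// so a failure is recorded in the status instead of only being returned.
func (j *Job) setup() {
	if err := j.checkSpec(); err != nil {
		j.Status = Status{
			Phase:  PhaseFailed,
			State:  StateFailed,
			Reason: fmt.Sprintf("setup failed: %v", err),
		}
		return
	}
	j.Status = Status{Phase: PhaseCreating, State: StateRunning}
}

// updateStatus stands in for updateTPRStatus; here it only copies the
// in-memory status into the fake persisted copy.
func (j *Job) updateStatus() error {
	j.Persisted = j.Status
	return nil
}

// initialize runs setup and then unconditionally persists the status,
// so there is a single update call regardless of the outcome.
func (j *Job) initialize() error {
	j.setup()
	return j.updateStatus()
}
```

Calling `initialize` on a job with `HasSpec: false` leaves the persisted status in the failed phase and state with a reason attached, which is the behavior the PR title describes.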
@wackxu could you please review this?
pkg/trainer/training.go
err := j.job.Spec.SetDefaults()
if err != nil {
	return fmt.Errorf("there was a problem setting defaults for job spec: %v", err)
	j.status.SetReason("Internal error; tried to setup a job with no spec.")
Maybe "tried to" --> "failed to" would be better.
Done.
// Make sure the runtime id is set.
if job.job.Spec.RuntimeId == "" {
if job.status.Phase != c.expectPhase {
	t.Errorf("job.job.Status.Phase Want: %v Got:%v ", c.expectPhase, job.status.Phase)
cool, i like this test style
if err := j.updateTPRStatus(); err != nil {
	log.Warningf("Job %v; failed to update TPR status error: %v", j.job.Metadata.Name, err)
}
j.reconcile(config)
If I understand this code, it seems we will do this every 8 minutes. Maybe we only need to do setup when the job is created and the phase is none.
The reconcile function is the code that periodically checks the job and figures out what needs to be done. The reconcile function takes advantage of state (e.g. phase) preserved in the CRD to avoid repeating unnecessary steps like setup.
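As a rough illustration of that point, here is a minimal sketch of a phase-driven reconcile. It is not the code from this PR: the `Job` type, the phase values, the `SetupCalls` counter, and the `run` helper (which uses a tick count in place of a real ticker) are all hypothetical. The `Phase` field plays the role of the state preserved in the CRD.

```go
package main

import "fmt"

// Phase is a hypothetical stand-in for the phase stored in the CRD.
type Phase string

const (
	PhaseNone     Phase = ""
	PhaseCreating Phase = "Creating"
	PhaseRunning  Phase = "Running"
)

// Job is a simplified, hypothetical training job.
type Job struct {
	Name       string
	Phase      Phase
	SetupCalls int      // counts how often setup actually ran
	Log        []string // records what each reconcile pass did
}

// setup does the one-time initialization and advances the phase.
func (j *Job) setup() {
	j.SetupCalls++
	j.Phase = PhaseCreating
}

// reconcile looks at the preserved phase and performs only the step that
// is still outstanding, so repeated calls do not repeat setup.
func (j *Job) reconcile() {
	switch j.Phase {
	case PhaseNone:
		j.setup()
		j.Log = append(j.Log, fmt.Sprintf("%s: ran setup", j.Name))
	case PhaseCreating:
		// Assume resource creation succeeds in this sketch.
		j.Phase = PhaseRunning
		j.Log = append(j.Log, fmt.Sprintf("%s: created resources", j.Name))
	case PhaseRunning:
		j.Log = append(j.Log, fmt.Sprintf("%s: nothing to do", j.Name))
	}
}

// run calls reconcile once before the loop and once per tick inside it.
// The tick count replaces a real timer so the sketch stays deterministic.
func (j *Job) run(ticks int) {
	j.reconcile()
	for i := 0; i < ticks; i++ {
		j.reconcile()
	}
}
```

With `j := &Job{Name: "demo"}` and `j.run(3)`, reconcile runs four times but `setup` runs only on the first pass, because the later passes see a phase other than none.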
Thanks. PTAL.
This is a big enhancement to the code, LGTM.