Fix a bunch of issues with logging. #811
I ran into these issues while trying to understand why my job was marked as failed even though there was no useful information about why the pod failed.
* Log events that indicate the exit code of pods.
* In the JSON payload, use the syntax namespace + "." + name, not namespace + "/" + name; use of a period is more consistent in K8s.
* Don't log the event "TFJob is terminated, deleting pods and services". This event ends up being triggered repeatedly: because of CleanPodPolicy, the number of completed pods is nonzero, so the event statement kept getting called. The event is also unnecessary because we will create events corresponding to the actual services/events deleted.
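To make the first two points concrete, here is a minimal Go sketch of a period-separated key and an exit-code event message. It is not the code from this PR: the `podExit` type, the helper names (`jobKey`, `exitMessage`, `logPayload`), the JSON field names, and the sample values are all hypothetical and chosen only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podExit is a hypothetical, simplified stand-in for the status an operator
// would read from a terminated pod. It is not a Kubernetes API type.
type podExit struct {
	Namespace string
	Name      string
	ExitCode  int32
}

// jobKey joins namespace and name with a period ("namespace.name")
// rather than a slash ("namespace/name").
func jobKey(namespace, name string) string {
	return namespace + "." + name
}

// exitMessage builds an event message that states the pod's exit code,
// so a failed job leaves a trace of why the pod stopped.
func exitMessage(p podExit) string {
	return fmt.Sprintf("Pod: %s exited with code %d", jobKey(p.Namespace, p.Name), p.ExitCode)
}

// logPayload renders a JSON log entry that identifies the pod by its
// period-separated key. The field names are assumptions.
func logPayload(p podExit) (string, error) {
	entry := map[string]interface{}{
		"pod":      jobKey(p.Namespace, p.Name),
		"exitCode": p.ExitCode,
	}
	b, err := json.Marshal(entry)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Sample values are made up for the example.
	p := podExit{Namespace: "kubeflow", Name: "mnist-worker-0", ExitCode: 137}
	fmt.Println(exitMessage(p))

	payload, err := logPayload(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(payload)
}
```

In a real operator the message would more likely be emitted through an event recorder than printed, but the string construction would look similar.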
/assign @gaocegege
/lgtm
Travis tests have failed
Hey @jlewi,
2nd Build: gometalinter --config=linter_config.json --vendor ./...
3rd Build: gometalinter --config=linter_config.json --vendor ./...
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: gaocegege
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/lgtm