[Feature request] Exit code on failure #1749

Open
tonykim-moloco opened this issue Feb 1, 2023 · 5 comments

Comments

@tonykim-moloco

Feature Request

I would like to propose a feature to expose the exit code of the chief pod via TFJob.
I know that TFJob already decides internally whether to restart, based on restartPolicy and the exit code.
Can we expose that information via TFJob? We currently want to handle restarts externally, and it is hard to get that information from the TFJob CRD.
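
To illustrate what handling the restart externally might look like once the exit code is available, here is a minimal sketch. It assumes the convention described for `restartPolicy: ExitCode` (0 is success, 1-127 is a permanent error, 128-255 is retryable). The function name `is_retryable_exit_code` is hypothetical, and this is not the operator's actual code.

```python
def is_retryable_exit_code(exit_code: int) -> bool:
    """Return True if a failed container looks worth restarting.

    Assumption: follows the ExitCode restart policy convention, where
    1-127 is treated as a permanent error and 128-255 as retryable.
    """
    return exit_code >= 128


# An external controller could branch on the result.
for code in (1, 137):
    action = "restart" if is_retryable_exit_code(code) else "give up"
    print(f"exit code {code}: {action}")
```

The missing piece is the input: without the exit code on the TFJob status, an external controller has to look it up on the pods themselves.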

@johnugeorge
Member

johnugeorge commented May 17, 2023

@tonykim-moloco What information do you need from TFJob? Can you elaborate?

@tonykim-moloco
Author

TFJob currently exposes only a summarized status: Succeeded | Created | Running | Failed.
However, in failure cases, I would like to know the exit code of the pod to understand why it failed.
For example, if a worker pod crashed due to OOM (exit code 137), I would like to look up the TFJob and see that the exit code was 137 on one of the workers.
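
For reference, the exit code can be read from the pod's own status today. The following is a minimal sketch, assuming the pod object has been fetched as JSON (for example with `kubectl get pod <name> -o json`) and parsed into a dict. The helper names and the sample pod are hypothetical. It relies only on the standard Pod fields `status.containerStatuses[].state.terminated` and `lastState.terminated`.

```python
def terminated_exit_codes(pod):
    """Return (container, exit_code, reason) for each terminated container.

    `pod` is a Pod object parsed from JSON. The current state is checked
    first, then the last state, because a restarted container reports its
    previous crash under `lastState`.
    """
    results = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        for key in ("state", "lastState"):
            term = (cs.get(key) or {}).get("terminated")
            if term is not None:
                results.append((cs["name"], term["exitCode"], term.get("reason", "")))
                break
    return results


def signal_number(exit_code):
    """Exit codes above 128 conventionally mean 'killed by signal (code - 128)'."""
    return exit_code - 128 if exit_code > 128 else None


# Hypothetical worker pod that was OOM-killed.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "tensorflow",
                "state": {"terminated": {"exitCode": 137, "reason": "OOMKilled"}},
            }
        ]
    }
}

for name, code, reason in terminated_exit_codes(sample_pod):
    print(name, code, reason, signal_number(code))
```

Exit code 137 is 128 + 9, meaning the process was killed by SIGKILL, which is what the kernel OOM killer sends. This is the per-pod information the request asks TFJob to surface.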

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@tenzen-y
Member

We can probably support this request once we introduce batch/Job.

#1718


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
