[Improvement] K8s pod OOM Killed should be identified as Application failed state #6720

Madhukar525722 · 2024-10-02T07:05:49Z

Code of Conduct

I agree to follow this project's Code of Conduct

Search before asking

I have searched in the issues and found no similar issues.

What would you like to be improved?

Currently, when a user's engine pod goes into the OOMKilled state, the session fails with Error operating LaunchEngine. Even if the user tries to reconnect with a new session, Kyuubi connects to the same old engine until the engine times out, so the error persists. This hurts the experience of users who don't have cluster visibility.

How should we improve?

The expected behaviour is that the application maps to the KILLED state instead of UNKNOWN. KILLED is eventually treated as application failed, which allows the user to reconnect with a new session.
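To make the proposed mapping concrete, here is a minimal sketch in Python. It is not Kyuubi's actual code: the `AppState` enum, the lookup table, and the `to_app_state` and `is_failed` helpers are hypothetical names chosen for illustration. The only parts taken from this issue are that an OOMKilled pod should map to KILLED rather than UNKNOWN, and that KILLED should count as a failed application.

```python
from enum import Enum


class AppState(Enum):
    """Illustrative subset of application states (hypothetical, not Kyuubi's enum)."""
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"
    KILLED = "KILLED"
    UNKNOWN = "UNKNOWN"


# Hypothetical lookup from a container state/reason string to an application state.
# "OOMKilled" is the entry this issue asks for; without it the lookup falls through
# to UNKNOWN, which is the behaviour described in the report.
_REASON_TO_STATE = {
    "Running": AppState.RUNNING,
    "Completed": AppState.FINISHED,
    "Error": AppState.FAILED,
    "OOMKilled": AppState.KILLED,
}


def to_app_state(reason: str) -> AppState:
    """Map a container state/reason to an application state, defaulting to UNKNOWN."""
    return _REASON_TO_STATE.get(reason, AppState.UNKNOWN)


def is_failed(state: AppState) -> bool:
    """Treat both FAILED and KILLED as a failed application."""
    return state in (AppState.FAILED, AppState.KILLED)
```

With a mapping like this, `is_failed(to_app_state("OOMKilled"))` is true, so a caller could discard the dead engine and launch a new one, whereas an unrecognised reason stays UNKNOWN and is not treated as failed.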

Are you willing to submit PR?

Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.
No. I cannot submit a PR at this time.

github-actions · 2024-10-02T07:06:12Z

Hello @Madhukar525722,
Thanks for finding the time to report the issue!
We really appreciate the community's efforts to improve Apache Kyuubi.


… failed state

# 🔍 Description
## Issue References 🔗

This pull request fixes #6720

## Describe Your Solution 🔧

If pod goes into OOMKilled state, application should be marked as KILLED, which is eventually identified as isFailed

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Tested locally, was able to launch new session

<img width="922" alt="kyuubi_new_session" src="https://github.com/user-attachments/assets/b003c86f-484d-40c5-b173-847374a45b1d">

---

**Be nice. Be informative.**

Closes #6721 from Madhukar525722/OOM.

Closes #6720

cd0bdf6 [madlnu] [KYUUBI #6720] K8s pod OOM Killed should be identified as Application failed state

Authored-by: madlnu <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
(cherry picked from commit 2d64255)
Signed-off-by: Cheng Pan <[email protected]>

Madhukar525722 added a commit to Madhukar525722/kyuubi that referenced this issue Oct 2, 2024

[KYUUBI apache#6720] K8s pod OOM Killed should be identified as Application failed state

cd0bdf6

Madhukar525722 mentioned this issue Oct 2, 2024

[KYUUBI #6720] K8s pod OOM Killed should be identified as Application failed state #6721

Closed

3 tasks

pan3793 closed this as completed in 2d64255 Oct 2, 2024
