[Improvement] K8s pod OOM Killed should be identified as Application failed state #6720

Closed
3 of 4 tasks
Madhukar525722 opened this issue Oct 2, 2024 · 1 comment

Comments

Contributor

Madhukar525722 commented Oct 2, 2024

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

What would you like to be improved?

Currently, when a user's engine pod enters the OOMKilled state, the session fails with "Error operating LaunchEngine". Even if the user opens a new session, Kyuubi keeps connecting to the same dead engine until the engine times out, so the error persists. This hurts the experience of users who don't have visibility into the cluster.
[Screenshot: kyuubi_oom_reconnect]
[Screenshot: kyuubi_pod_oom]
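
For readers without cluster access, it may help to see where the OOMKilled signal lives. The following is a minimal sketch, assuming a pod object shaped like the JSON returned by `kubectl get pod <name> -o json`. The helper name `terminated_reason`, the container name, and the sample pod are hypothetical; this is not Kyuubi's code.

```python
from typing import Optional


def terminated_reason(pod: dict, container_name: str) -> Optional[str]:
    """Return a container's termination reason, or None if it has not terminated.

    Reads status.containerStatuses[].state.terminated.reason, the field where
    Kubernetes reports "OOMKilled" for a container killed for exceeding its
    memory limit.
    """
    for status in pod.get("status", {}).get("containerStatuses", []):
        if status.get("name") != container_name:
            continue
        terminated = status.get("state", {}).get("terminated")
        if terminated is not None:
            return terminated.get("reason")
    return None


# Hypothetical sample: an engine pod whose driver container was OOM-killed.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "spark-kubernetes-driver",  # assumed container name
                "state": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ]
    }
}

print(terminated_reason(sample_pod, "spark-kubernetes-driver"))
```

A user with cluster access would see the same reason in the `kubectl describe pod` output; the point of this issue is that users without that access only see the launch error.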

How should we improve?

Expected behaviour: instead of being mapped to the UNKNOWN state, the application should be mapped to KILLED. KILLED is eventually treated as a failed application, which allows the user to reconnect with a new session.
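
The proposed mapping could be sketched roughly as follows. This is an illustrative sketch only: the `ApplicationState` enum, the `REASON_TO_STATE` table, and `to_application_state` are hypothetical names chosen for this example, and the reasons other than "OOMKilled" are assumptions rather than a copy of Kyuubi's actual mapping.

```python
from enum import Enum


class ApplicationState(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"
    KILLED = "KILLED"
    UNKNOWN = "UNKNOWN"


# Hypothetical mapping from a container termination reason to an application state.
REASON_TO_STATE = {
    "Completed": ApplicationState.FINISHED,
    "Error": ApplicationState.FAILED,
    "OOMKilled": ApplicationState.KILLED,  # the case this issue asks for
}


def to_application_state(reason: str) -> ApplicationState:
    """Map a termination reason to a state; unrecognized reasons stay UNKNOWN."""
    return REASON_TO_STATE.get(reason, ApplicationState.UNKNOWN)


for reason in ("Completed", "Error", "OOMKilled", "SomethingElse"):
    print(reason, "->", to_application_state(reason).name)
```

Without the "OOMKilled" entry, that reason would fall through to the default and come back as UNKNOWN, which is the behaviour described in this issue.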

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.
  • No. I cannot submit a PR at this time.

github-actions bot commented Oct 2, 2024

Hello @Madhukar525722,
Thanks for finding the time to report the issue!
We really appreciate the community's efforts to improve Apache Kyuubi.

Madhukar525722 added a commit to Madhukar525722/kyuubi that referenced this issue Oct 2, 2024
pan3793 closed this as completed in 2d64255 Oct 2, 2024
pan3793 pushed a commit that referenced this issue Oct 2, 2024
… failed state

# 🔍 Description
## Issue References 🔗

This pull request fixes #6720

## Describe Your Solution 🔧

If a pod goes into the OOMKilled state, the application should be marked as KILLED, which is eventually identified as failed (`isFailed`).
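
Why KILLED helps where UNKNOWN does not can be illustrated with a small sketch. It assumes, as the description above states, that KILLED counts as a failed state. The names `FAILED_STATES`, `is_failed`, and `should_launch_new_engine`, and the reuse policy itself, are hypothetical and only model the idea; they are not the Kyuubi implementation.

```python
from enum import Enum


class ApplicationState(Enum):
    RUNNING = "RUNNING"
    FAILED = "FAILED"
    KILLED = "KILLED"
    UNKNOWN = "UNKNOWN"


# Assumption: KILLED is treated as a failed state, UNKNOWN is not.
FAILED_STATES = frozenset({ApplicationState.FAILED, ApplicationState.KILLED})


def is_failed(state: ApplicationState) -> bool:
    """Return True when the application is known to have failed."""
    return state in FAILED_STATES


def should_launch_new_engine(state: ApplicationState) -> bool:
    """Hypothetical policy: keep reusing the existing engine unless it has failed."""
    return is_failed(state)


# An OOM-killed engine reported as UNKNOWN keeps being reused;
# reported as KILLED, it is replaced on the next session.
print(should_launch_new_engine(ApplicationState.UNKNOWN))
print(should_launch_new_engine(ApplicationState.KILLED))
```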

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Tested locally; I was able to launch a new session.
<img width="922" alt="kyuubi_new_session" src="https://github.com/user-attachments/assets/b003c86f-484d-40c5-b173-847374a45b1d">

---

**Be nice. Be informative.**

Closes #6721 from Madhukar525722/OOM.

Closes #6720

cd0bdf6 [madlnu] [KYUUBI #6720] K8s pod OOM Killed should be identified as Application failed state

Authored-by: madlnu <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
(cherry picked from commit 2d64255)
Signed-off-by: Cheng Pan <[email protected]>