Add alert rules for load #201

Open
facundofc opened this issue Nov 12, 2024 · 0 comments
Enhancement Proposal

This is much like what was discussed in #12, but I couldn't find a way to reopen that issue.

We are having an issue in one of our k8s clusters in which processes in the D state (uninterruptible sleep) begin to accumulate, rendering pods and other services running on the affected worker unresponsive. We didn't get any HostHigh*WaitingTime alert, because those rules don't actually fire in this situation. However, the worker's load does spike abruptly. I'm attaching screenshots of the situation.

It seems that the HostHigh*WaitingTime alert rules alone are not enough to cover every scenario. We might need to monitor load after all.

[Screenshots attached: io_waiting_time, cpu_waiting_time, load1]
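
To make the proposal concrete, here is a rough sketch of what a load rule could look like. It assumes the standard node_exporter metrics `node_load1` and `node_cpu_seconds_total`; the group name, the alert name `HostHighLoad`, the threshold of 2 per core, and the 5m window are all placeholders I picked for illustration, not values taken from this project or tuned against our cluster.

```yaml
groups:
  - name: host-load            # hypothetical group name
    rules:
      - alert: HostHighLoad    # hypothetical alert name
        # 1-minute load average divided by the number of CPU cores.
        # Counting the mode="idle" series yields one sample per core.
        expr: node_load1 / count without (cpu, mode) (node_cpu_seconds_total{mode="idle"}) > 2
        for: 5m                # placeholder duration
        labels:
          severity: warning
        annotations:
          summary: "Host high load (instance {{ $labels.instance }})"
          description: "1-minute load per CPU core has stayed above 2 for 5 minutes. Current value: {{ $value }}"
```

Normalizing by core count keeps a single threshold usable across workers of different sizes. Since Linux load averages also count tasks in uninterruptible sleep, a rule along these lines should catch the D-state pile-up described above even when the waiting-time rules stay silent.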
