This is much like what's mentioned in #12, but I couldn't find a way to reopen that issue.
We are having an issue in one of our k8s clusters in which processes in the D state (uninterruptible sleep) begin to accumulate, rendering pods and other services running on the affected worker unresponsive. None of the HostHigh*WaitingTime alerts fired, because those rules simply don't trigger in this situation. However, the worker's load does spike abruptly. I'm attaching screenshots of the situation.
It seems that the HostHigh*WaitingTime alert rules alone are not enough to cover all scenarios. We might need to monitor load after all.
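To make the proposal concrete, here is a rough sketch of what load-based rules could look like. It assumes node_exporter metrics (`node_load5`, `node_cpu_seconds_total`, `node_procs_blocked`); the alert names, thresholds, and `for` durations are placeholders I made up for illustration and would need tuning per environment.

```yaml
groups:
  - name: host-load-sketch   # hypothetical group name
    rules:
      # 5-minute load average divided by the number of CPU cores on the host.
      # The threshold of 2 per core is an assumption, not a recommendation.
      - alert: HostHighLoad   # hypothetical alert name
        expr: node_load5 / on (instance) count by (instance) (node_cpu_seconds_total{mode="idle"}) > 2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High load on {{ $labels.instance }}"
      # Processes blocked waiting for I/O, which is closely related to the
      # D-state pile-up described above. The threshold of 10 is an assumption.
      - alert: HostManyBlockedProcesses   # hypothetical alert name
        expr: node_procs_blocked > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Many blocked processes on {{ $labels.instance }}"
```

Normalizing load by core count keeps one threshold usable across workers of different sizes; the blocked-process rule is a more direct signal for this particular failure mode.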