Sysdig Trigger #45
jlangy announced in Operations
Starting a thread for a curious Sysdig alert that came up. In Sysdig we have an alert on the number of ready Patroni pods, using the formula `sum(avg(kubernetes.pod.status.ready))`. We have 3 pods, and a Low severity alert fires if the value drops below 3. When one of the pods spiked in CPU, it set off the Sysdig trigger (the value dropped to 2.98 for a bit). The CPU spike is below:
The strange thing is that no events got logged in OpenShift, even though Sysdig showed the drop:
I would think that if one pod had lost its ready status, the Kubernetes API would log an event.
I'm wondering if I am misinterpreting something here, or whether a pod can lose its ready status temporarily without logging an event.
NB: As a side note, I lowered the alert threshold so it now triggers on avg < 2.5. It might make more sense to use `sum(min(kubernetes.pod.status.ready))` instead, though.
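To illustrate how a fractional value like 2.98 can come out of this formula, here is a rough Python sketch. It assumes the inner `avg` is a per-pod average over the samples in the evaluation window and the outer `sum` adds those per-pod results together; that reading of the formula is an assumption, and the window of 50 samples and the helper names are hypothetical, chosen only so the numbers line up with the alert.

```python
def sum_of_time_avg(samples_per_pod):
    """Average each pod's ready samples over the window, then sum across pods."""
    return sum(sum(samples) / len(samples) for samples in samples_per_pod)


def sum_of_time_min(samples_per_pod):
    """Take each pod's worst ready sample in the window, then sum across pods."""
    return sum(min(samples) for samples in samples_per_pod)


# Hypothetical window: 50 ready samples per pod (1 = ready, 0 = not ready).
WINDOW = 50
healthy = [1] * WINDOW
blipped = [1] * (WINDOW - 1) + [0]  # one pod reports not-ready for a single sample

pods = [healthy, healthy, blipped]

avg_value = sum_of_time_avg(pods)
min_value = sum_of_time_min(pods)

print(round(avg_value, 2))  # 2.98: below 3, so "< 3" fires; "< 2.5" does not
print(min_value)            # 2: the blip counts as a full missing pod
```

Under these assumptions, the avg-based formula with a threshold of 2.5 ignores a single short blip, while the min-based formula reports any not-ready sample in the window as a whole pod down. Which behaviour is preferable depends on whether short blips should alert.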