[stable/rabbitmq] failing probes on disk or memory alarms #8635
Comments
This is probably something that needs to be addressed upstream in RabbitMQ. We need a health check endpoint that does not take disk and memory alarms into account. Either a new endpoint could be added, or the existing endpoint could make the alarm check optional via an HTTP parameter. Switching back to |
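To illustrate the kind of check being asked for, here is a minimal sketch of a health evaluation that can optionally ignore resource alarms. It works on an already-parsed node record such as one entry of the management plugin's `/api/nodes` response. The field names `running`, `mem_alarm` and `disk_free_alarm` are what I recall from that API and should be verified against your RabbitMQ version; the function name `node_is_healthy` and the `ignore_alarms` parameter are hypothetical.

```python
def node_is_healthy(node, ignore_alarms=True):
    """Decide whether a node record should count as healthy.

    `node` is a dict parsed from JSON. The keys used here are assumptions
    about the management API's node record, not a confirmed contract.
    """
    # A node that is not running is unhealthy regardless of alarms.
    if not node.get("running", False):
        return False
    # With ignore_alarms=True, a running node is healthy even under alarm.
    if ignore_alarms:
        return True
    # Otherwise any active resource alarm makes the node unhealthy.
    return not (node.get("mem_alarm", False) or node.get("disk_free_alarm", False))


if __name__ == "__main__":
    # Hypothetical sample record: running node with an active disk alarm.
    sample = {"running": True, "mem_alarm": False, "disk_free_alarm": True}
    print(node_is_healthy(sample, ignore_alarms=True))
    print(node_is_healthy(sample, ignore_alarms=False))
```

A probe built on the `ignore_alarms=True` variant would keep the pod in service while RabbitMQ is blocking publishers, which is the behaviour requested in this issue.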
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions. |
This issue is being automatically closed due to inactivity. |
Has someone started a discussion with upstream rabbitmq about this issue? |
@thomas-riccardi, I don't think this is a RabbitMQ management plugin issue. The probes are just misconfigured. |
@f84anton how would you configure the probes then? The readiness probe should raise an error when cutting new incoming connections would help resolve the issue. As for the liveness probe: it should raise an error when killing the container would help resolve the issue. The memory alarm could indeed be resolved by killing the container, but not the disk alarm (for persistent messages). Conclusion
More generally, what are the RabbitMQ failure modes for which Kubernetes probes could help? |
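The reasoning above can be sketched as a small lookup: a probe should fail only if the action Kubernetes takes on failure would actually help. The table values are transcribed from this thread (restarting may clear a memory alarm but not a disk alarm; the reporter observed that cutting incoming connections fixes nothing), not from RabbitMQ documentation, and all names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Remedy:
    restart_helps: bool      # would killing the container clear the condition?
    cut_traffic_helps: bool  # would removing the pod from service endpoints help?


# Assumption: these values restate claims made in this discussion only.
REMEDIES = {
    "memory_alarm": Remedy(restart_helps=True, cut_traffic_helps=False),
    "disk_alarm": Remedy(restart_helps=False, cut_traffic_helps=False),
}


def probes_that_should_fail(condition):
    """Return the set of probe types whose failure would help for `condition`."""
    remedy = REMEDIES[condition]
    failing = set()
    if remedy.restart_helps:
        failing.add("liveness")   # liveness failure restarts the container
    if remedy.cut_traffic_helps:
        failing.add("readiness")  # readiness failure stops routing traffic
    return failing


if __name__ == "__main__":
    for name in REMEDIES:
        print(name, sorted(probes_that_should_fail(name)))
```

Under these assumptions neither alarm justifies a failing readiness probe, and only the memory alarm is a candidate for a failing liveness probe.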
@f84anton as for your second sub-issue:
It seems to be a different issue: your node seems to take a long time to start up, and the default liveness probe configuration is not calibrated for that, so it raises an error too early. See charts/stable/rabbitmq/values.yaml, line 194 in 6a4608e.
|
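To check whether a liveness probe leaves enough room for a slow start, a rough estimate of the earliest possible restart is useful. The sketch below assumes the usual kubelet behaviour: the first probe runs after `initialDelaySeconds`, later probes run every `periodSeconds`, and the container is restarted after `failureThreshold` consecutive failures. It ignores probe timeouts and scheduling jitter, and the numbers in the example are hypothetical, not the chart's defaults.

```python
def earliest_restart_seconds(initial_delay, period, failure_threshold):
    """Approximate time (s) after container start at which the kubelet
    could first restart a container whose liveness probe always fails."""
    # First failure at initial_delay, then one more failure per period.
    return initial_delay + (failure_threshold - 1) * period


def startup_fits(startup_seconds, initial_delay, period, failure_threshold):
    """True if the node finishes starting before the earliest possible restart."""
    return startup_seconds < earliest_restart_seconds(
        initial_delay, period, failure_threshold
    )


if __name__ == "__main__":
    # Hypothetical settings: 120 s delay, 30 s period, 6 allowed failures.
    print(earliest_restart_seconds(120, 30, 6))
    print(startup_fits(300, 120, 30, 6))
```

If the node needs longer than this budget, raising the initial delay or the failure threshold in the chart values is the usual adjustment.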
Is this a request for help?:
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT
Version of Helm and Kubernetes:
kubernetes v1.11, helm v2.11.0
Which chart:
stable/rabbitmq
What happened:
so the probe fails. But RabbitMQ is still working at that moment, and disabling incoming connections does not fix anything.
What you expected to happen:
RabbitMQ can still accept connections when a disk or memory alarm fires.
How to reproduce it (as minimally and precisely as possible):
values:
To trigger the disk alarm, fill the PV with durable queue data or just create files in the PV.
Anything else we need to know:
@tompizmor