Expose a health check endpoint #643
This exposes a basic health check server. Internally, the health check handler doesn't do anything complex. The idea is just to check whether the process is responding. Signed-off-by: Michal Wasilewski <[email protected]>
Hi @mwasilew2, we recently fixed an issue in #637 which might have been what you were seeing. Could you upgrade to version 0.63.0 and confirm whether the collector is still getting stuck?
In addition to that, unfortunately, given the current architecture of the collector, I'm not sure there are good places for a health check to hook into to make a meaningful assertion about the state of the process. As I commented on #644, I don't think a basic health check would have helped, and I don't see an easy path to a more advanced health check without significant re-architecting. Thoughts?
Thanks a lot, I'll bump our deployed version. That said, the current version had been running fine for weeks, so it's possible the problem won't recur immediately (if at all).
That makes sense. I left a comment in #644, and I'm OK with closing it too.
It would be great if the collector exposed a health check that could be used for determining if the process is healthy or not.
We run this collector in ECS. For some reason, it got stuck and wasn't sending any diagnostics upstream. We had to manually restart it to make it work again. If the collector exposed a health check endpoint, we could include it in the ECS task definition which would prevent this from happening.
I imagine that a health check endpoint would be useful for other deployment methods as well.