elastic-agent: HEALTHY status fluctuates for fleet-server #25341
Comments
Pinging @elastic/agent (Team:Agent)
@ph I think we should move this to 7.14, since it will take some investigating to find the best way to fix this issue and it's not critical for the 7.13 release.
Added to 7.14 and to the meta issue.
@ph @blakerouse Do we have an update for this issue?
Could this be similar to the issue described in #25940?
I don't have an update for this issue. We need to determine how we want to improve the status command. One approach to making this command stable would be to add a debounce to healthy reporting: require that healthy be reported continuously for some amount of time X before the Elastic Agent internally considers itself healthy.
If I remember correctly, one issue in the past was that the Elastic Agent was healthy but then went back to an unhealthy state when updating the policy. If that is the case, I'm not sure it's expected, as the Elastic Agent itself is still healthy. Something similar happened during the fleet-server self-enrollment. What is the expected behaviour for the health status of the Elastic Agent if a subprocess is not healthy?
If a subprocess is not healthy, then the Elastic Agent should not be healthy. When updating policy, the Elastic Agent does transition away from healthy to configuring, which is the correct state for it. The issue is that the Elastic Agent is reporting healthy when it should not be, not the opposite direction.
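The aggregation rule described above (the agent is only healthy when every subprocess is healthy) amounts to taking the worst status across subprocesses. A minimal sketch, assuming a worst-to-best ordering of states that is an illustration rather than the agent's actual internal representation:

```go
package main

import "fmt"

// Status values ordered from worst to best; this ordering is an
// assumption for the sketch, not the agent's real encoding.
type Status int

const (
	Failed Status = iota
	Degraded
	Configuring
	Healthy
)

func (s Status) String() string {
	return [...]string{"FAILED", "DEGRADED", "CONFIGURING", "HEALTHY"}[s]
}

// Overall returns the worst status among the subprocesses, so the
// result is HEALTHY only if every subprocess is HEALTHY.
func Overall(subprocesses ...Status) Status {
	overall := Healthy
	for _, s := range subprocesses {
		if s < overall {
			overall = s
		}
	}
	return overall
}

func main() {
	fmt.Println(Overall(Healthy, Healthy))     // HEALTHY
	fmt.Println(Overall(Healthy, Degraded))    // DEGRADED
	fmt.Println(Overall(Configuring, Healthy)) // CONFIGURING
}
```

Under this rule, a policy update that puts one subprocess into CONFIGURING correctly pulls the overall status down, while a single degraded subprocess prevents the agent from ever reporting HEALTHY.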
@mtojek do you have an easy way to reproduce this... maybe by re-doing the change as you originally tried? Or @blakerouse, do you need that at this point?
I think there is nothing special to reproduce. It's just an interpretation of the behavior that Blake described above.
I'm doing follow-ups on open issues for 7.14 - is this still under review / a possible merge for 7.14? If we aren't attempting it, then please update the label to 7.15 or beyond and move it from 'iteration' to the backlog as well. Thank you. @blakerouse
Hi! We're labeling this issue as |
Spotted in elastic/integrations#950

We tried to use `elastic-agent status` as a healthcheck for the fleet server, but apparently the stack initialization fails due to an unstable HEALTHY status (suspected by @blakerouse). We would like to use the `status` command instead of `/api/status`, but this issue seems to be a blocker.