elastic-agent: HEALTHY status fluctuates for fleet-server #25341
Comments
Pinging @elastic/agent (Team:Agent)
@ph I think we should move this to 7.14, since it will take some investigating to find the best way to fix this issue and it's not critical for the 7.13 release.
Added to 7.14 and to the meta issue.
@ph @blakerouse Do we have an update for this issue?
Could this be similar to the issue described in #25940?
I don't have an update for this issue. We need to determine how we want to improve the status command. One approach to making this command stable would be to add a debounce to healthy reporting: require that healthy be reported continuously for some amount of time X before the Elastic Agent internally considers itself healthy.
If I remember correctly, one issue in the past was that the Elastic Agent was healthy but then went back to an unhealthy state when updating the policy. If that is the case, I'm not sure it's expected, as the Elastic Agent itself is still healthy. Something similar happened during the fleet-server self-enrollment. What is the expected behaviour for the health status of the Elastic Agent if a subprocess is not healthy?
If a subprocess is not healthy, then the Elastic Agent should not be healthy. When updating policy, the Elastic Agent does transition away from healthy to configuring, which is the correct state for it. The issue is that the Elastic Agent is reporting healthy when it should not be, not the opposite direction.
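The aggregation rule described above (the agent is only healthy when every subprocess is healthy) amounts to taking the worst status across subprocesses. A minimal sketch, assuming a worst-to-best ordering of states that is an illustration rather than the agent's actual internal representation:

```go
package main

import "fmt"

// Status values ordered from worst to best; this ordering is an
// assumption for the sketch, not the agent's real encoding.
type Status int

const (
	Failed Status = iota
	Degraded
	Configuring
	Healthy
)

func (s Status) String() string {
	return [...]string{"FAILED", "DEGRADED", "CONFIGURING", "HEALTHY"}[s]
}

// Overall returns the worst status among the subprocesses, so the
// result is HEALTHY only if every subprocess is HEALTHY.
func Overall(subprocesses ...Status) Status {
	overall := Healthy
	for _, s := range subprocesses {
		if s < overall {
			overall = s
		}
	}
	return overall
}

func main() {
	fmt.Println(Overall(Healthy, Healthy))     // HEALTHY
	fmt.Println(Overall(Healthy, Degraded))    // DEGRADED
	fmt.Println(Overall(Configuring, Healthy)) // CONFIGURING
}
```

Under this rule, a policy update that puts one subprocess into CONFIGURING correctly pulls the overall status down, while a single degraded subprocess prevents the agent from ever reporting HEALTHY.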
@mtojek do you have an easy way to reproduce this... maybe by re-doing the change as you originally tried? Or @blakerouse, do you need that at this point?
I think there is nothing special to reproduce. It's just an interpretation of the behavior that Blake described above.
I'm doing follow-ups on open issues for 7.14 - is this still under review / a possible merge for 7.14? If we aren't attempting it, then please update the label to 7.15 or beyond and move it from 'iteration' to the backlog as well. Thank you. @blakerouse
Hi! We're labeling this issue as |
Spotted in elastic/integrations#950

We tried to use `elastic-agent status` as a healthcheck for the fleet server, but apparently the stack initialization fails due to an unstable HEALTHY status (suspected by @blakerouse). We would like to use the `status` command instead of `/api/status`, but this issue seems to be a blocker.