[Agent]: On restarting Agent host, Metricbeat is orphaned & stuck in crash loop (relates to policy changes + possibly w Fleet server usage) #25829
Comments
Pinging @elastic/fleet (Team:Fleet) |
Pinging @elastic/agent (Team:Agent) |
Thanks Amol. With the specifics provided, I am making assumptions as follows, regarding step 3: This isn't a crazy use case for self-managed, and if the assumptions are correct then it is a high-urgency issue, as the Agent/Fleet-Server were running fine until, say, a laptop reboot. Note: I doubt that 'Endpoint' has much to do with it; we can try to repro this without it to narrow things down if needed. I expect it is just the Fleet Server (FS) vs. policy switch and the reboot that are in play. @blakerouse do you have any thoughts? Want to look at it / check the logs? |
I'd also like to confirm that this doesn't happen when Fleet Server is out of the scenario, and which OS this was seen on (we can test others too for more data). |
Hi @EricDavisX
Yes
Yes
Yes, this issue is not reproducible without Endpoint Security. Thanks |
We should further investigate this. On the testing side we should switch over to use 7.13.1-SNAPSHOT instead of the BC as the release is out and some fixes already went into 7.13.1. @michalpristas I think we have seen this transpiler issue in the past? |
Michal offered to look at any burning issues and so I assigned this to him after chatting in slack. |
@EricDavisX wasn't this already fixed a long time ago? |
I can reproduce this: if I keep refreshing the agent, I eventually hit this issue. I don't know why this is happening so far. I suspect a race in shutdown, but I might be wrong. Another problem I see is that the orphaned beat cannot connect to the agent and keeps logging this
Together with the rapid logging, it also eats up memory - there is probably some leak in |
@michalpristas thanks for raising it - @amolnater-qasource can you confirm versions of Kibana and System Integration you are testing with please? @fearful-symmetry I know there was some Win7 specific case we couldn't fix, not sure if this is that use case or not. Anything you can check on your side? |
That should have been fixed a long time ago. If you're still seeing that error @michalpristas , can you tell me what version of the system integration is currently running? It should be in the
I don't recall that? The guard that prevents this from running on windows should be fairly blunt. |
On further thought, it may have been Win2019 where we couldn't figure out the cause. Anyhow, let us wait for the host and Integration version info. However, with routine testing, I don't know how it could be 'old enough' at this point for this to ever come up. |
Yah, I remember that. So, nothing in the config has changed, as far as I can tell. Kinda obvious @michalpristas , can you disable |
Hi @EricDavisX
We are using 7.13.0 [released] Kibana self managed environment. Versions of Integrations: Please let us know if anything else is required. |
@fearful-symmetry do you have bandwidth to try to reproduce since it is with recent code? I'm hoping it isn't 'special' to see it. |
Getting the agent set up on Windows always takes me a while, but I can take a crack at it tomorrow @EricDavisX |
So, I'm still trying to test this properly, but so far I'm not seeing anything in 7.13 of |
Alright @EricDavisX / @michalpristas I tested this out on a fresh install of 7.13.1 with Windows Server 2012. I can't seem to reproduce it. Either it's been fixed, or it's something more subtle. |
Thank you @fearful-symmetry. Let's sync with Michal offline via email, as he had a working reproduction; maybe he can share the environment / VM. |
@amolnater-qasource can you re-test this please? |
Hi @EricDavisX Observations on assigning to new policy:
Build details:
Thanks |
Steps followed:
Logs:
logs.zip