Improper shutdown on Windows can cause Elastic Defend to rescan all files on reboot #1525
Comments
Thanks @bjmcnic for seeing this through all the way to the root cause and not just putting a bandaid on Endpoint.
@intxgo highlighted another problem - uninstalling Endpoint also deletes its quarantine store. Users lose the ability to restore any files that were previously quarantined.
Will see if we can get this prioritized for 8.7. This seems like it is significantly annoying for Windows users running Elastic Defend.
Thanks @michalpristas @cmacknz
@gabriellandau if you see this happening again please reopen this issue
Going to reopen this because the fix in elastic/elastic-agent-libs#113 hasn't been ported to the agent itself yet, only to the elastic-agent-libs dependency. The agent itself is still using v0.3.6, which does not include this fix: Line 17 in ff0ee71
We can also update Beats, which can also run as Windows services, although the consequences there are less severe. Beats is using v0.3.3. We should also make sure we manually test that this is fixed. We can have our QA team do this if we can give them detailed steps to reproduce the problem here.
closing as 0.3.6 is a tag currently pointing to the top of main.
🤦♂️ yes I can't read apparently.
We collected the logs below, based on our understanding of this issue, on the latest 8.8.0 SNAPSHOT and made the following observations:
Build details: Agent Logs: Could you please share the detailed steps to reproduce this scenario so that we can share the exact observation details with you? cc: @cmacknz
The agent listens for a shutdown command from a few different places. One of them is `signal.Notify`, which on Windows will capture `CTRL_CLOSE_EVENT`, `CTRL_LOGOFF_EVENT` or `CTRL_SHUTDOWN_EVENT` and send it as a `syscall.SIGTERM`. Another is the Windows SCM (Service Control Manager), which is analogous to `systemd` on Linux. The SCM keeps the agent running, and if the agent isn't running, it restarts it.
The correct shutdown procedure is for the agent to update its status with the SCM, so the SCM won't restart the agent on a legitimate shutdown.
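To make the two shutdown sources concrete, here is a minimal sketch in Go. It is not the agent's actual code: `waitForShutdown` and the `scmStop` channel are hypothetical names, and the SCM side is only simulated with a timer so the program terminates on its own. The only real APIs used are `signal.Notify`, `signal.Stop` and `time.AfterFunc` from the standard library.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// waitForShutdown blocks until either an OS signal arrives or the
// (hypothetical) service-manager stop channel is closed, and reports
// which source asked for the shutdown.
func waitForShutdown(sigs <-chan os.Signal, scmStop <-chan struct{}) string {
	select {
	case s := <-sigs:
		return "signal: " + s.String()
	case <-scmStop:
		return "scm stop request"
	}
}

func main() {
	sigs := make(chan os.Signal, 1)
	// Per the description above, console close, logoff and shutdown
	// events on Windows are delivered on this channel as syscall.SIGTERM.
	signal.Notify(sigs, os.Interrupt, syscall.SIGTERM)
	defer signal.Stop(sigs)

	// Assumption: a real service handler would close this channel when
	// the SCM sends a stop request. Here a timer simulates that request
	// so the sketch exits by itself.
	scmStop := make(chan struct{})
	time.AfterFunc(100*time.Millisecond, func() { close(scmStop) })

	fmt.Println("waiting for a shutdown request")
	fmt.Println("shutting down, source:", waitForShutdown(sigs, scmStop))
}
```

Whichever source fires first wins the `select`. In a real Windows service the process would also have to report its stopped state to the SCM before exiting, which this sketch leaves out.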
Even though the agent uses `elastic-agent-libs.service#HandleSignals` to manage its status with the SCM, a race condition exists that may cause the agent not to update its status correctly. This leads the SCM to try to restart the agent during a system shutdown, which in turn might cause problems with some of the agent integrations. Also, when the agent exits it calls `os.Exit`, which does not run deferred functions. The agent and `elastic-agent-libs` use `defer` as part of their shutdown process, which includes managing their state with the SCM, so the agent fails to correctly notify the SCM of its status.
Last, but not least, the agent listens multiple times, in different parts of the code, to the same shutdown signals through `signal.Notify`. It does so on its own and through `elastic-agent-libs.service#HandleSignals`, which might cause the agent to never try to update its status with the SCM.
The agent needs to act only on the first shutdown signal it receives, regardless of where it comes from (`signal.Notify`, the SCM or the CLI), and ignore any shutdown signal after that. It also needs to ensure that it correctly updates its status with the SCM.
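The two requirements above (act only on the first shutdown request, and make sure deferred cleanup runs before the process exits) can be sketched roughly as follows. This is an illustration under stated assumptions, not the fix that was merged: `shutdownGate`, `run` and the `report` callback are hypothetical, and `report` merely stands in for updating the service status with the SCM.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// shutdownGate accepts shutdown requests from any number of sources
// but honours only the first one; later requests are ignored.
type shutdownGate struct {
	once   sync.Once
	done   chan struct{}
	source string
}

func newShutdownGate() *shutdownGate {
	return &shutdownGate{done: make(chan struct{})}
}

// Request records source if this is the first shutdown request and
// returns true in that case. Every later call returns false.
func (g *shutdownGate) Request(source string) bool {
	first := false
	g.once.Do(func() {
		g.source = source
		first = true
		close(g.done)
	})
	return first
}

// Done is closed once the first shutdown request has been accepted.
func (g *shutdownGate) Done() <-chan struct{} { return g.done }

// Source blocks until a shutdown was requested, then returns its origin.
func (g *shutdownGate) Source() string {
	<-g.done
	return g.source
}

// run holds the program logic and returns an exit code instead of
// calling os.Exit itself, so its deferred calls always execute.
func run(g *shutdownGate, report func(status string)) int {
	report("running")
	defer report("stopped") // stand-in for the final SCM status update
	<-g.Done()
	return 0
}

func main() {
	g := newShutdownGate()
	g.Request("signal") // first request wins
	g.Request("scm")    // ignored: shutdown is already in progress

	code := run(g, func(s string) { fmt.Println("status:", s) })
	fmt.Println("first shutdown source:", g.Source())

	// os.Exit skips deferred functions, so it is called only here,
	// after run has returned and its defers have already executed.
	os.Exit(code)
}
```

Keeping `os.Exit` in `main` with nothing deferred there is one common way to avoid losing cleanup work. The `sync.Once` gate is one possible way to collapse duplicate notifications from several listeners into a single shutdown.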