[Agent] Reboot of Agent (with Endpoint installed) shows a temporary failure in Agent Activity log regarding bind port 6788 #21663

Closed
EricDavisX opened this issue Oct 7, 2020 · 5 comments
@EricDavisX

Not sure if this is high priority, but it's scary and is certainly a big red warning that goes away, so maybe we could squash it (if it isn't otherwise important to fix).

It seems a reboot of a Windows host with Endpoint enabled will yield a red ‘error’ state in the Agent activity log:

Message: Application: endpoint-security--8.0.0-SNAPSHOT[3a872887-901b-4839-a139-6b15b549a0a2]: State changed to FAILED: failed to start connection credentials listener: listen tcp 127.0.0.1:6788: bind: Only one usage of each socket address (protocol/network address/port) is normally permitted.
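
For anyone who wants to see this class of error in isolation: the message is what the OS returns when a second listener tries to bind a TCP address that is still held by another listener. The following is a minimal Go sketch, not code from the Agent or Endpoint. `bindTwice` is a hypothetical helper name, and port 0 is used so the OS picks a free port instead of the 6788 seen in the log.

```go
package main

import (
	"fmt"
	"net"
)

// bindTwice is a hypothetical helper for illustration only.
// It listens on addr, then tries to listen on the same concrete address
// again while the first listener is still open.
// firstErr reports a failure of the first bind; secondErr is the error
// from the second bind (nil would mean the second bind succeeded).
func bindTwice(addr string) (firstErr, secondErr error) {
	first, err := net.Listen("tcp", addr)
	if err != nil {
		return err, nil
	}
	defer first.Close()

	// Reuse the address the OS actually assigned (matters when addr ends in :0).
	second, err := net.Listen("tcp", first.Addr().String())
	if err == nil {
		second.Close()
		return nil, nil
	}
	return nil, err
}

func main() {
	firstErr, secondErr := bindTwice("127.0.0.1:0")
	if firstErr != nil {
		fmt.Println("first bind failed:", firstErr)
		return
	}
	// The exact wording of the error depends on the operating system.
	fmt.Println("second bind error:", secondErr)
}
```

The second bind fails for as long as the first listener is open, which would be consistent with a new instance starting before the previous one has released the port.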

tested on
hash of agent is: af3cda3c from build on Oct 6
Agent is here: snapshots.elastic.co/8.0.0-af3cda3c/downloads/beats/elastic-agent/elastic-agent-8.0.0-SNAPSHOT-windows-x86_64.zip

endpoint:
snapshots.elastic.co/8.0.0-af3cda3c/downloads/endpoint-dev/endpoint-security-8.0.0-SNAPSHOT-windows-x86_64.zip

cloud Kibana info is:
$ git show -s 6f983728d7f8c2cf065a6d5099157a5cfdc3cd08
Date: Tue Oct 6 09:46:56 2020 +0300

see the vm was rebooted twice in this time frame:
Screen Shot 2020-10-07 at 4 34 13 PM

@elasticmachine

Pinging @elastic/ingest-management (Team:Ingest Management)

@ph

ph commented Oct 7, 2020

I think we discussed this issue yesterday in a Zoom call; @blakerouse's idea was that this would be fixed by the locking introduced in a PR from @michalpristas. IIRC this build from October 6 didn't include that fix.
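
To make the locking idea concrete, here is a rough Go sketch of one way start and stop of a listener can be serialized so that a restart never binds while the old listener still holds the port. This is an assumption about the general technique only. It is not taken from the PR and does not describe how the Agent is implemented. `listenerGuard`, `Start`, and `Stop` are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// listenerGuard is a hypothetical helper: it owns at most one listener and
// serializes Start/Stop with a mutex so restarts cannot overlap.
type listenerGuard struct {
	mu sync.Mutex
	ln net.Listener
}

// Start closes any listener still held by the guard, then binds addr.
func (g *listenerGuard) Start(addr string) (net.Addr, error) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.ln != nil {
		g.ln.Close() // release the port before binding again
		g.ln = nil
	}
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	g.ln = ln
	return ln.Addr(), nil
}

// Stop releases the port if a listener is currently held.
func (g *listenerGuard) Stop() {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.ln != nil {
		g.ln.Close()
		g.ln = nil
	}
}

func main() {
	var g listenerGuard
	addr, err := g.Start("127.0.0.1:0") // port 0: let the OS choose
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	// Restart on the same concrete address; Start closes the old listener first.
	_, err = g.Start(addr.String())
	fmt.Println("restart error:", err)
	g.Stop()
}
```

The point of the sketch is only the ordering guarantee: the old listener is closed under the same lock that guards the new bind.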

@ph ph added the bug label Oct 7, 2020
@EricDavisX

Thanks PH. I had thought that was a different symptom and fix (about .asc files only), but I see his PR also has to do with configs in general, so I can see that it relates to this now.

Further, I'm seeing another scenario that seems related: when changing policy (from the Endpoint Admin page), it started throwing an error repeatedly, as endpoint-security is stuck in a loop trying to restart.
Screen Shot 2020-10-07 at 5 16 14 PM

Application: endpoint-security--8.0.0-SNAPSHOT[3a872887-901b-4839-a139-6b15b549a0a2]: State changed to FAILED: failed to start connection credentials listener: listen tcp 127.0.0.1:6788: bind: Only one usage of each socket address (protocol/network address/port) is normally permitted.

 
I'm not gonna log this separately, but we can test that tomorrow or soon. If we know or think it is a separate bug, we could start following up on it now. @ph up to you if you want a separate ticket.

@EricDavisX

The policy update problem also manifested on a Linux host as the exact .asc file problem. lol, see this screenshot:
Screen Shot 2020-10-07 at 5 25 16 PM

@michalpristas

This should be fixed by #21573.
The artifacts tested above do not include this fix.
Today I tested the following flow multiple times:

  • install agent
  • update policy to include security
  • update security policy

without any issues
image

Therefore closing. If we run into this again, I will reopen.
