Elastic Agent does not open bootstrap port for Elastic Endpoint after reboot #21424
Comments
FYI @paul-tavares
Pinging @elastic/ingest-management (Team:Ingest Management)
@gogochan is Elastic Agent started? How was it installed? @EricDavisX I think we have covered that case in our "install/uninstall" test cases?
@ph yes, it did run and made a connection to Fleet. However, it didn't open port 6788 until we made a modification to the configuration.
OK, so it's installed and it came back up when the computer restarted, but it doesn't open the port. This is odd because I presume the local configuration "persisted" to disk by the Elastic Agent would tell it to have Endpoint running. @blakerouse can you take a look?
I think this could be because there was an issue with Dynamic Inputs that broke saving the inputs into the action_store.yml. With no inputs in the action_store.yml, the Elastic Agent would not think Endpoint should be running on restart, so port 6788 would not be open. This was fixed in #21298. Can you confirm that your build includes that PR?
@ph to your question: we have a restart of the Agent tested in e2e-testing, but it doesn't have Endpoint in the config at that time. Bummer. But we can do that. Should it be a separate test, though? Or should we update all relevant test cases to also have Endpoint enabled in the policy and test for the related expectations? If it's the latter, it will have an impact on the nomenclature and layout of the tests (I bet the Robots team will insist on keeping it in a top-notch logical layout). The one-off change is easily doable; it just needs someone's time to get done, too. Here is the line: https://github.com/elastic/e2e-testing/blob/master/e2e/_suites/ingest-manager/features/fleet_mode_agent.feature#L37 I've logged this ticket for us to improve this with priority:
@EricDavisX Yes, a separate test probably seems the simplest route?
Is this still an issue or was it fixed by #21298?
I am not able to validate this as I cannot spawn an instance of 8.0 at the moment.
Validated using the latest snapshot (https://snapshots.elastic.co/8.0.0-af3cda3c/downloads/beats/elastic-agent/elastic-agent-8.0.0-SNAPSHOT-windows-x86_64.zip). Apart from the Agent not restarting on system reboot, when it is started manually the Agent does serve gRPC bootstrap information on port 6788. Closing this.
I don't doubt that the manual start of Endpoint works as Chan notes, but I am seeing this with a Linux endpoint on the latest code. The host has Endpoint up and running, and after it is rebooted it throws errors in the Agent log and has 'Agent Connectivity' errors in the Endpoint-side log: I think we urgently need to pair program this on Agent + Endpoint. @ph and @ferullo, @gogochan and @blakerouse: are you free to find an environment and check it out? I have one now and can give logs if helpful...
Logs attached. Hash of the Agent is af3cda3c, from today's just-finished build; artifacts here. I installed it with the 'install' command, and while it took 5 minutes for it to come online (not reflected in the logs), it eventually did and then made the 1-minute check-in calls successfully. I wanted to see that before I rebooted it.
@EricDavisX Based on the error reported in the screenshot, it seems that you might actually have 2 Elastic Agents installed and running? Are you sure you don't have both the *.deb and the
It was online happily before I ran it, and I captured the ps ax output as such: Let's pair up, figure it out, and post back what we find. It's a long enough thread already, lol
We decided we think this is fixed, and we are going to re-test tomorrow with a full 'regular' snapshot build that has the needed .asc files.
@EricDavisX have you been able to test 7.10.0 BC1 to see if this is fixed? I have not been able to reproduce it in my Windows testing.
Have not tested the 7.10 BC yet. I saw a good report that the 7.10 Agent tests were all passing, though, so it's likely fixed. Let me review the specific VM / test case later on both 7.10 and 8.0, I guess, and we can close it out.
Reassigned to @EricDavisX; send it back to us if it's still an issue.
It is not reproducible as noted here with the 7.10 BC1 build of Agent. I have other issues, but this is fixed. Closing.
Elastic Agent does not serve gRPC bootstrap info over TCP port 6788 after a reboot or restart. This results in a policy failure for the Endpoint because it cannot establish a connection.
For confirmed bugs, please report:
NETSTAT.EXE -an | grep 6788
from a Command.exe
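For anyone re-testing on Linux or Windows without netstat/grep handy, the same check can be sketched in a few lines of Python. This is only an illustration, not part of Elastic Agent: the helper name `port_is_open` is made up here, and it assumes the Agent listens on the loopback address (127.0.0.1), which the report above does not state.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 6788, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; the default host is an assumption.
    """
    try:
        # create_connection performs the TCP handshake; success means something is listening.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "open" if port_is_open() else "closed"
    print(f"TCP 6788 on 127.0.0.1 is {state}")
```

Running it once before and once after the reboot should show whether the bootstrap port came back. It only proves that something accepts TCP connections on the port, not that the listener is the Agent's gRPC service.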