[Fleet]: Agents enrolled against the default Fleet Server agent go online and offline repeatedly after multiple other Fleet Servers are added #25940
Comments
Pinging @elastic/fleet (Team:Fleet)
@dikshachauhan-qasource Please review.
Reviewed and assigned to @EricDavisX |
Thanks for logging. I've been investigating the same area and am setting up the demo/test server to help us test this routinely. I see from your notes and screenshots that:
In step 2, you add the URL of the self-created Fleet Server agent ("10.0.x.x:8220") under Fleet Settings.
In step 4, after a few minutes the agent goes inactive and tries to connect to all the Fleet Servers available in Fleet Settings.
I will report back more findings myself; let us know, @dikshachauhan-qasource, thanks.
Hi @EricDavisX
Yes
This is a non-Fleet-Server agent: the elastic-agent that was installed with the first URL, i.e. the cloud-hosted Fleet Server agent. Further, per the logs we observed that this new secondary agent (i.e. the elastic-agent) tries to connect to all the existing Fleet Servers. When it fails to communicate with the other Fleet Server agents, the secondary agent goes "Offline". Thanks
Is my understanding correct that this scenario mixes an on-prem and a hosted Fleet Server? Even though it should work, I remember we mentioned that at the moment it is not something we support (@ph @mostlyjason please clarify if this is not correct). Having said that, we should still dig into this to understand what happens, since we should support this in the future.
I think we talked about not supporting this case, but I'm not sure we made a decision on it. The behavior we defined allows for multiple hosts (elastic/kibana#89442 (comment)): "The Elastic Agent will iterate through URLs until it connects to one successfully. This allows for automatic failover and subnets." It looks like the agent tried only a single host within 41 seconds and then switched to the degraded status. It's great that it eventually switches to a healthy host, but it seems unexpected that the status changes in between. It seems like the logic should be that it tries all the hosts and only switches to degraded if none succeed (see the sketch below). Could we just treat this as a bug?
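A minimal sketch in Go of the check-in behavior described above, assuming the status should only drop to degraded after every configured host has failed. This is not the actual Elastic Agent implementation; the `checkin` function and the `/api/status` endpoint are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Status mirrors the agent states discussed in this issue.
type Status string

const (
	StatusHealthy  Status = "HEALTHY"
	StatusDegraded Status = "DEGRADED"
)

// checkin tries each Fleet Server host in order and returns the first one
// that answers. Crucially, the status only becomes DEGRADED once the whole
// list is exhausted, not after the first unreachable host.
func checkin(hosts []string) (string, Status) {
	client := &http.Client{Timeout: 10 * time.Second}
	for _, host := range hosts {
		// /api/status is a hypothetical health endpoint for this sketch.
		resp, err := client.Get(host + "/api/status")
		if err != nil {
			fmt.Printf("host %s unreachable: %v\n", host, err)
			continue // keep iterating; do not degrade yet
		}
		resp.Body.Close()
		if resp.StatusCode == http.StatusOK {
			return host, StatusHealthy
		}
	}
	return "", StatusDegraded
}

func main() {
	hosts := []string{
		"https://cloud-fleet-server.example:443", // hosted Fleet Server
		"https://10.0.1.2:8220",                  // self-managed Fleet Server
	}
	host, status := checkin(hosts)
	fmt.Printf("connected=%q status=%s\n", host, status)
}
```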
I think it is important to support this for 2 reasons:
Another mention in Slack today, from Patrick Boulanger (@pboulanger74), of users who want to set up one Fleet Server on the local intranet and another separately in a DMZ (low side).
@blakerouse This seems to be a recurring issue; could you take a look? The report dates from a few weeks ago, but the last comment from Eric is from last week. I'm not sure what the issue is here; we need to take a serious look at it. Hi @nimarezainia @urso @ruflin
Hi @EricDavisX
Build details:
Hence, we are closing this out. However, if we observe this issue again in further testing, we will reopen it. Thanks
See above: I logged a proper placeholder issue for this work, since this one captured only one part of the puzzle and was closed when that one bug was fixed. ;)
Kibana version: 7.14.0 snapshot (Kibana Cloud environment)
Host OS and Browser version: All, All
Build Details:
Preconditions:
Steps to reproduce:
Expected Result:
Agents should remain healthy (active) throughout.
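As a hedged way to check this expectation, the sketch below polls Kibana's Fleet agents API and prints each agent's status over time; an agent flapping between "online" and "offline" would show up here. `GET /api/fleet/agents` is a real Kibana endpoint, but the response field names used below and the omission of authentication are assumptions that may vary by version.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// agentList models only the fields this sketch needs; real responses
// contain more, and the top-level key may be "items" in newer versions.
type agentList struct {
	List []struct {
		ID     string `json:"id"`
		Status string `json:"status"`
	} `json:"list"`
}

func main() {
	kibana := "http://localhost:5601" // assumed Kibana URL; auth omitted
	client := &http.Client{Timeout: 10 * time.Second}
	for i := 0; i < 10; i++ { // sample the fleet status ten times
		req, err := http.NewRequest("GET", kibana+"/api/fleet/agents", nil)
		if err != nil {
			fmt.Println("building request failed:", err)
			return
		}
		req.Header.Set("kbn-xsrf", "true") // required by Kibana HTTP APIs
		resp, err := client.Do(req)
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		var agents agentList
		if err := json.NewDecoder(resp.Body).Decode(&agents); err != nil {
			fmt.Println("decoding response failed:", err)
		}
		resp.Body.Close()
		for _, a := range agents.List {
			fmt.Printf("agent %s: %s\n", a.ID, a.Status) // expect "online" throughout
		}
		time.Sleep(30 * time.Second)
	}
}
```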
Logs:
Logs.zip
Note:
Screenshots: