fleet-server doesn't boot up in time #359
Comments
I'm trying to understand if this is in any way related to the change I made in elastic/integrations#988. It was mainly meant to ensure that the max connections setting is accepted correctly, but it turns out it still doesn't work as expected. As the limit is 50+, this should not have an effect here. I wonder if this flakiness comes from another bugfix that was merged recently; I could not spot anything obvious in the logs that would explain the above. @blakerouse Any ideas? It is also odd that it worked in "most" cases, so I suspect it is a timing issue somehow.
Did you introduce anything heavy recently? Something that could affect timing?
I managed to reproduce this locally. What I see happening is that fleet-server gets a new config from Fleet, so it restarts itself. The restart is not really heavy, but I think it is related to the timing: as it reloads, the container is at first healthy, then unhealthy again, and healthy again afterwards. I wonder if this reloading happened before. The config in Fleet seems to be fully aligned with the default config setup, so maybe the fix to the field name changes is causing this. Are you setting any custom fleet-policy settings, or just taking the defaults created by Fleet? Is there any option in Docker to say: healthy only if the health check has passed at least twice in a row?
The Docker engine considers a container healthy as soon as a single health check passes. Once it's marked "healthy", the engine proceeds to boot the next container. In this case the engine didn't proceed, which means that every check must have hit the unhealthy state.
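For reference, Docker has no built-in "healthy only after N consecutive passes" option: `retries` counts consecutive *failures* before marking a container unhealthy, and a single success flips it back to healthy. The closest built-in knob for tolerating the restart window described above is `start_period`, during which failing probes don't count. A sketch of such a compose healthcheck (the service name, port, and status endpoint here are assumptions, not this project's actual config):

```yaml
# Hypothetical compose snippet: give fleet-server a 60s grace window
# so probe failures during its config-reload restart are ignored.
services:
  fleet-server:
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:8220/api/status"]
      interval: 5s
      timeout: 5s
      retries: 12
      start_period: 60s
```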
This is basic stack boot up procedure with all default policy.
No, that's why I opened this issue: elastic/beats#25341
Let's properly fix the health check. Have you seen the flakiness again recently?
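If a "healthy only after two consecutive passes" behavior is really wanted, one way to fix the health check is a small wrapper script inside the container that tracks a success streak in a state file. This is a sketch under assumptions: `healthy_streak` and the state-file path are hypothetical names, and fleet-server's probe command would be substituted for the last arguments.

```shell
#!/bin/sh
# healthy_streak STATE_FILE PROBE [ARGS...]
# Runs PROBE; on success increments a streak counter stored in
# STATE_FILE, on failure resets it to 0. Exits 0 (healthy) only
# once the probe has succeeded at least twice in a row.
healthy_streak() {
  state="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    streak=$(( $(cat "$state" 2>/dev/null || echo 0) + 1 ))
  else
    streak=0
  fi
  echo "$streak" > "$state"
  [ "$streak" -ge 2 ]
}
```

In a Dockerfile this could then be wired up as something like `HEALTHCHECK CMD healthy_streak /tmp/streak curl -sf http://localhost:8220/api/status` (the endpoint is an assumption).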
It happens to me from time to time while starting the stack locally, but so far it has failed on CI only once.
Is this still happening? |
@mtojek still happening?
It's definitely stale, we can close it.
With the latest PR, elastic/integrations#988 by @ruflin, we noticed flakiness:
https://beats-ci.elastic.co/blue/organizations/jenkins/Ingest-manager%2Fintegrations/detail/master/315/
which means that the fleet-server didn't start in time. Is it too slow now?
More logs:
https://beats-ci.elastic.co/job/Ingest-manager/job/integrations/job/master/315/artifact/build/elastic-stack-dump/latest/infoblox/logs/kibana.log
https://beats-ci.elastic.co/job/Ingest-manager/job/integrations/job/master/315/artifact/build/elastic-stack-dump/latest/infoblox/logs/fleet-server.log
https://beats-ci.elastic.co/job/Ingest-manager/job/integrations/job/master/315/artifact/build/elastic-stack-dump/latest/infoblox/logs/elasticsearch.log