Allow HTTP metrics to run in bootstrap mode. Add ability to adjust timeouts for Fleet Server. #28260
This pull request does not have a backport label. Could you fix it @blakerouse? 🙏
NOTE: |
Pinging @elastic/agent (Team:Elastic-Agent)
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
@andresrc are you ok with backporting this as a fix to |
Generally looks good to me. I also want to test on ECE though, if you could hold back with merging until then.
/package |
@simitt @blakerouse did you have a chance to test it yet?
I tested and created elastic/fleet-server#763 as a follow-up, because the observed behavior was not quite what was expected and both the agent and fleet-server were very noisily logging the same errors.
I have the fix for elastic/fleet-server#763 in elastic/fleet-server#768. That will provide the behavior we need for this to work properly.
a29e234 to 68631ae
/package |
This pull request is now in conflicts. Could you fix it? 🙏
I retested with the fleet-server fix, and the agent and fleet-server now work as expected on cloud. The healthcheck endpoint is exposed immediately, and the container is considered healthy while fleet-server is still trying to start up. The agent returns fleet-server still logs every ~5 sec that it is waiting for the policy, but the agent logging is pretty silent.
…meouts for Fleet Server. (#28260) (#28445)
* Allow HTTP metrics to run in bootstrap mode. Add ability to adjust timeouts for Fleet Server.
* Add changelog.
* Add the persistent agent configuration to the fleet.yml in bootstrap mode.
* Fix format issues.
(cherry picked from commit 15366ff)
Co-authored-by: Blake Rouse <[email protected]>
…meouts for Fleet Server. (#28260) (#28444)
* Allow HTTP metrics to run in bootstrap mode. Add ability to adjust timeouts for Fleet Server.
* Add changelog.
* Add the persistent agent configuration to the fleet.yml in bootstrap mode.
* Fix format issues.
(cherry picked from commit 15366ff)
Co-authored-by: Blake Rouse <[email protected]>
…meouts for Fleet Server. (elastic#28260)
* Allow HTTP metrics to run in bootstrap mode. Add ability to adjust timeouts for Fleet Server.
* Add changelog.
* Add the persistent agent configuration to the fleet.yml in bootstrap mode.
* Fix format issues.
What does this PR do?
It allows the HTTP metrics endpoint to run while Fleet Server is in bootstrap mode. It also adds configurable timeouts (a negative value means wait indefinitely) for waiting on the Elastic Agent daemon and on the Fleet Server bootstrap process.
Why is it important?
This is needed by Cloud so that it can check the status of the Elastic Agent even when Fleet Server cannot complete the bootstrap process. Cloud will set the timeout to be indefinite, and once the exponential backoff has reached its limit the system will only check every 10 minutes to see whether it should continue.
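The timeout behavior described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the names `waitFor`, `nextBackoff`, `noSleep`, and `errTimeout`, the 1-second initial delay, and the treatment of a zero timeout as "check once" are all assumptions made for the example. Only the "negative means indefinite" rule and the 10-minute backoff ceiling come from the description.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	initialBackoff = 1 * time.Second  // assumed starting delay (hypothetical)
	maxBackoff     = 10 * time.Minute // ceiling mentioned in the PR description
)

// errTimeout is a hypothetical sentinel error for this sketch.
var errTimeout = errors.New("timed out waiting for bootstrap")

// nextBackoff doubles the delay and caps it at maxBackoff.
func nextBackoff(d time.Duration) time.Duration {
	d *= 2
	if d > maxBackoff {
		return maxBackoff
	}
	return d
}

// noSleep is a stand-in sleeper so the example runs instantly.
func noSleep(time.Duration) {}

// waitFor polls check until it reports true.
// A negative timeout means "wait indefinitely"; a non-negative timeout
// gives up once the accumulated sleep time reaches it.
// Elapsed time is approximated by summing the requested delays.
func waitFor(timeout time.Duration, check func() bool, sleep func(time.Duration)) error {
	var elapsed time.Duration
	delay := initialBackoff
	for {
		if check() {
			return nil
		}
		if timeout >= 0 && elapsed >= timeout {
			return errTimeout
		}
		sleep(delay)
		elapsed += delay
		delay = nextBackoff(delay)
	}
}

func main() {
	attempts := 0
	ready := func() bool {
		attempts++
		return attempts > 3 // pretend bootstrap succeeds on the 4th check
	}
	// -1: wait indefinitely, as Cloud is described as doing.
	err := waitFor(-1, ready, noSleep)
	fmt.Println("error:", err, "attempts:", attempts)
}
```

In a real caller, `time.Sleep` (or a context-aware wait) would replace `noSleep`; the sleeper is injected here only so the loop can be exercised without waiting.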
Checklist
[ ] I have made corresponding changes to the documentation
[ ] I have made corresponding change to the default configuration files
[ ] I have added tests that prove my fix is effective or that my feature works
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Related issues