[Elastic Agent] Retry daemon reload during enroll #25590

Merged on May 14, 2021 (8 commits)
20 changes: 19 additions & 1 deletion x-pack/elastic-agent/pkg/agent/cmd/enroll_cmd.go
@@ -253,7 +253,7 @@ func (c *enrollCmd) fleetServerBootstrap(ctx context.Context) (string, error) {
var agentSubproc <-chan *os.ProcessState
if agentRunning {
// reload the already running agent
- err = c.daemonReload(ctx)
+ err = c.daemonReloadWithBackoff(ctx)
if err != nil {
return "", errors.New(err, "failed to trigger elastic-agent daemon reload", errors.TypeApplication)
}
@@ -323,6 +323,24 @@ func (c *enrollCmd) prepareFleetTLS() error {
return nil
}

func (c *enrollCmd) daemonReloadWithBackoff(ctx context.Context) error {
err := c.daemonReload(ctx)

Review comment (Contributor): If daemonReload returns no error here, there is no reason to continue. Just return, since the daemon has already been reloaded.

signal := make(chan struct{})
backExp := backoff.NewExpBackoff(signal, 60*time.Second, 10*time.Minute)

Review comment (Contributor): Can we start with lower values? A 60-second initial wait seems rather large. Maybe 10 seconds initially, up to a maximum of 1 minute? This is a local daemon, so it should be quick to connect to and communicate with.

for i := 5; i >= 0; i-- {
backExp.Wait()
c.log.Info("Retrying to restart...")
err = c.daemonReload(ctx)
if err == nil {
break
}
}

close(signal)
return err
}

func (c *enrollCmd) daemonReload(ctx context.Context) error {
daemon := client.New()
err := daemon.Connect(ctx)