Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(daemon): add Err() to retrieve the tomb death reason #453

Merged
merged 1 commit into from
Jul 18, 2024

Conversation

thp-canonical
Copy link
Contributor

Daemon has a .Dying() method that mirrors the Tomb.Dying() API, but not .Err() for the Tomb.Err() API. Add a .Err() pass-through function and print out the server exit reason for easier debugging should one of the daemon tomb goroutines die.

Comment on lines +779 to +780
// Err returns the death reason, or ErrStillAlive
// if the tomb is not in a dying or dead state.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment taken from d.tomb.Err()'s documentation.

@benhoyt
Copy link
Contributor

benhoyt commented Jul 17, 2024

Looks reasonable to me. However, I'm wondering how this can happen. Currently it looks like the daemon mostly calls d.tomb.Kill(nil) with a nil death reason. Do you have a repro?

@thp-canonical
Copy link
Contributor Author

thp-canonical commented Jul 18, 2024

Looks reasonable to me. However, I'm wondering how this can happen. Currently it looks like the daemon mostly calls d.tomb.Kill(nil) with a nil death reason. Do you have a repro?

(Edit:) There's one kill occurrence here, but granted at that point we already decided to shut down:

d.tomb.Kill(d.serve.Shutdown(ctx))

I think it also happens when one of the tomb goroutines exits with an error, right? From the Tomb.Go() docs:

If f returns a non-nil error, t.Kill is called with that error as the death reason parameter.

So any errors returned from here would (I think) also be reported, not just .Kill() calls:

d.tomb.Go(func() error {
if err := d.serve.Serve(d.generalListener); err != http.ErrServerClosed && d.tomb.Err() == tomb.ErrStillAlive {
return err
}
return nil
})

Same for this related place:

d.tomb.Go(func() error {
err := d.serve.Serve(d.httpListener)
if err != http.ErrServerClosed && d.tomb.Err() == tomb.ErrStillAlive {
return err
}
return nil
})

With the documentation for Tomb.Kill() saying:

Althoguh (sic) Kill may be called multiple times, only the first non-nil error is recorded as the death reason.

So the && d.tomb.Err() == tomb.ErrStillAlive checks above are probably not needed (if their intention is only to avoid overwriting an existing error).

I haven't followed if there's a way to bubble up errors from some handlers all the way to the caller of the .Serve() function, but it seems like it at least any error return value from any of the .Serve() calls that's != http.ErrServerClosed would be good to have reported instead of silently ignored.

@benhoyt benhoyt merged commit a150d22 into canonical:master Jul 18, 2024
16 checks passed
@thp-canonical thp-canonical deleted the feat/get-dying-reason branch July 30, 2024 05:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants