-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lxc: cleanup partially configured containers after errors in Start #3773
lxc: cleanup partially configured containers after errors in Start #3773
Conversation
If there are any errors in container setup after c.Create() in Start(), the container will be left around, with no way to clean it up because the handle will not be created or returned from Start. Added a wrapper that checks for errors and performs appropriate cleanup. Returning a cleanup function from a wrapped function instead of just doing the cleanup before returning the error helps to ensure that future changes that might add or change error exits can't forget to consider a cleanup function. Adds a check to the invalid config test case to check that a container created with an invalid config doesn't get left behind. Signed-off-by: Michael McCracken <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this! Approach looks great, just some minor comments.
client/driver/lxc.go
Outdated
sresp, err, errCleanup := d.StartWithCleanup(ctx, task) | ||
if err != nil { | ||
if cleanupErr := errCleanup(); cleanupErr != nil { | ||
d.logger.Printf("[ERR] error occured:\n%v\nwhile cleaning up from error in Start: %v", cleanupErr, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the newline and err
from this message since err
will be logged later anyway.
|
||
stopAndDestroyCleanup := func() error { | ||
if err := c.Stop(); err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we attempt to Destroy the container even if Stop fails, or will Destroy never succeed if Stop errors? I'm unfamiliar with lxc's behavior in this circumstance and the Go library's docs aren't very useful.
If it's safe to call Destroy if Stop errors, I'd suggest just ignoring Destroy's return value and returning the original Stop effort (so Destroy is just a best-effort at cleaning up).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the only reasons that Stop might fail would also cause Destroy to fail. I think returning without trying Destroy is the right thing here.
client/driver/lxc.go
Outdated
if err := c.Destroy(); err != nil { | ||
return err | ||
} | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This last few lines can be simplified to return c.Destroy()
client/driver/lxc.go
Outdated
return sresp, err | ||
} | ||
|
||
func (d *LxcDriver) StartWithCleanup(ctx *ExecContext, task *structs.Task) (*StartResponse, error, func() error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nitpick, but let's lowercase Start
so this method isn't exported as it shouldn't be called directly.
don't export an internal function, and simplify some code Signed-off-by: Michael McCracken <[email protected]>
Signed-off-by: Michael McCracken <[email protected]>
Thanks @mikemccracken! |
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
Context:
A test added in #3687 does not clean up the container it creates.
This is actually a bug in the driver, not the test - there's no way for the test to clean up a partially configured container because Start() doesn't return a handle on error. This PR fixes this.
Here's the commit comment:
If there are any errors in container setup after c.Create() in
Start(), the container will be left around, with no way to clean it up
because the handle will not be created or returned from Start.
Added a wrapper that checks for errors and performs appropriate
cleanup. Returning a cleanup function from a wrapped function instead
of just doing the cleanup before returning the error helps to ensure
that future changes that might add or change error exits can't forget
to consider a cleanup function.
Adds a check to the invalid config test case to check that a container
created with an invalid config doesn't get left behind.