fusemanager: fix container fail after ttl timeout in detach mode #1905

Merged
merged 1 commit into from
Jan 7, 2025
6 changes: 1 addition & 5 deletions cmd/containerd-stargz-grpc/main.go
@@ -181,7 +181,7 @@ func main() {
 if err != nil {
     log.G(ctx).WithError(err).Fatalf("failed to configure fusemanager")
 }
-rs, err = snbase.NewSnapshotter(ctx, filepath.Join(*rootDir, "snapshotter"), fs, snbase.AsynchronousRemove)
+rs, err = snbase.NewSnapshotter(ctx, filepath.Join(*rootDir, "snapshotter"), fs, snbase.AsynchronousRemove, snbase.SetDetachFlag)
 if err != nil {
     log.G(ctx).WithError(err).Fatalf("failed to configure snapshotter")
 }
@@ -213,10 +213,6 @@ func main() {
     log.G(ctx).WithError(err).Fatalf("failed to serve snapshotter")
 }

-// TODO: In detach mode, rs is taken over by fusemanager,
-// but client will send unmount request to fusemanager,
-// and fusemanager need get mount info from local db to
-// determine its behavior
 if cleanup {
     log.G(ctx).Debug("Closing the snapshotter")
     rs.Close()
12 changes: 12 additions & 0 deletions docs/overview.md
@@ -118,6 +118,18 @@ When upgrading the fuse manager, it's recommended to follow these steps:

 This ensures a clean upgrade without impacting running containers.

+### Important Considerations
+
+Before restarting the `containerd-stargz-grpc` process, check whether any containers are running.
+
+1. **When to Use SIGKILL**:
+
+If containers are running, terminate the `containerd-stargz-grpc` process with `SIGKILL`. The normal shutdown sequence would otherwise try to clean up the mount points of the running containers, which could disrupt them. `SIGKILL` terminates the process immediately, so that cleanup is skipped and the containers keep running.
+
+2. **When to Use SIGTERM**:
+
+If no containers are running, terminate the `containerd-stargz-grpc` process with `SIGTERM`. The process then follows its normal shutdown sequence and cleans up its resources and mount points.
+
 ## Registry-related configuration

 You can configure stargz snapshotter for accessing registries with custom configurations.
12 changes: 12 additions & 0 deletions snapshot/snapshot.go
@@ -73,6 +73,7 @@ type SnapshotterConfig struct {
     asyncRemove bool
     noRestore bool
     allowInvalidMountsOnRestart bool
+    detach bool
 }

 // Opt is an option to configure the remote snapshotter
@@ -97,6 +98,11 @@ func AllowInvalidMountsOnRestart(config *SnapshotterConfig) error {
     return nil
 }

+func SetDetachFlag(config *SnapshotterConfig) error {
+    config.detach = true
+    return nil
+}
+
 type snapshotter struct {
     root string
     ms *storage.MetaStore
@@ -107,6 +113,7 @@ type snapshotter struct {
     userxattr bool // whether to enable "userxattr" mount option
     noRestore bool
     allowInvalidMountsOnRestart bool
+    detach bool
 }

 // NewSnapshotter returns a Snapshotter which can use unpacked remote layers
@@ -157,6 +164,7 @@ func NewSnapshotter(ctx context.Context, root string, targetFs FileSystem, opts
         userxattr: userxattr,
         noRestore: config.noRestore,
         allowInvalidMountsOnRestart: config.allowInvalidMountsOnRestart,
+        detach: config.detach,
     }

     if err := o.restoreRemoteSnapshot(ctx); err != nil {
@@ -736,6 +744,10 @@ func (o *snapshotter) checkAvailability(ctx context.Context, key string) bool {
 }

 func (o *snapshotter) restoreRemoteSnapshot(ctx context.Context) error {
+    if o.detach {
+        return nil
+    }
+
     mounts, err := mountinfo.GetMounts(nil)
     if err != nil {
         return err
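
To show how the new option travels from the caller into the restore step, here is a self-contained sketch of the same functional-option pattern. It is a simplified stand-in rather than the project's code: the lowercase names (`config`, `opt`, `setDetachFlag`, `newSnapshotter`, `restore`) and the `restored` field are hypothetical, and the real restore logic (scanning mounts) is replaced by setting a flag.

```go
package main

import "fmt"

// config plays the role of SnapshotterConfig; only the field relevant
// to this PR is modeled.
type config struct {
	detach bool
}

// opt is assumed to have the same shape as the option functions in the diff.
type opt func(*config) error

// setDetachFlag mirrors SetDetachFlag from the diff.
func setDetachFlag(c *config) error {
	c.detach = true
	return nil
}

type snapshotter struct {
	detach   bool
	restored bool // hypothetical: records whether the restore step ran
}

// newSnapshotter applies the options, copies the flag onto the
// snapshotter, and then calls restore, as the diff does in NewSnapshotter.
func newSnapshotter(opts ...opt) (*snapshotter, error) {
	var c config
	for _, o := range opts {
		if err := o(&c); err != nil {
			return nil, err
		}
	}
	s := &snapshotter{detach: c.detach}
	if err := s.restore(); err != nil {
		return nil, err
	}
	return s, nil
}

// restore returns early in detach mode, matching the guard the PR adds
// at the top of restoreRemoteSnapshot.
func (s *snapshotter) restore() error {
	if s.detach {
		return nil
	}
	s.restored = true // stand-in for the real mount restoration
	return nil
}

func main() {
	a, _ := newSnapshotter()
	b, _ := newSnapshotter(setDetachFlag)
	fmt.Println(a.restored, b.restored)
}
```

Passing the option at construction time keeps the existing `NewSnapshotter` signature unchanged, so callers that do not run in detach mode are unaffected.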