Skip to content

Commit

Permalink
Implement --sdnotify cmdline option to control sd-notify behavior
Browse files Browse the repository at this point in the history
--sdnotify container|conmon|ignore
With "conmon", we send the MAINPID, and clear the NOTIFY_SOCKET so the OCI
runtime doesn't pass it into the container. We also advertise "ready" when the
OCI runtime finishes to advertise the service as ready.

With "container", we send the MAINPID, and leave the NOTIFY_SOCKET so the OCI
runtime passes it into the container for initialization, and let the container advertise further metadata.
This is the default, which is closest to the behavior podman has done in the past.

The "ignore" option removes NOTIFY_SOCKET from the environment, so neither podman nor
any child processes will talk to systemd.

This removes the need for hardcoded CID and PID files in the command line, and
the PIDFile directive, as the pid is advertised directly through sd-notify.

Signed-off-by: Joseph Gooch <[email protected]>
  • Loading branch information
goochjj committed Jul 6, 2020
1 parent 9532509 commit 0b1c1ef
Show file tree
Hide file tree
Showing 17 changed files with 274 additions and 7 deletions.
6 changes: 6 additions & 0 deletions cmd/podman/common/create.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ import (

"github.com/containers/common/pkg/auth"
"github.com/containers/libpod/v2/cmd/podman/registry"
"github.com/containers/libpod/v2/libpod/define"
"github.com/spf13/pflag"
)

Expand Down Expand Up @@ -394,6 +395,11 @@ func GetCreateFlags(cf *ContainerCLIOpts) *pflag.FlagSet {
"rootfs", false,
"The first argument is not an image but the rootfs to the exploded container",
)
createFlags.StringVar(
&cf.SdNotifyMode,
"sdnotify", define.SdNotifyModeContainer,
`control sd-notify behavior ("container"|"conmon"|"ignore")`,
)
createFlags.StringArrayVar(
&cf.SecurityOpt,
"security-opt", containerConfig.SecurityOptions(),
Expand Down
1 change: 1 addition & 0 deletions cmd/podman/common/create_opts.go
Original file line number Diff line number Diff line change
Expand Up @@ -81,6 +81,7 @@ type ContainerCLIOpts struct {
Rm bool
RootFS bool
SecurityOpt []string
SdNotifyMode string
ShmSize string
StopSignal string
StopTimeout uint
Expand Down
1 change: 1 addition & 0 deletions cmd/podman/common/specgen.go
Original file line number Diff line number Diff line change
Expand Up @@ -443,6 +443,7 @@ func FillOutSpecGen(s *specgen.SpecGenerator, c *ContainerCLIOpts, args []string
}

s.Systemd = c.Systemd
s.SdNotifyMode = c.SdNotifyMode
if s.ResourceLimits == nil {
s.ResourceLimits = &specs.LinuxResources{}
}
Expand Down
11 changes: 11 additions & 0 deletions docs/source/markdown/podman-create.1.md
Original file line number Diff line number Diff line change
Expand Up @@ -702,6 +702,17 @@ If specified, the first argument refers to an exploded container on the file sys
This is useful to run a container without requiring any image management, the rootfs
of the container is assumed to be managed externally.

**--sdnotify**=**container**|**conmon**|**ignore**

Determines how to use the NOTIFY_SOCKET, as passed with systemd and Type=notify.

Default is **container**, which means allow the OCI runtime to proxy the socket into the
container to receive ready notification. Podman will set the MAINPID to conmon's pid.
The **conmon** option sets MAINPID to conmon's pid, and sends READY when the container
has started. The socket is never passed to the runtime or the container.
The **ignore** option removes NOTIFY_SOCKET from the environment for itself and child processes,
for the case where some other process above Podman uses NOTIFY_SOCKET and Podman should not use it.

**--seccomp-policy**=*policy*

Specify the policy to select the seccomp profile. If set to *image*, Podman will look for a "io.podman.seccomp.profile" label in the container-image config and use its value as a seccomp profile. Otherwise, Podman will follow the *default* policy by applying the default profile unless specified otherwise via *--security-opt seccomp* as described below.
Expand Down
11 changes: 11 additions & 0 deletions docs/source/markdown/podman-run.1.md
Original file line number Diff line number Diff line change
Expand Up @@ -723,6 +723,17 @@ of the container is assumed to be managed externally.
Note: On **SELinux** systems, the rootfs needs the correct label, which is by default
**unconfined_u:object_r:container_file_t**.

**--sdnotify**=**container**|**conmon**|**ignore**

Determines how to use the NOTIFY_SOCKET, as passed with systemd and Type=notify.

Default is **container**, which means allow the OCI runtime to proxy the socket into the
container to receive ready notification. Podman will set the MAINPID to conmon's pid.
The **conmon** option sets MAINPID to conmon's pid, and sends READY when the container
has started. The socket is never passed to the runtime or the container.
The **ignore** option removes NOTIFY_SOCKET from the environment for itself and child processes,
for the case where some other process above Podman uses NOTIFY_SOCKET and Podman should not use it.

**--seccomp-policy**=*policy*

Specify the policy to select the seccomp profile. If set to *image*, Podman will look for a "io.podman.seccomp.profile" label in the container-image config and use its value as a seccomp profile. Otherwise, Podman will follow the *default* policy by applying the default profile unless specified otherwise via *--security-opt seccomp* as described below.
Expand Down
2 changes: 2 additions & 0 deletions libpod/container.go
Original file line number Diff line number Diff line change
Expand Up @@ -414,6 +414,8 @@ type ContainerConfig struct {
// sharing kernel namespaces in a pod
IsInfra bool `json:"pause"`

// SdNotifyMode tells libpod what to do with a NOTIFY_SOCKET if passed
SdNotifyMode string `json:"sdnotifyMode,omitempty"`
// Systemd tells libpod to setup the container in systemd mode
Systemd bool `json:"systemd"`

Expand Down
14 changes: 14 additions & 0 deletions libpod/container_internal.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ import (
"github.com/containers/storage/pkg/archive"
"github.com/containers/storage/pkg/idtools"
"github.com/containers/storage/pkg/mount"
"github.com/coreos/go-systemd/v22/daemon"
securejoin "github.com/cyphar/filepath-securejoin"
spec "github.com/opencontainers/runtime-spec/specs-go"
"github.com/opencontainers/runtime-tools/generate"
Expand Down Expand Up @@ -1192,6 +1193,19 @@ func (c *Container) start() error {

c.state.State = define.ContainerStateRunning

if c.config.SdNotifyMode != define.SdNotifyModeIgnore {
payload := fmt.Sprintf("MAINPID=%d", c.state.ConmonPID)
if c.config.SdNotifyMode == define.SdNotifyModeConmon {
payload += "\n"
payload += daemon.SdNotifyReady
}
if sent, err := daemon.SdNotify(false, payload); err != nil {
logrus.Errorf("Error notifying systemd of Conmon PID: %s", err.Error())
} else if sent {
logrus.Debugf("Notify sent successfully")
}
}

if c.config.HealthCheckConfig != nil {
if err := c.updateHealthStatus(define.HealthCheckStarting); err != nil {
logrus.Error(err)
Expand Down
7 changes: 7 additions & 0 deletions libpod/define/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -75,3 +75,10 @@ const JSONLogging = "json-file"

// NoLogging is the string conmon expects when specifying to use no log driver whatsoever
const NoLogging = "none"

// Strings used for --sdnotify option to podman
const (
SdNotifyModeContainer = "container"
SdNotifyModeConmon = "conmon"
SdNotifyModeIgnore = "ignore"
)
2 changes: 1 addition & 1 deletion libpod/oci_conmon_exec_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -444,7 +444,7 @@ func (r *ConmonOCIRuntime) startExec(c *Container, sessionID string, options *Ex
// }
// }

conmonEnv, extraFiles, err := r.configureConmonEnv(runtimeDir)
conmonEnv, extraFiles, err := r.configureConmonEnv(c, runtimeDir)
if err != nil {
return nil, nil, err
}
Expand Down
30 changes: 24 additions & 6 deletions libpod/oci_conmon_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ import (
"github.com/containers/libpod/v2/utils"
pmount "github.com/containers/storage/pkg/mount"
"github.com/coreos/go-systemd/v22/activation"
"github.com/coreos/go-systemd/v22/daemon"
spec "github.com/opencontainers/runtime-spec/specs-go"
"github.com/opencontainers/selinux/go-selinux"
"github.com/opencontainers/selinux/go-selinux/label"
Expand Down Expand Up @@ -365,8 +366,10 @@ func (r *ConmonOCIRuntime) StartContainer(ctr *Container) error {
return err
}
env := []string{fmt.Sprintf("XDG_RUNTIME_DIR=%s", runtimeDir)}
if notify, ok := os.LookupEnv("NOTIFY_SOCKET"); ok {
env = append(env, fmt.Sprintf("NOTIFY_SOCKET=%s", notify))
if ctr.config.SdNotifyMode == define.SdNotifyModeContainer {
if notify, ok := os.LookupEnv("NOTIFY_SOCKET"); ok {
env = append(env, fmt.Sprintf("NOTIFY_SOCKET=%s", notify))
}
}
if path, ok := os.LookupEnv("PATH"); ok {
env = append(env, fmt.Sprintf("PATH=%s", path))
Expand Down Expand Up @@ -887,6 +890,12 @@ func (r *ConmonOCIRuntime) createOCIContainer(ctr *Container, restoreOptions *Co
}
}

if ctr.config.SdNotifyMode == define.SdNotifyModeIgnore {
if err := os.Unsetenv("NOTIFY_SOCKET"); err != nil {
logrus.Warnf("Error unsetting NOTIFY_SOCKET %s", err.Error())
}
}

args := r.sharedConmonArgs(ctr, ctr.ID(), ctr.bundlePath(), filepath.Join(ctr.state.RunDir, "pidfile"), ctr.LogPath(), r.exitsDir, ociLog, ctr.LogDriver(), logTag)

if ctr.config.Spec.Process.Terminal {
Expand Down Expand Up @@ -940,7 +949,7 @@ func (r *ConmonOCIRuntime) createOCIContainer(ctr *Container, restoreOptions *Co
}

// 0, 1 and 2 are stdin, stdout and stderr
conmonEnv, envFiles, err := r.configureConmonEnv(runtimeDir)
conmonEnv, envFiles, err := r.configureConmonEnv(ctr, runtimeDir)
if err != nil {
return err
}
Expand Down Expand Up @@ -1034,6 +1043,13 @@ func (r *ConmonOCIRuntime) createOCIContainer(ctr *Container, restoreOptions *Co
// conmon not having a pid file is a valid state, so don't set it if we don't have it
logrus.Infof("Got Conmon PID as %d", conmonPID)
ctr.state.ConmonPID = conmonPID
if ctr.config.SdNotifyMode != define.SdNotifyModeIgnore {
if sent, err := daemon.SdNotify(false, fmt.Sprintf("MAINPID=%d", conmonPID)); err != nil {
logrus.Errorf("Error notifying systemd of Conmon PID: %s", err.Error())
} else if sent {
logrus.Debugf("Notify MAINPID sent successfully")
}
}
}

if ctr.config.PreserveFDs > 0 {
Expand Down Expand Up @@ -1137,7 +1153,7 @@ func prepareProcessExec(c *Container, cmd, env []string, tty bool, cwd, user, se

// configureConmonEnv gets the environment values to add to conmon's exec struct
// TODO this may want to be less hardcoded/more configurable in the future
func (r *ConmonOCIRuntime) configureConmonEnv(runtimeDir string) ([]string, []*os.File, error) {
func (r *ConmonOCIRuntime) configureConmonEnv(ctr *Container, runtimeDir string) ([]string, []*os.File, error) {
env := make([]string, 0, 6)
env = append(env, fmt.Sprintf("XDG_RUNTIME_DIR=%s", runtimeDir))
env = append(env, fmt.Sprintf("_CONTAINERS_USERNS_CONFIGURED=%s", os.Getenv("_CONTAINERS_USERNS_CONFIGURED")))
Expand All @@ -1149,8 +1165,10 @@ func (r *ConmonOCIRuntime) configureConmonEnv(runtimeDir string) ([]string, []*o
env = append(env, fmt.Sprintf("HOME=%s", home))

extraFiles := make([]*os.File, 0)
if notify, ok := os.LookupEnv("NOTIFY_SOCKET"); ok {
env = append(env, fmt.Sprintf("NOTIFY_SOCKET=%s", notify))
if ctr.config.SdNotifyMode == define.SdNotifyModeContainer {
if notify, ok := os.LookupEnv("NOTIFY_SOCKET"); ok {
env = append(env, fmt.Sprintf("NOTIFY_SOCKET=%s", notify))
}
}
if !r.sdNotify {
if listenfds, ok := os.LookupEnv("LISTEN_FDS"); ok {
Expand Down
22 changes: 22 additions & 0 deletions libpod/options.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ import (
"net"
"os"
"path/filepath"
"strings"
"syscall"

"github.com/containers/common/pkg/config"
Expand All @@ -22,6 +23,10 @@ import (
)

// Runtime Creation Options
var (
// SdNotifyModeValues describes the only values that SdNotifyMode can be
SdNotifyModeValues = []string{define.SdNotifyModeContainer, define.SdNotifyModeConmon, define.SdNotifyModeIgnore}
)

// WithStorageConfig uses the given configuration to set up container storage.
// If this is not specified, the system default configuration will be used
Expand Down Expand Up @@ -550,6 +555,23 @@ func WithSystemd() CtrCreateOption {
}
}

// WithSdNotifyMode sets the sd-notify method
func WithSdNotifyMode(mode string) CtrCreateOption {
return func(ctr *Container) error {
if ctr.valid {
return define.ErrCtrFinalized
}

// verify values
if len(mode) > 0 && !util.StringInSlice(strings.ToLower(mode), SdNotifyModeValues) {
return errors.Wrapf(define.ErrInvalidArg, "--sdnotify values must be one of %q", strings.Join(SdNotifyModeValues, ", "))
}

ctr.config.SdNotifyMode = mode
return nil
}
}

// WithShmSize sets the size of /dev/shm tmpfs mount.
func WithShmSize(size int64) CtrCreateOption {
return func(ctr *Container) error {
Expand Down
7 changes: 7 additions & 0 deletions pkg/specgen/container_validate.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ package specgen
import (
"strings"

"github.com/containers/libpod/v2/libpod/define"
"github.com/containers/libpod/v2/pkg/rootless"
"github.com/containers/libpod/v2/pkg/util"
"github.com/pkg/errors"
Expand All @@ -13,6 +14,8 @@ var (
ErrInvalidSpecConfig = errors.New("invalid configuration")
// SystemDValues describes the only values that SystemD can be
SystemDValues = []string{"true", "false", "always"}
// SdNotifyModeValues describes the only values that SdNotifyMode can be
SdNotifyModeValues = []string{define.SdNotifyModeContainer, define.SdNotifyModeConmon, define.SdNotifyModeIgnore}
// ImageVolumeModeValues describes the only values that ImageVolumeMode can be
ImageVolumeModeValues = []string{"ignore", "tmpfs", "anonymous"}
)
Expand Down Expand Up @@ -40,6 +43,10 @@ func (s *SpecGenerator) Validate() error {
if len(s.ContainerBasicConfig.Systemd) > 0 && !util.StringInSlice(strings.ToLower(s.ContainerBasicConfig.Systemd), SystemDValues) {
return errors.Wrapf(ErrInvalidSpecConfig, "--systemd values must be one of %q", strings.Join(SystemDValues, ", "))
}
// sdnotify values must be container, conmon, or ignore
if len(s.ContainerBasicConfig.SdNotifyMode) > 0 && !util.StringInSlice(strings.ToLower(s.ContainerBasicConfig.SdNotifyMode), SdNotifyModeValues) {
return errors.Wrapf(ErrInvalidSpecConfig, "--sdnotify values must be one of %q", strings.Join(SdNotifyModeValues, ", "))
}

//
// ContainerStorageConfig
Expand Down
4 changes: 4 additions & 0 deletions pkg/specgen/generate/container_create.go
Original file line number Diff line number Diff line change
Expand Up @@ -175,6 +175,10 @@ func createContainerOptions(ctx context.Context, rt *libpod.Runtime, s *specgen.

options = append(options, libpod.WithSystemd())
}
if len(s.SdNotifyMode) > 0 {
options = append(options, libpod.WithSdNotifyMode(s.SdNotifyMode))
}

if len(s.Name) > 0 {
logrus.Debugf("setting container name %s", s.Name)
options = append(options, libpod.WithName(s.Name))
Expand Down
5 changes: 5 additions & 0 deletions pkg/specgen/specgen.go
Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,11 @@ type ContainerBasicConfig struct {
// If not specified, "false" will be assumed.
// Optional.
Systemd string `json:"systemd,omitempty"`
// Determine how to handle the NOTIFY_SOCKET - do we participate or pass it through
// "container" - let the OCI runtime deal with it, advertise conmon's MAINPID
// "conmon-only" - advertise conmon's MAINPID, send READY when started, don't pass to OCI
// "ignore" - unset NOTIFY_SOCKET
SdNotifyMode string `json:"sdnotifyMode,omitempty"`
// Namespace is the libpod namespace the container will be placed in.
// Optional.
Namespace string `json:"namespace,omitempty"`
Expand Down
84 changes: 84 additions & 0 deletions vendor/github.com/coreos/go-systemd/v22/daemon/sdnotify.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit 0b1c1ef

Please sign in to comment.