Rootless podman in rootless podman reports outer container's PIDs #19440

Closed
PangHLIT opened this issue Jul 31, 2023 · 9 comments
Labels
kind/bug, locked - please file new issue/PR

Comments

@PangHLIT

Issue Description

The inner container sees the outer container's process tree. The command below is a modified version of the "Rootless Podman running rootless Podman" example from https://www.redhat.com/sysadmin/podman-inside-container.

Steps to reproduce the issue

  1. podman run --security-opt label=disable --user podman --device /dev/fuse quay.io/podman/stable podman run alpine ps

Describe the results you received

$ podman run --security-opt label=disable --user podman --device /dev/fuse quay.io/podman/stable podman run alpine ps
Resolved "alpine" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob sha256:31e352740f534f9ad170f75378a84fe453d6156e40700b882d737a8f4a6988a3
Copying config sha256:c1aabb73d2339c5ebaa3681de2e9d9c18d57485045a4e311d9f8004bec208d67
Writing manifest to image destination
Storing signatures
PID USER TIME COMMAND
1 root 0:00 podman run alpine ps
17 root 0:00 podman run alpine ps
21 root 0:00 catatonit -P
47 root 0:00 /usr/bin/conmon --api-version 1 -c e2f30c87112ac0f65a67c5d263df90d38bbe186b776f73fc259da8e601a774c8 -u e2f30c87112ac0f65a67c5d263df90d38bbe186b776f73fc259da8e601a774c8 -r /usr/bin/crun -b /home/podman/.local/share/containers/storage/overlay-containers/e2f30c87112ac0f65a67c5d263df90d38bbe186b776f73fc259da8e601a774c8/userdata -p /tmp/containers-user-1000/containers/overlay-containers/e2f30c87112ac0f65a67c5d263df90d38bbe186b776f73fc259da8e601a774c8/userdata/pidfile -n gallant_agnesi --exit-dir /tmp/podman-run-1000/libpod/tmp/exits --full-attach -l k8s-file:/home/podman/.local/share/containers/storage/overlay-containers/e2f30c87112ac0f65a67c5d263df90d38bbe186b776f73fc259da8e601a774c8/userdata/ctr.log --log-level warning --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/tmp/containers-user-1000/containers/overlay-containers/e2f30c87112ac0f65a67c5d263df90d38bbe186b776f73fc259da8e601a774c8/userdata/oci-log --runtime-arg --cgroup-manager --runtime-arg disabled --conmon-pidfile /tmp/containers-user-1000/containers/overlay-containers/e2f30c87112ac0f65a67c5d263df90d38bbe186b776f73fc259da8e601a774c8/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /home/podman/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /tmp/containers-user-1000/containers --exit-command-arg --log-level --exit-command-arg warning --exit-command-arg --cgroup-manager --exit-command-arg cgroupfs --exit-command-arg --tmpdir --exit-command-arg /tmp/podman-run-1000/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /home/podman/.local/share/containers/storage/volumes --exit-command-arg --db-backend --exit-command-arg boltdb --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg 
overlay --exit-command-arg --events-back
49 root 0:00 ps

Describe the results you expected

As with rootful Podman, the inner container should see only its own process tree:

$ podman run --privileged quay.io/podman/stable podman run alpine ps
Resolved "alpine" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob sha256:31e352740f534f9ad170f75378a84fe453d6156e40700b882d737a8f4a6988a3
Copying config sha256:c1aabb73d2339c5ebaa3681de2e9d9c18d57485045a4e311d9f8004bec208d67
Writing manifest to image destination
Storing signatures
PID USER TIME COMMAND
1 root 0:00 ps

podman info output

host:
  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.6-1.module_el8.8.0+3470+252b1910.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.6, commit: 21c43a36b2ccb9799dfab0e8428837fca920fb45'
  cpuUtilization:
    idlePercent: 98.13
    systemPercent: 0.54
    userPercent: 1.33
  cpus: 7
  distribution:
    distribution: '"almalinux"'
    version: "8.8"
  eventLogger: file
  hostname: lap0832
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.15.90.1-microsoft-standard-WSL2
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 1816068096
  memTotal: 14661337088
  networkBackend: cni
  ociRuntime:
    name: runc
    package: runc-1.1.4-1.module_el8.7.0+3407+95aa0ca9.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.1.4
      spec: 1.0.2-dev
      go: go1.18.9
      libseccomp: 2.5.2
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_SYS_CHROOT,CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-2.module_el8.7.0+3407+95aa0ca9.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 0
  swapTotal: 0
  uptime: 117h 12m 24.00s (Approximately 4.88 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /home/ypcheung/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 0
    stopped: 2
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/ypcheung/.local/share/containers/storage
  graphRootAllocated: 269427478528
  graphRootUsed: 163245617152
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 9
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/ypcheung/.local/share/containers/storage/volumes
version:
  APIVersion: 4.4.1
  Built: 1688126783
  BuiltTime: Fri Jun 30 20:06:23 2023
  GitCommit: ""
  GoVersion: go1.19.10
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.1

Podman in a container

No

Privileged Or Rootless

None

Upstream Latest Release

Yes

Additional environment details

Additional information

This is likely caused by the ~/.config/containers/containers.conf shipped in quay.io/podman/stable, which bind mounts the outer container's /proc into the inner container.

@PangHLIT added the kind/bug label Jul 31, 2023
@giuseppe
Member

giuseppe commented Aug 9, 2023

Nested Podman cannot mount a fresh /proc, so it is forced to bind mount /proc from the outer container.

@PangHLIT
Author

PangHLIT commented Aug 9, 2023

Thanks for that information. The problem is that a process running in the inner container cannot use its own PID to look up its own information in /proc, because the PID namespace does not match. For example:

$ podman run --security-opt label=disable --user podman --device /dev/fuse quay.io/podman/stable podman run alpine sh -c "cat /proc/$$/cmdline"
Resolved "alpine" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob sha256:7264a8db6415046d36d16ba98b79778e18accee6ffa71850405994cffa9be7de
Copying config sha256:7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb
Writing manifest to image destination
cat: can't open '/proc/191593/cmdline': No such file or directory

Could you please consider fixing this?

@giuseppe
Member

giuseppe commented Aug 9, 2023

That looks like a quoting issue. Could you try:

podman run --security-opt label=disable --user podman --device /dev/fuse quay.io/podman/stable podman run alpine sh -c 'cat /proc/$$/cmdline'
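
The difference between the two quoting styles can be reproduced without Podman. A minimal sketch, assuming a POSIX shell; the variable names are illustrative and do not come from the thread:

```shell
#!/bin/sh
# PID of the shell that runs this script (the "caller").
outer_pid=$$

# Double quotes: the caller expands $$ before the child shell starts,
# so the child only echoes a literal number, the caller's PID.
from_double=$(sh -c "echo $$")

# Single quotes: $$ is passed through untouched and expands in the child.
from_single=$(sh -c 'echo $$')

echo "caller PID:           $outer_pid"
echo "double-quoted result: $from_double"
echo "single-quoted result: $from_single"
```

With double quotes the host shell substitutes its own PID before podman ever runs, which is why the earlier attempt looked up a PID such as 191593 that does not exist inside the container.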

@PangHLIT
Author

PangHLIT commented Aug 9, 2023

Indeed there is a quoting issue. Thanks for pointing that out. However, after fixing the quotes, the test shows the outer container's command line (with the executable being podman), not the inner container's (with the executable being sh).

$ podman run --security-opt label=disable --user podman --device /dev/fuse quay.io/podman/stable podman run alpine sh -c 'cat /proc/$$/cmdline'
Resolved "alpine" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob sha256:7264a8db6415046d36d16ba98b79778e18accee6ffa71850405994cffa9be7de
Copying config sha256:7e01a0d0a1dcd9e539f8e9bbd80106d59efbdf97293b3d38f5d7a34501526cdb
Writing manifest to image destination
podmanrunalpinesh-ccat /proc/$$/cmdline
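
The output above looks run together because the arguments in /proc/<pid>/cmdline are separated by NUL bytes rather than spaces. A small sketch for making such output readable, assuming a POSIX shell with tr available; readable_cmdline is a hypothetical helper name, and the sample string only stands in for a real cmdline file:

```shell
#!/bin/sh
# Hypothetical helper: read NUL-separated arguments on stdin and print them
# separated by spaces, followed by a newline.
readable_cmdline() {
    tr '\000' ' '
    echo
}

# Sample data standing in for the contents of /proc/<pid>/cmdline.
sample=$(printf 'podman\000run\000alpine\000sh\000' | readable_cmdline)
echo "$sample"
```

Inside a container the helper could be fed the real file instead, for example readable_cmdline < /proc/$$/cmdline.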

@giuseppe
Member

giuseppe commented Aug 9, 2023

That is indeed an issue. We are bind mounting /proc but still creating a new PID namespace, so the PIDs listed in /proc do not match the PIDs that processes in the inner container see.

The mount for proc is specified in the /home/podman/.config/containers/containers.conf file in the podman image.

I double-checked, and we must create a new PID namespace because we are not using cgroups inside the container, and the runtime requires either a PID namespace or a cgroup (otherwise there is no way to track the processes).
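
One way to observe the mismatch described here is to compare the PID that /proc reports for the current shell with the PID the shell itself gets from the kernel. A rough sketch, assuming Linux and a POSIX shell running the script at top level (not inside a subshell); the variable names are illustrative:

```shell
#!/bin/sh
# PID according to procfs. The redirection is opened by this shell itself,
# so /proc/self refers to the shell, numbered in the PID namespace that the
# /proc mount belongs to. The first field of the stat file is the PID.
read -r proc_pid rest < /proc/self/stat

# PID according to the shell's own PID namespace.
own_pid=$$

if [ "$proc_pid" = "$own_pid" ]; then
    proc_state="match"
else
    proc_state="mismatch"
fi
echo "procfs PID=$proc_pid, own PID=$own_pid: $proc_state"
```

When /proc matches the PID namespace, the two numbers are equal. In the nested setup from this issue they would be expected to differ, since the bind-mounted /proc numbers processes in the outer container's PID namespace.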

I am not sure we can come up with a better default, since mounting a fresh proc requires a fully visible /proc in the outer container (you can get it with --security-opt=unmask=/proc/*).

If you need to access proc in the nested container, you could create a variant of the podman image that doesn't bind mount /proc:

FROM quay.io/podman/stable
RUN rm /home/podman/.config/containers/containers.conf

and then run the outer container with a fully visible /proc:

$ podman run --security-opt=unmask=/proc/* --security-opt label=disable --user podman podman-local-image podman run alpine sh -c 'cat /proc/$$/cmdline'
cat/proc/1/cmdline

@giuseppe
Member

giuseppe commented Aug 9, 2023

I've found an easier way that doesn't require a new image:

$ podman run --env CONTAINERS_CONF=/etc/containers/containers.conf --security-opt=unmask=/proc/* --security-opt label=disable --user podman quay.io/podman/stable podman run alpine sh -c 'cat /proc/$$/cmdline'
cat/proc/1/cmdline

@PangHLIT
Author

The workaround works well. Thanks for that and the detailed explanation!

If there is no better default that the official image could use, then perhaps the documentation (the podman image or rootless podman?) could be updated to mention this use case. Hopefully, the information could help save other people time in troubleshooting.

@krisdevopsbot

> I've found an easier way that doesn't require a new image:
>
> $ podman run --env CONTAINERS_CONF=/etc/containers/containers.conf --security-opt=unmask=/proc/* --security-opt label=disable --user podman quay.io/podman/stable podman run alpine sh -c 'cat /proc/$$/cmdline'
> cat/proc/1/cmdline

Is there a way to pass this security-opt when using podman play kube?

@giuseppe
Member

giuseppe commented Sep 6, 2023

I am closing this issue since there is no better way for Podman to solve it. It requires a custom configuration for the outer container; otherwise nested Podman cannot mount a fresh /proc file system.

> Is there a way to pass this security-opt when using podman play kube?

I think there is no way at the moment, but this should be addressed using:

    securityContext:
      procMount: "Unmasked"

Could you please file a new RFE issue requesting this feature?

@giuseppe closed this as completed Sep 6, 2023
@github-actions bot added the "locked - please file new issue/PR" label Dec 6, 2023
@github-actions bot locked as resolved and limited conversation to collaborators Dec 6, 2023