Long running container becomes unusable/unable to start again: setrlimit 'RLIMIT_NPROC': Operation not permitted: OCI permission denied #19059

Closed
gbraad opened this issue Jun 30, 2023 · 7 comments

@gbraad
Member

gbraad commented Jun 30, 2023

Note: this especially affects people who use Distrobox to run alternative environments


I have a container, called devsys, that I often use for development. I use podman exec -it devsys su - gbraad to enter it for normal use. It runs with --systemd=always, so it is expected to be long-running.

From time to time the host might reboot, or the container gets restarted. And once in a while I get the following:

$ podman start devsys
Error: unable to start container "adf2dec1dab9542ac14676dede38c2326e9e3a122c9f814b2ce3fe3597418931": crun: setrlimit `RLIMIT_NPROC`: Operation not permitted: OCI permission denied

This has happened to me on Fedora 38, on podman machine (WSL2), on Alma 9, etc., so it is not an isolated case.

The container gets started with:

$ podman run -d --name=devsys --hostname $HOSTNAME-devsys --systemd=always --cap-add=NET_ADMIN --cap-add=NET_RAW --device=/dev/net/tun -v $HOME/Projects:/home/${USER}/Projects ghcr.io/gbraad-devenv/fedora/systemd:${FEDORA_VERSION}

Sometimes the container's lifecycle on the host is managed with podman generate systemd; otherwise it is a simple podman start devsys, as mentioned before.

What can cause this issue?

$ podman version
Client:       Podman Engine
Version:      4.5.1
API Version:  4.5.1
Go Version:   go1.20.4
Built:        Sat May 27 01:58:48 2023
OS/Arch:      linux/amd64
$ podman info
host:
  arch: amd64
  buildahVersion: 1.30.0
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.7-2.fc38.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 99.31
    systemPercent: 0.39
    userPercent: 0.3
  cpus: 2
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: container
    version: "38"
  eventLogger: journald
  hostname: wint14-podman
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 5.15.90.1-microsoft-standard-WSL2
  linkmode: dynamic
  logDriver: journald
  memFree: 15744282624
  memTotal: 16646328320
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8.5-1.fc38.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.5
      commit: b6f80f766c9a89eb7b1440c0a70ab287434b17ed
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-12.fc38.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 4294967296
  swapTotal: 4294967296
  uptime: 20h 26m 39.00s (Approximately 0.83 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /home/user/.config/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 1
    stopped: 2
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/user/.local/share/containers/storage
  graphRootAllocated: 1081101176832
  graphRootUsed: 7375065088
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 6
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/user/.local/share/containers/storage/volumes
version:
  APIVersion: 4.5.1
  Built: 1685123928
  BuiltTime: Sat May 27 01:58:48 2023
  GitCommit: ""
  GoVersion: go1.20.4
  Os: linux
  OsArch: linux/amd64
  Version: 4.5.1
@gbraad gbraad changed the title Long running container becomes unusable/unable to start again Long running container becomes unusable/unable to start again: setrlimit RLIMIT_NPROC: Operation not permitted: OCI permission denied Jun 30, 2023
@gbraad gbraad changed the title Long running container becomes unusable/unable to start again: setrlimit RLIMIT_NPROC: Operation not permitted: OCI permission denied Long running container becomes unusable/unable to start again: setrlimit 'RLIMIT_NPROC': Operation not permitted: OCI permission denied Jun 30, 2023
@vrothberg
Member

Thanks for filing the issue, @gbraad!

@giuseppe can you take a look?

@gbraad
Member Author

gbraad commented Jun 30, 2023

$ ulimit -a
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         0
-m: resident set size (kbytes)      unlimited
-u: processes                       63475
-n: file descriptors                1024
-l: locked-in-memory size (kbytes)  65536
-v: address space (kbytes)          unlimited
-x: file locks                      unlimited
-i: pending signals                 63475
-q: bytes in POSIX msg queues       819200
-e: max nice                        0
-r: max rt priority                 0
-N 15: rt cpu time (microseconds)   unlimited
$ podman inspect devsys
...
               "Ulimits": [
                    {
                         "Name": "RLIMIT_NOFILE",
                         "Soft": 1048576,
                         "Hard": 1048576
                    },
                    {
                         "Name": "RLIMIT_NPROC",
                         "Soft": 63475,
                         "Hard": 63475
                    }
               ],
...
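
One way to check whether a container is affected is to compare the limits stored in its configuration with the host's current hard limits. The following is only a minimal sketch: `stale_ulimits`, `current_host_hard_limits` and `ISSUE_ULIMITS` are hypothetical names, the data is copied from the podman inspect output above, and treating -1 as "unlimited" is an assumption.

```python
import math
import resource

# Ulimits exactly as they appear in the `podman inspect devsys` output above.
ISSUE_ULIMITS = [
    {"Name": "RLIMIT_NOFILE", "Soft": 1048576, "Hard": 1048576},
    {"Name": "RLIMIT_NPROC", "Soft": 63475, "Hard": 63475},
]


def _norm(value):
    """Map "unlimited" markers to infinity so values compare numerically (assumption: -1 means unlimited)."""
    return math.inf if value in (-1, resource.RLIM_INFINITY) else value


def stale_ulimits(ulimits, host_hard_limits):
    """Return names of stored limits whose hard value exceeds the host's hard limit."""
    stale = []
    for entry in ulimits:
        name = entry["Name"]
        if name not in host_hard_limits:
            continue  # nothing to compare against
        if _norm(entry["Hard"]) > _norm(host_hard_limits[name]):
            stale.append(name)
    return stale


def current_host_hard_limits():
    """Read the calling process's hard limits for the two resources shown above."""
    return {
        "RLIMIT_NOFILE": resource.getrlimit(resource.RLIMIT_NOFILE)[1],
        "RLIMIT_NPROC": resource.getrlimit(resource.RLIMIT_NPROC)[1],
    }
```

Calling `stale_ulimits(ISSUE_ULIMITS, current_host_hard_limits())` as the same user that runs podman should list any stored limit that is higher than what the host currently allows. An empty list would suggest the stored limits are not the problem.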

@Luap99
Member

Luap99 commented Jun 30, 2023

This is already fixed in main, #18721
Duplicate of #18714

@Luap99 Luap99 closed this as not planned (duplicate) Jun 30, 2023
@gbraad
Member Author

gbraad commented Jun 30, 2023

So what can be done to run the current container without having to wait for a new drop of Podman?

@Luap99
Member

Luap99 commented Jun 30, 2023

Recreate the container; that will be required regardless of whether you use the new version. Only newly created containers are able to use the current limits on start. Old ones will continue to use the incorrect limits because those are stored as part of the container's spec.
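
To illustrate why a limit frozen into the spec can later fail to apply: per setrlimit(2), an unprivileged process may not raise a hard limit above its current value, and the attempt fails with EPERM. The sketch below only models that rule. It is not crun's or Podman's code, `setrlimit_errno` is a hypothetical helper, and the lower host limit of 60000 is an invented value for the example.

```python
import errno


def setrlimit_errno(soft, hard, current_hard, privileged=False):
    """Model of the setrlimit(2) checks; returns 0 on success, otherwise an errno value."""
    if soft > hard:
        return errno.EINVAL  # soft limit may never exceed the hard limit
    if hard > current_hard and not privileged:
        return errno.EPERM  # raising the hard limit needs CAP_SYS_RESOURCE
    return 0


# Limits recorded when the container was created (values from this thread).
stored = {"soft": 63475, "hard": 63475}

# Host hard limit unchanged since creation: the stored limits can be applied.
same = setrlimit_errno(stored["soft"], stored["hard"], current_hard=63475)

# Hypothetical lower host limit after a reboot: applying the stored limits fails.
lower = setrlimit_errno(stored["soft"], stored["hard"], current_hard=60000)

print(same, errno.errorcode[lower])  # prints "0 EPERM"
```

Under this model, the only ways out are to change the stored limits or to raise the host limit again. Recreating the container does the former.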

@gbraad
Member Author

gbraad commented Jun 30, 2023

#18696 (comment)

"Every reboot it seems to have a chance of changing."

"Recreate the container"

So I will have to export the container and import it again to preserve the state/data? So there is no way to clear the ulimits that have been set... Why set them in the first place? (I would expect there to be a way to clear them so that the container can run.)

@emmetoneillpdx

"So I will have to export the container and import it again to preserve the state/data?"

That worked for me. I tried both podman export / podman import as well as distrobox create --clone OLD_CONTAINER_NAME. Both worked just fine and now my containers are up and running again!
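
For reference, the export/import round trip could be scripted roughly as below. This is a sketch under several assumptions: `recreate_commands` is a hypothetical helper that only builds the command lines, and the names devsys2, localhost/devsys-rescue and /tmp/devsys.tar are made up. Because podman import produces an image from a plain filesystem archive, the original image's command is not carried over, so /sbin/init is passed explicitly here as a guess at what a systemd container needs.

```python
def recreate_commands(old_name, image_ref, tarball, run_flags, command):
    """Build (but do not run) the podman commands for an export/import round trip.

    old_name  -- existing container whose filesystem should be preserved
    image_ref -- tag to give the image created by `podman import` (hypothetical)
    tarball   -- scratch path for the exported filesystem
    run_flags -- flags for the replacement container, e.g. the ones used originally
    command   -- command for the new container; the imported image carries no default
    """
    return [
        ["podman", "export", "-o", tarball, old_name],
        ["podman", "import", tarball, image_ref],
        ["podman", "run", "-d", *run_flags, image_ref, *command],
    ]


cmds = recreate_commands(
    old_name="devsys",
    image_ref="localhost/devsys-rescue",   # hypothetical tag
    tarball="/tmp/devsys.tar",             # hypothetical path
    run_flags=["--name=devsys2", "--systemd=always"],
    command=["/sbin/init"],                # assumption for a systemd container
)
for cmd in cmds:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right. The replacement container gets a different name so the old one can be kept until the new one is verified.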

@github-actions github-actions bot added the "locked - please file new issue/PR" label Sep 29, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 29, 2023