
Since kernel 6.1.28-rpi4, can't start pods or containers in rootless #18696

Closed
Elrondo46 opened this issue May 26, 2023 · 13 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.
locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@Elrondo46

Issue Description

Since kernel 6.1.28-rpi4, I can't start pods or containers rootless (Arch Linux ARM). Tested on two RPi 4 devices. Downgrading to 6.1.27 solves the problem.

Podman Version

Client: Podman Engine
Version: 4.5.0
API Version: 4.5.0
Go Version: go1.20.4
Git Commit: 75e3c12-dirty
Built: Sun May 14 22:57:05 2023
OS/Arch: linux/arm64

Kernel

Linux 6.1.29-2-rpi-ARCH #1 SMP PREEMPT Thu May 25 05:17:11 MDT 2023 aarch64 GNU/Linux

Steps to reproduce the issue

Upgrade to kernel 6.1.28; rootless pods/containers will then fail to start.

Describe the results you received

This is the error:

Error: starting container 4d5fbe664ff0bf716556b20c774610e5eb058d2ff808a4265e65979f38502fac: a dependency of container 4d5fbe664ff0bf716556b20c774610e5eb058d2ff808a4265e65979f38502fac failed to start: container state improper
Error: starting container 5637c53445934b36bd829a43380762f0d29a01ebb0d93702e424c6f7661d2377: crun: setrlimit RLIMIT_NPROC: Operation not permitted: OCI permission denied
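
For context on the second error: setrlimit semantics mean an unprivileged process may lower RLIMIT_NPROC but may not raise its hard limit without CAP_SYS_RESOURCE. A minimal Go sketch (illustrative only, not podman or crun code) that reproduces the same EPERM:

package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	// Read the current NPROC limits for this process.
	var lim unix.Rlimit
	if err := unix.Getrlimit(unix.RLIMIT_NPROC, &lim); err != nil {
		panic(err)
	}
	fmt.Printf("current RLIMIT_NPROC: soft=%d hard=%d\n", lim.Cur, lim.Max)

	// Ask for a hard limit above the current one (assuming it is finite),
	// as a container spec recorded under an older kernel might. Without
	// CAP_SYS_RESOURCE this fails with "operation not permitted".
	raised := unix.Rlimit{Cur: lim.Max + 1, Max: lim.Max + 1}
	if err := unix.Setrlimit(unix.RLIMIT_NPROC, &raised); err != nil {
		fmt.Println("setrlimit RLIMIT_NPROC:", err)
	}
}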

Describe the results you expected

I just want my pods to function.

podman info output

Client:       Podman Engine
Version:      4.5.0
API Version:  4.5.0
Go Version:   go1.20.4
Git Commit:   75e3c12579d391b81d871fd1cded6cf0d043550a-dirty
Built:        Sun May 14 22:57:05 2023
OS/Arch:      linux/arm64

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Additional information

@Elrondo46 Elrondo46 added the kind/bug label May 26, 2023
@Luap99
Member

Luap99 commented May 26, 2023

Please provide steps to reproduce this: how do you create/run the containers and pods?

@Elrondo46
Author

I create pods, and containers inside those pods, with Ansible playbooks. This method is fully functional on all GNU/Linux-based systems, including WSL. The real user is named podman. I have no problem on Arch x86 or other distros; the problem only appears on the RPi 4 with the arm64 kernel since 6.1.28.

@Luap99
Member

Luap99 commented May 26, 2023

Why do you think this is a podman bug? It sounds more like a kernel regression.
In order to debug this, you have to provide actual podman commands; without a reproducer there is nothing we can do. Did you try to recreate the containers/pods on the newer kernel?

@Elrondo46
Author

Playbook example (all secrets are censored):

- name: Create internal podman network for davical
  containers.podman.podman_network:
    name: davical_network
    recreate: true
    ip_range: 10.1.1.8/25
    subnet: 10.1.1.0/24
    gateway: 10.1.1.1
  become: false

- name: Install Davical Pod
  become: false
  podman_pod:
    name: davical
    recreate: true
    network: davical_network
    ip: 10.1.1.10
    ports:
      - "127.0.0.1:4564:80"
      - "127.0.0.1:5432:5432"

- name: Create davical db volume
  become: false
  podman_volume:
    state: present
    name: davical-standalone_pgsql_data


- name: Create davical app volume
  become: false
  podman_volume:
    state: present
    name: davical-standalone-app-1


- name: Install davical db container
  become: false
  containers.podman.podman_container:
    name: davical-standalone-db-1
    image: docker.io/postgres:13-alpine
    env:
      POSTGRES_PASSWORD: CENSORED
    state: present
    recreate: yes
    pod: davical
    volumes:
      - "davical-standalone_pgsql_data:/var/lib/postgresql/data"

- name: Install davical app container
  become: false
  containers.podman.podman_container:
    name: davical-standalone-app-1
    image: docker.io/tuxnvape/davical-standalone:82-1.1.11
    env:
      HOST_NAME: CENSORED
      PGSQL_ROOT_PASS: CENSORED
      PASSDAVDB: CENSORED
      DBHOST: davical-standalone-db-1
      ADMINDAVICALPASS: CENSORED
      LANG: fr_FR.UTF-8
      LC_ALL: fr_FR.UTF-8
      DAVICAL_LANG: fr_FR
    pod: davical
    state: present
    recreate: yes
    volumes:
      - "davical-standalone_davical_config:/config"
      - "davical-standalone_pgsql_data:/var/lib/postgresql"

- name: Start davical Pod
  ansible.builtin.command:
    cmd: podman pod start davical

@Elrondo46
Author

Why do you think this is a podman bug? It sounds more like a kernel regression. In order to debug this, you have to provide actual podman commands; without a reproducer there is nothing we can do. Did you try to recreate the containers/pods on the newer kernel?

That's why I filed a bug here:

raspberrypi/linux#5483

They said this bug is up to the podman team.

@Elrondo46
Author

After recreating the pod, it is now fully functional. BUT WHY???

@pelwell

pelwell commented May 26, 2023

They said this bug is up to the podman team.

FYI, I said "The podman devs are still much more likely to know what is going wrong (especially when you've provided so little information)".

@Elrondo46
Author

Anyway, for me the bug is solved. I don't know exactly whether it's originally a podman bug or a podman bug triggered by recent kernels.

Sorry, but you had all the information for a simple permission denied in the RLIMIT config.

The bug is solved by recreating the pods. That's not a good option, I think, especially in a prod environment with lots of pods, but it's sufficient for me.

@Luap99
Member

Luap99 commented May 26, 2023

My best guess is that the NPROC limit changed between the two kernel versions; you can check with ulimit -u.
AFAIK podman only records the NPROC limit on container creation and then sets this limit in the OCI runtime spec. When your limit was changed, crun could no longer apply the limit correctly.

I think it is the same underlying issue as #18555
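
To make "records the limit on container creation" concrete, here is a sketch using the real OCI runtime-spec Go types; the surrounding podman plumbing is omitted and this is not podman's actual code:

package main

import (
	"fmt"

	specs "github.com/opencontainers/runtime-spec/specs-go"
	"golang.org/x/sys/unix"
)

func main() {
	// Snapshot the host limit once, at creation time...
	var lim unix.Rlimit
	if err := unix.Getrlimit(unix.RLIMIT_NPROC, &lim); err != nil {
		panic(err)
	}
	// ...and freeze it into the spec that crun re-applies on every start.
	frozen := specs.POSIXRlimit{Type: "RLIMIT_NPROC", Soft: lim.Cur, Hard: lim.Max}
	fmt.Printf("process.rlimits entry stored in the spec: %+v\n", frozen)
	// If the host's limit drops after creation (e.g. a kernel or systemd
	// change), the stored value exceeds the new hard limit and crun's
	// setrlimit fails with EPERM.
}

That creation-time snapshot would also explain the "recreating the pod fixed it, BUT WHY" above: recreation simply takes a fresh snapshot under the running kernel.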

@Elrondo46
Author

Elrondo46 commented May 26, 2023

The ulimit -u value has not changed between kernels, and did not change after recreating the pods...

[podman@alarmpi ~]$ ulimit -u
5292
[podman@alarmpi ~]$ cat /proc/sys/kernel/threads-max
10585

I don't think this is the same as #18555.

@ErrorNoInternet

I noticed the limit changes occasionally on my Fedora install. Every reboot it seems to have a chance of changing.

From another comment I made:

Could it be due to Linux kernel defaults changing?

I'd suspect the distro or systemd changing the default value.

I have 3 containers, 2 of which are working properly (63410). But the oldest one I made (about 3 weeks ago) had a limit of 63412, which is 1 higher than my system's limit, 63411.

I ran sudo grep -R "Max processes" /var/spool/abrt and noticed the values changed a lot. For me it started at 63404 on April 20 (when I first installed Fedora 38), then changed to 63411, then 63412, then 62310, and about 10 days ago it became 63411.

@Luap99
Member

Luap99 commented May 26, 2023

Yeah, I think it is clear now that we cannot treat NPROC as static. Thus we should not set the limit at container creation but instead set it on each start, so it is always based on the current value.
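
A minimal sketch of that direction (an assumption about the shape of the fix, not the actual #18721 patch): resolve the limit each time the container starts instead of reusing the creation-time snapshot.

package main

import (
	"fmt"

	specs "github.com/opencontainers/runtime-spec/specs-go"
	"golang.org/x/sys/unix"
)

// nprocRlimitAtStart is a hypothetical helper: it builds the rlimit entry
// from the live value at start time rather than a cached one.
func nprocRlimitAtStart() (specs.POSIXRlimit, error) {
	var lim unix.Rlimit
	if err := unix.Getrlimit(unix.RLIMIT_NPROC, &lim); err != nil {
		return specs.POSIXRlimit{}, err
	}
	return specs.POSIXRlimit{Type: "RLIMIT_NPROC", Soft: lim.Cur, Hard: lim.Max}, nil
}

func main() {
	rl, err := nprocRlimitAtStart()
	if err != nil {
		panic(err)
	}
	// The value crun applies now always matches the kernel it runs under.
	fmt.Printf("rlimit resolved at start: %+v\n", rl)
}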

@Luap99
Member

Luap99 commented Jun 1, 2023

Fixed in #18721, so it will be in the next podman version.
