
Dangling file in /var/lib/cni/networks/podman prevents container from starting (Podman 3.0) #9465

Closed
skitoxe opened this issue Feb 22, 2021 · 23 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
locked - please file new issue/PR: Assist humans wanting to comment on an old issue or PR with locked comments.
stale-issue

Comments

skitoxe commented Feb 22, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description
When running podman stop <container> && sleep 5 && podman start <container>, or just a plain stop and start, I occasionally end up with an orphaned file in /var/lib/cni/networks/podman/.
This is exactly the behavior described in issue #3759, which is now closed, but later comments there indicate it is still a problem, and I can confirm it still occurs in 3.0. I have to delete the file manually to get the container to start again. The error message received is:

ERRO[0000] Error adding network: failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed
ERRO[0000] Error while adding pod to CNI network "podman_default": failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed
Error: unable to start container "5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe": error configuring network namespace for container 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe: failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed
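
For reference, the manual cleanup I do looks roughly like this (a sketch, assuming the default host-local IPAM data directory; host-local keeps one file per allocated IP, named after the IP, whose contents are the ID of the container holding the lease):

# Inspect the leaked reservations for the network:
ls /var/lib/cni/networks/podman_default/
# The file for the stuck IP contains the container ID from the error above:
cat /var/lib/cni/networks/podman_default/10.89.0.64
# Removing it releases the stale lease so the container can start again:
rm /var/lib/cni/networks/podman_default/10.89.0.64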

Steps to reproduce the issue:

  1. podman stop <container> && sleep 5 && podman start <container>

Describe the results you received:
The container does not start, and the following error appears when trying to start it:

ERRO[0000] Error adding network: failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed
ERRO[0000] Error while adding pod to CNI network "podman_default": failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed
Error: unable to start container "5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe": error configuring network namespace for container 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe: failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed

Describe the results you expected:
Container starts

Additional information you deem important (e.g. issue happens only occasionally):
It seems to be more common when the commands are run from a crontab, but that might be my imagination.
Output of podman version:

Version:      3.0.0
API Version:  3.0.0
Go Version:   go1.15.7
Built:        Fri Feb 12 00:12:57 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.19.2
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.26-1.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.26, commit: 777074ecdb5e883b9bec233f3630c5e7fa37d521'
  cpus: 40
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: xxxx
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.10.15-200.fc33.x86_64
  linkmode: dynamic
  memFree: 190546157568
  memTotal: 201951862784
  ociRuntime:
    name: crun
    package: crun-0.17-1.fc33.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.17
      commit: 0e9229ae34caaebcb86f1fde18de3acaf18c6d9a
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 4294963200
  swapTotal: 4294963200
  uptime: 92h 34m 59.04s (Approximately 3.83 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 3
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageStore:
    number: 3
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.0.0
  Built: 1613085177
  BuiltTime: Fri Feb 12 00:12:57 2021
  GitCommit: ""
  GoVersion: go1.15.7
  OsArch: linux/amd64
  Version: 3.0.0

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.0.0-1.fc33.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):
Bare-metal server with dual Xeon CPUs and 192 GB RAM.

I also have this in my /usr/lib/tmpfiles.d/podman.conf:

# /tmp/podman-run-* directory can contain content for Podman containers that have run
# for many days. This following line prevents systemd from removing this content.
x /tmp/podman-run-*
D! /run/podman 0700 root root
D! /var/lib/cni/networks
openshift-ci-robot added the kind/bug label Feb 22, 2021
Luap99 (Member) commented Feb 22, 2021

I assume you have a static IP defined for your container.
Can you add --log-level debug to the podman stop command? Maybe that will give some insight into what is going on.
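
For example (with <container_name> as a placeholder, as in the log below):

podman stop <container_name> --log-level debug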

skitoxe (Author) commented Feb 22, 2021

No, I do not have a static IP configured for the container.

INFO[0000] podman filtering at log level debug
DEBU[0000] Called stop.PersistentPreRunE(podman stop <container_name> --log-level debug)
DEBU[0000] Reading configuration file "/usr/share/containers/containers.conf"
DEBU[0000] Merged system config "/usr/share/containers/containers.conf": &{Containers:{Devices:[] Volumes:[] ApparmorProfile:containers-default-0.33.4 Annotations:[] CgroupNS:private Cgroups:enabled DefaultCapabilities:[CHOWN DAC_OVERRIDE FOWNER FSETID KILL NET_BIND_SERVICE SETFCAP SETGID SETPCAP SETUID SYS_CHROOT] DefaultSysctls:[net.ipv4.ping_group_range=0 0] DefaultUlimits:[nproc=4194304:4194304] DefaultMountsFile: DNSServers:[] DNSOptions:[] DNSSearches:[] EnableKeyring:true EnableLabeling:true Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TERM=xterm] EnvHost:false HTTPProxy:true Init:false InitPath: IPCNS:private LogDriver:k8s-file LogSizeMax:-1 NetNS:bridge NoHosts:false PidsLimit:2048 PidNS:private SeccompProfile:/usr/share/containers/seccomp.json ShmSize:65536k TZ: Umask:0022 UTSNS:private UserNS:host UserNSSize:65536} Engine:{ImageBuildFormat:oci CgroupCheck:false CgroupManager:systemd ConmonEnvVars:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] ConmonPath:[/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon] DetachKeys:ctrl-p,ctrl-q EnablePortReservation:true Env:[] EventsLogFilePath:/run/libpod/events/events.log EventsLogger:journald HooksDir:[/usr/share/containers/oci/hooks.d] ImageDefaultTransport:docker:// InfraCommand: InfraImage:k8s.gcr.io/pause:3.2 InitPath:/usr/libexec/podman/catatonit LockType:shm MultiImageArchive:false Namespace: NetworkCmdPath: NetworkCmdOptions:[] NoPivotRoot:false NumLocks:2048 OCIRuntime:crun OCIRuntimes:map[crun:[/usr/bin/crun /usr/sbin/crun /usr/local/bin/crun /usr/local/sbin/crun /sbin/crun /bin/crun /run/current-system/sw/bin/crun] kata:[/usr/bin/kata-runtime /usr/sbin/kata-runtime /usr/local/bin/kata-runtime /usr/local/sbin/kata-runtime /sbin/kata-runtime /bin/kata-runtime /usr/bin/kata-qemu /usr/bin/kata-fc] runc:[/usr/bin/runc /usr/sbin/runc /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc /run/current-system/sw/bin/runc]] PullPolicy:missing Remote:false RemoteURI: RemoteIdentity: ActiveService: ServiceDestinations:map[] RuntimePath:[] RuntimeSupportsJSON:[crun runc] RuntimeSupportsNoCgroups:[crun] RuntimeSupportsKVM:[kata kata-runtime kata-qemu kata-fc] SetOptions:{StorageConfigRunRootSet:false StorageConfigGraphRootSet:false StorageConfigGraphDriverNameSet:false StaticDirSet:false VolumePathSet:false TmpDirSet:false} SignaturePolicyPath:/etc/containers/policy.json SDNotify:false StateType:3 StaticDir:/var/lib/containers/storage/libpod StopTimeout:10 TmpDir:/run/libpod VolumePath:/var/lib/containers/storage/volumes VolumePlugins:map[]} Network:{CNIPluginDirs:[/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin] DefaultNetwork:podman NetworkConfigDir:/etc/cni/net.d/}}
DEBU[0000] Using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /var/lib/containers/storage
DEBU[0000] Using run root /run/containers/storage
DEBU[0000] Using static dir /var/lib/containers/storage/libpod
DEBU[0000] Using tmp dir /run/libpod
DEBU[0000] Using volume path /var/lib/containers/storage/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] cached value indicated that overlay is supported
DEBU[0000] cached value indicated that metacopy is being used
DEBU[0000] cached value indicated that native-diff is not being used
INFO[0000] Not using native diff for overlay, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled
DEBU[0000] backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=true
DEBU[0000] Initializing event backend journald
DEBU[0000] using runtime "/usr/bin/crun"
DEBU[0000] using runtime "/usr/bin/runc"
INFO[0000] Error initializing configured OCI runtime kata: no valid executable found for OCI runtime kata: invalid argument
INFO[0000] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman-bridge.conflist
INFO[0000] Found CNI network podman_default (type=bridge) at /etc/cni/net.d/podman_default.conflist
DEBU[0000] Default CNI network name podman is unchangeable
INFO[0000] Setting parallel job count to 121
DEBU[0000] Starting parallel job on container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
DEBU[0000] Stopping ctr 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 (timeout 0)
DEBU[0000] Stopping container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 (PID 1734021)
DEBU[0000] Sending signal 9 to container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
DEBU[0000] Cleaning up container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
DEBU[0000] Tearing down network namespace at /run/netns/cni-91364a4a-2782-b19a-3eda-41f64e471856 for container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
INFO[0000] Got pod network &{Name:<container_name> Namespace:<container_name> ID:3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 NetNS:/run/netns/cni-91364a4a-2782-b19a-3eda-41f64e471856 Networks:[{Name:podman_default Ifname:}] RuntimeConfig:map[podman_default:{IP: MAC: PortMappings:[{HostPort:25565 ContainerPort:25565 Protocol:tcp HostIP:}] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}
INFO[0000] About to del CNI network podman_default (type=bridge)
DEBU[0000] Successfully cleaned up container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
DEBU[0000] unmounted container "3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7"
3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
DEBU[0000] Called stop.PersistentPostRunE(podman stop <container_name> --log-level debug)

The versions of some other relevant components are as follows:

Package crun-0.17-1.fc33.x86_64 is already installed.
Package runc-2:1.0.0-279.dev.gitdedadbf.fc33.x86_64 is already installed.
Package fuse-overlayfs-1.4.0-1.fc33.x86_64 is already installed.

Could this be caused by the container being launched by one user via sudo and then stopped and re-launched by root's crontab?

mheon (Member) commented Feb 22, 2021

Can you check if podman restart resolves the issue? It could be that the cleanup process isn't firing, or isn't finishing - restart does not have that issue, and would narrow things down if it works.

skitoxe (Author) commented Feb 22, 2021

Funny you should ask. The reason I went with stop and start in the first place is that restart is so fast that it doesn't leave time for the port binding to be released. That is the original issue that led me to using stop && wait && start.
I get the failure below on restart, every single time, and I can immediately issue another restart or start and the container will come back up.

INFO[0000] podman filtering at log level debug
DEBU[0000] Called restart.PersistentPreRunE(podman restart <container_name> --log-level debug)
DEBU[0000] Reading configuration file "/usr/share/containers/containers.conf"
DEBU[0000] Merged system config "/usr/share/containers/containers.conf": &{Containers:{Devices:[] Volumes:[] ApparmorProfile:containers-default-0.33.4 Annotations:[] CgroupNS:private Cgroups:enabled DefaultCapabilities:[CHOWN DAC_OVERRIDE FOWNER FSETID KILL NET_BIND_SERVICE SETFCAP SETGID SETPCAP SETUID SYS_CHROOT] DefaultSysctls:[net.ipv4.ping_group_range=0 0] DefaultUlimits:[nproc=4194304:4194304] DefaultMountsFile: DNSServers:[] DNSOptions:[] DNSSearches:[] EnableKeyring:true EnableLabeling:true Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin TERM=xterm] EnvHost:false HTTPProxy:true Init:false InitPath: IPCNS:private LogDriver:k8s-file LogSizeMax:-1 NetNS:bridge NoHosts:false PidsLimit:2048 PidNS:private SeccompProfile:/usr/share/containers/seccomp.json ShmSize:65536k TZ: Umask:0022 UTSNS:private UserNS:host UserNSSize:65536} Engine:{ImageBuildFormat:oci CgroupCheck:false CgroupManager:systemd ConmonEnvVars:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] ConmonPath:[/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon] DetachKeys:ctrl-p,ctrl-q EnablePortReservation:true Env:[] EventsLogFilePath:/run/libpod/events/events.log EventsLogger:journald HooksDir:[/usr/share/containers/oci/hooks.d] ImageDefaultTransport:docker:// InfraCommand: InfraImage:k8s.gcr.io/pause:3.2 InitPath:/usr/libexec/podman/catatonit LockType:shm MultiImageArchive:false Namespace: NetworkCmdPath: NetworkCmdOptions:[] NoPivotRoot:false NumLocks:2048 OCIRuntime:crun OCIRuntimes:map[crun:[/usr/bin/crun /usr/sbin/crun /usr/local/bin/crun /usr/local/sbin/crun /sbin/crun /bin/crun /run/current-system/sw/bin/crun] kata:[/usr/bin/kata-runtime /usr/sbin/kata-runtime /usr/local/bin/kata-runtime /usr/local/sbin/kata-runtime /sbin/kata-runtime /bin/kata-runtime /usr/bin/kata-qemu /usr/bin/kata-fc] runc:[/usr/bin/runc /usr/sbin/runc /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc /run/current-system/sw/bin/runc]] PullPolicy:missing Remote:false RemoteURI: RemoteIdentity: ActiveService: ServiceDestinations:map[] RuntimePath:[] RuntimeSupportsJSON:[crun runc] RuntimeSupportsNoCgroups:[crun] RuntimeSupportsKVM:[kata kata-runtime kata-qemu kata-fc] SetOptions:{StorageConfigRunRootSet:false StorageConfigGraphRootSet:false StorageConfigGraphDriverNameSet:false StaticDirSet:false VolumePathSet:false TmpDirSet:false} SignaturePolicyPath:/etc/containers/policy.json SDNotify:false StateType:3 StaticDir:/var/lib/containers/storage/libpod StopTimeout:10 TmpDir:/run/libpod VolumePath:/var/lib/containers/storage/volumes VolumePlugins:map[]} Network:{CNIPluginDirs:[/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin] DefaultNetwork:podman NetworkConfigDir:/etc/cni/net.d/}}
DEBU[0000] Using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver overlay
DEBU[0000] Using graph root /var/lib/containers/storage
DEBU[0000] Using run root /run/containers/storage
DEBU[0000] Using static dir /var/lib/containers/storage/libpod
DEBU[0000] Using tmp dir /run/libpod
DEBU[0000] Using volume path /var/lib/containers/storage/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] [graphdriver] trying provided driver "overlay"
DEBU[0000] cached value indicated that overlay is supported
DEBU[0000] cached value indicated that metacopy is being used
DEBU[0000] cached value indicated that native-diff is not being used
INFO[0000] Not using native diff for overlay, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled
DEBU[0000] backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=false, usingMetacopy=true
DEBU[0000] Initializing event backend journald
DEBU[0000] using runtime "/usr/bin/crun"
DEBU[0000] using runtime "/usr/bin/runc"
INFO[0000] Error initializing configured OCI runtime kata: no valid executable found for OCI runtime kata: invalid argument
INFO[0000] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman-bridge.conflist
INFO[0000] Found CNI network podman_default (type=bridge) at /etc/cni/net.d/podman_default.conflist
DEBU[0000] Default CNI network name podman is unchangeable
INFO[0000] Setting parallel job count to 121
DEBU[0000] Stopping ctr 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 (timeout 0)
DEBU[0000] Stopping container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 (PID 1768486)
DEBU[0000] Sending signal 9 to container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
DEBU[0000] Tearing down network namespace at /run/netns/cni-17639dc8-c383-44bf-f6ed-6a3fa5ee7a20 for container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
INFO[0000] Got pod network &{Name:<container_name> Namespace:<container_name> ID:3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 NetNS:/run/netns/cni-17639dc8-c383-44bf-f6ed-6a3fa5ee7a20 Networks:[{Name:podman_default Ifname:}] RuntimeConfig:map[podman_default:{IP: MAC: PortMappings:[{HostPort:25565 ContainerPort:25565 Protocol:tcp HostIP:}] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}
INFO[0000] About to del CNI network podman_default (type=bridge)
DEBU[0000] Created root filesystem for container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 at /var/lib/containers/storage/overlay/203c50b0d3fb2bfc01cef7400ba5d4fd8394059a959cbe7b64fd3ec112733ea3/merged
DEBU[0000] Made network namespace at /run/netns/cni-5e42f9e8-8a36-c4ac-5d20-e1cce9d5f3fd for container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
INFO[0000] Got pod network &{Name:<container_name> Namespace:<container_name> ID:3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 NetNS:/run/netns/cni-5e42f9e8-8a36-c4ac-5d20-e1cce9d5f3fd Networks:[{Name:podman_default Ifname:eth0}] RuntimeConfig:map[podman_default:{IP: MAC: PortMappings:[{HostPort:25565 ContainerPort:25565 Protocol:tcp HostIP:}] Bandwidth:<nil> IpRanges:[]}] Aliases:map[podman_default:[3d7d3511df63 <container_name>]]}
INFO[0000] About to add CNI network podman_default (type=bridge)
DEBU[0001] [0] CNI result: &{0.4.0 [{Name:cni-podman1 Mac:26:3d:6e:a2:0e:ce Sandbox:} {Name:vethb35434a3 Mac:0a:f8:b1:25:67:df Sandbox:} {Name:eth0 Mac:46:d9:4d:d9:1b:a9 Sandbox:/run/netns/cni-5e42f9e8-8a36-c4ac-5d20-e1cce9d5f3fd}] [{Version:4 Interface:0xc0006b5748 Address:{IP:10.89.0.86 Mask:ffffff00} Gateway:10.89.0.1}] [{Dst:{IP:0.0.0.0 Mask:00000000} GW:<nil>}] {[10.89.0.1]  [] []}}
DEBU[0001] Workdir "/data" resolved to a volume or mount
DEBU[0001] Recreating container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 in OCI runtime
DEBU[0001] Successfully cleaned up container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
DEBU[0001] adding nameserver(s) from cni response of '["10.89.0.1"]'
DEBU[0001] /etc/system-fips does not exist on host, not mounting FIPS mode subscription
DEBU[0001] Setting CGroups for container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 to machine.slice:libpod:3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
DEBU[0001] reading hooks from /usr/share/containers/oci/hooks.d
DEBU[0001] Created OCI spec for container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 at /var/lib/containers/storage/overlay-containers/3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7/userdata/config.json
DEBU[0001] /usr/bin/conmon messages will be logged to syslog
DEBU[0001] running conmon: /usr/bin/conmon               args="[--api-version 1 -c 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 -u 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 -r /usr/bin/crun -b /var/lib/containers/storage/overlay-containers/3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7/userdata -p /run/containers/storage/overlay-containers/3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7/userdata/pidfile -n <container_name> --exit-dir /run/libpod/exits --socket-dir-path /run/libpod/socket -s -l k8s-file:/var/lib/containers/storage/overlay-containers/3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7/userdata/ctr.log --log-level debug --syslog --conmon-pidfile /run/containers/storage/overlay-containers/3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /run/containers/storage --exit-command-arg --log-level --exit-command-arg info --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/libpod --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mountopt=nodev,metacopy=on --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7]"
DEBU[0001] Cleaning up container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
DEBU[0001] Tearing down network namespace at /run/netns/cni-5e42f9e8-8a36-c4ac-5d20-e1cce9d5f3fd for container 3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7
INFO[0001] Got pod network &{Name:<container_name> Namespace:<container_name> ID:3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7 NetNS:/run/netns/cni-5e42f9e8-8a36-c4ac-5d20-e1cce9d5f3fd Networks:[{Name:podman_default Ifname:}] RuntimeConfig:map[podman_default:{IP: MAC: PortMappings:[{HostPort:25565 ContainerPort:25565 Protocol:tcp HostIP:}] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}
INFO[0001] About to del CNI network podman_default (type=bridge)
DEBU[0001] unmounted container "3d7d3511df63951ca773e18b251d3d49c3185f88d0618e4726a7ad64b180a5e7"
Error: cannot listen on the TCP port: listen tcp4 :25565: bind: address already in use
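
A quick way to see what is still holding the port (a diagnostic sketch, not output from the run above):

# Show the process currently listening on TCP 25565:
ss -tlnp | grep 25565
# Alternatively:
lsof -iTCP:25565 -sTCP:LISTEN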

mheon (Member) commented Feb 22, 2021

Alright, that's very strange. podman restart should already handle that case (manually killing Conmon if required).

baude (Member) commented Feb 23, 2021

This is a secondary issue and should be split off.

baude (Member) commented Feb 23, 2021

@mheon does cleanup remove the files in /var/lib/cni/networks/podman/?

baude (Member) commented Feb 23, 2021

Or how about when we detect that a restart has occurred?

mheon (Member) commented Feb 23, 2021

As of Podman 3.0, we ship a systemd-tmpfiles file instructing systemd to wipe the CNI reservations directory on reboot.
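
For reference, that is the D! rule in the tmpfiles.d snippet quoted earlier in this issue; roughly, in systemd-tmpfiles syntax, D marks a directory whose contents are removed, and the ! suffix restricts the action to boot time:

# From /usr/lib/tmpfiles.d/podman.conf (quoted above):
D! /var/lib/cni/networks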

mheon (Member) commented Feb 23, 2021

I doubt this is actually a CNI reservation issue given the way this is being described, though. If podman restart shows it as well, something rather more serious is wrong.

@skitoxe Can you check if there are any orphan conmon processes hanging around? Stopping all containers should remove them all, so any hanging around after that would be interesting to know about.
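
Something like this should be enough to check (a sketch):

podman stop --all    # stop all containers; no conmon should survive this
pgrep -fa conmon     # any output here is an orphaned conmon process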

baude (Member) commented Feb 23, 2021

@mheon could SELinux be interrupting things?

mheon (Member) commented Feb 23, 2021

Possibly. Checking the audit log could be helpful.

I'm still not certain what's actually going on here, though. We're getting port-in-use errors from Conmon, which usually indicate that Conmon is hanging around for a while and blocking the ports - but we fixed podman restart to kill -9 the old Conmon process, so that shouldn't be happening. With the manual delay added, we instead see CNI complain about an already-allocated IP - but the container is not using a static IP, and CNI is smart enough not to reuse already-allocated IPs, so either something is telling it to use an in-use IP or CNI is getting exceptionally confused. Something tells me these two are related, but I can't put my finger on how.

Luap99 (Member) commented Feb 23, 2021

Can you please paste your CNI config file for the podman_default network?

skitoxe (Author) commented Feb 23, 2021

Can you please paste your CNI config file for the podman_default network?

You mean this file, right? It's in /etc/cni/net.d:

cat podman_default.conflist
{
   "args": {
      "podman_labels": {
         "com.docker.compose.network": "default",
         "com.docker.compose.project": "podman",
         "com.docker.compose.version": "1.27.4"
      }
   },
   "cniVersion": "0.4.0",
   "name": "podman_default",
   "plugins": [
      {
         "type": "bridge",
         "bridge": "cni-podman1",
         "isGateway": true,
         "ipMasq": true,
         "hairpinMode": true,
         "ipam": {
            "type": "host-local",
            "routes": [
               {
                  "dst": "0.0.0.0/0"
               }
            ],
            "ranges": [
               [
                  {
                     "subnet": "10.89.0.0/24",
                     "gateway": "10.89.0.1"
                  }
               ]
            ]
         }
      },
      {
         "type": "portmap",
         "capabilities": {
            "portMappings": true
         }
      },
      {
         "type": "firewall",
         "backend": ""
      },
      {
         "type": "tuning"
      },
      {
         "type": "dnsname",
         "domainName": "dns.podman",
         "capabilities": {
            "aliases": true
         }
      }
   ]
}

skitoxe (Author) commented Feb 23, 2021

I doubt this is actually a CNI reservation issue given the way this is being described, though. If podman restart shows it as well, something rather more serious is wrong.

@skitoxe Can you check if there are any orphan conmon processes hanging around? Stopping all containers should remove them all, so any hanging around after that would be interesting to know about.

I didn't appear to have any, though admittedly I could have been too slow to catch them if they only stick around briefly; it depends on how quickly the restart reallocates the port compared with how fast I can grep the process list.

I did, however, get this error while stopping this time:

Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
ERRO[0000] unable to cleanup network for container cd67e977bf281eeab3b192e18aceb5d614335962d0770f0f50360ea8cdb2472f: "error tearing down CNI namespace configuration for container cd67e977bf281eeab3b192e18aceb5d614335962d0770f0f50360ea8cdb2472f: Error while removing pod from CNI network \"podman_default\": running [/usr/sbin/iptables -t nat -D POSTROUTING -s 10.89.0.80 -j CNI-dd5462ad46da9543b99f4cff -m comment --comment name: \"podman_default\" id: \"cd67e977bf281eeab3b192e18aceb5d614335962d0770f0f50360ea8cdb2472f\" --wait]: exit status 2: iptables v1.8.5 (legacy): Couldn't load target `CNI-dd5462ad46da9543b99f4cff':No such file or directory\n\nTry `iptables -h' or 'iptables --help' for more information.\n"
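
One way to check whether that CNI NAT chain is really gone (a diagnostic sketch; the chain name is taken from the error above):

# Look for the per-container CNI chains in the NAT table:
iptables -t nat -S | grep CNI-
# Or query the specific chain named in the error:
iptables -t nat -L CNI-dd5462ad46da9543b99f4cff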

mheon (Member) commented Feb 23, 2021

Hm. Any chance the firewall was reloaded since the container was created? That could be related, though I'm not 100% sure how yet.

Luap99 (Member) commented Feb 23, 2021

Yes, that's the correct file, and it also looks correct to me.

cpus: 40

Do you really have 40 cores on your system? This could be a race condition, which is more likely to happen with so many CPU cores.

baude (Member) commented Feb 23, 2021

A couple of observations here. The network config provided above appears to have been created by docker-compose (based on the labels). So my first question is whether that is true. And then I wonder whether you should instead be using a network that you create with podman network create, as sketched below.
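
For example, something along these lines (a sketch; the network name is hypothetical, and the subnet/gateway are taken from the config above):

podman network create --subnet 10.89.0.0/24 --gateway 10.89.0.1 mynetwork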

skitoxe (Author) commented Feb 23, 2021

Hm. Any chance the firewall was reloaded since the container was created? That could be related, though I'm not 100% sure how yet.

Yes, it has been reloaded.

Do you really have 40 cores on your system? This could be a race condition, which is more likely to happen with so many CPU cores.

10 cores and 20 threads per CPU; 2x Xeon E5-2640 v4.

A couple of observations here. The network config provided above appears to have been created by docker-compose (based on the labels). So my first question is whether that is true. And then I wonder whether you should instead be using a network that you create with podman network create.

The containers were created by docker-compose, but with no additional network configuration in the .yml file apart from DNS configuration passed to the container.

github-actions bot commented

A friendly reminder that this issue had no activity for 30 days.

rhatdan (Member) commented Mar 26, 2021

@skitoxe @mheon @baude is this still an issue?

github-actions bot commented

A friendly reminder that this issue had no activity for 30 days.

rhatdan (Member) commented Apr 27, 2021

Since I was never answered, I am going to assume no and close. Reopen if I am mistaken.

rhatdan closed this as completed Apr 27, 2021
github-actions bot added the locked - please file new issue/PR label Sep 22, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023