
Dangling file in /var/lib/cni/networks/podman after hard restart #5433

Closed
Schrottfresse opened this issue Mar 9, 2020 · 6 comments · Fixed by #5434
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@Schrottfresse

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

My server occasionally restarts hard (I get a brownout when my fuse trips). When this happens I am left with the same dangling files reported here and here. Podman has no way to clean them up.

Steps to reproduce the issue:

  1. Hard restart server.

Describe the results you received:

Dangling files still in /var/lib/cni/networks/podman.

Describe the results you expected:

Cleaned up filesystem, no dangling files in /var/lib/cni/networks/podman.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

# podman version
Version:            1.6.4
RemoteAPI Version:  1
Go Version:         go1.12.12
OS/Arch:            linux/amd64

Output of podman info --debug:

# podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.12.12
  podman version: 1.6.4
host:
  BuildahVersion: 1.12.0-dev
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.6-1.module_el8.1.0+272+3e64ee36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.6, commit: 7a4f0dd7b20a3d4bf9ef3e5cbfac05606b08eac0'
  Distribution:
    distribution: '"centos"'
    version: "8"
  MemFree: 32140668928
  MemTotal: 33369112576
  OCIRuntime:
    name: runc
    package: runc-1.0.0-64.rc9.module_el8.1.0+272+3e64ee36.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 16835932160
  SwapTotal: 16835932160
  arch: amd64
  cpus: 4
  eventlogger: journald
  hostname: carter
  kernel: 4.18.0-147.5.1.el8_1.x86_64
  os: linux
  rootless: false
  uptime: 1h 35m 6.12s (Approximately 0.04 days)
registries:
  blocked: null
  insecure: null
  search:
  - registry.access.redhat.com
  - registry.fedoraproject.org
  - registry.centos.org
  - docker.io
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 2
  GraphDriverName: overlay
  GraphOptions: {}
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 3
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

# rpm -q podman
podman-1.6.4-2.module_el8.1.0+272+3e64ee36.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.):

physical server

@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Mar 9, 2020
@mheon
Member

mheon commented Mar 9, 2020

I'll take this one.

@mheon
Member

mheon commented Mar 9, 2020

@Schrottfresse Any chance you can test #5434 and see if this resolves it?

@Schrottfresse
Author

@mheon I will try to do it in the next few days.

mheon added a commit to mheon/libpod that referenced this issue Mar 19, 2020
We previously attempted to work within CNI to do this, without
success. So let's do it manually instead. We know where the
files should live, so we can remove them ourselves. This
solves issues around sudden reboots, where containers do not
have time to fully tear themselves down and leave behind IP
address allocations which, for various reasons, are not stored
on tmpfs and persist through reboot.

Fixes containers#5433

Signed-off-by: Matthew Heon <[email protected]>
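The cleanup the commit describes can be sketched in shell. This is an illustrative simulation in a temp directory, not the actual libpod code (which does this in Go during its boot-time refresh); the file names are assumptions based on how the CNI host-local IPAM plugin lays out `/var/lib/cni/networks/<network>`:

```shell
# Simulate the host-local IPAM state directory and the cleanup that
# podman's refresh performs after a reboot. The temp dir stands in for
# /var/lib/cni/networks/podman; its contents here are hypothetical.
set -eu
statedir="$(mktemp -d)/networks/podman"
mkdir -p "$statedir"

# host-local stores one file per allocated IP, plus bookkeeping files
touch "$statedir/10.88.0.2" "$statedir/10.88.0.3"
touch "$statedir/last_reserved_ip.0" "$statedir/lock"

# After a hard reboot no container owns these IPs anymore, so remove
# the per-IP allocation files while keeping the bookkeeping files.
for f in "$statedir"/*; do
    case "$(basename "$f")" in
        lock|last_reserved_ip*) ;;   # keep bookkeeping files
        *) rm -f "$f" ;;             # stale IP allocation
    esac
done

ls "$statedir"
```

The key point is that the cleanup only runs when no containers can be running (immediately after boot), so every per-IP file in the directory is known to be stale.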
@wingzero0

wingzero0 commented May 8, 2020

I have the same issue on CentOS 8 with podman version 1.9.1.

Do I need to apply any setting to activate the fix from #5434?

Thank you

Output of podman version :

# podman version
Version:            1.9.1
RemoteAPI Version:  1
Go Version:         go1.12.12
OS/Arch:            linux/amd64

Output of podman info --debug:

# podman info --debug
debug:
  compiler: gc
  gitCommit: ""
  goVersion: go1.12.12
  podmanVersion: 1.9.1
host:
  arch: amd64
  buildahVersion: 1.14.8
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.6-1.module_el8.1.0+298+41f9343a.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.6, commit: 2721f230f94894671f141762bd0d1af2fb263239'
  cpus: 8
  distribution:
    distribution: '"centos"'
    version: "8"
  eventLogger: file
  hostname: dmzAppServer
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 4.18.0-147.8.1.el8_1.x86_64
  memFree: 1215787008
  memTotal: 3963326464
  ociRuntime:
    name: runc
    package: runc-1.0.0-64.rc9.module_el8.1.0+298+41f9343a.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  os: linux
  rootless: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 4265603072
  swapTotal: 4265603072
  uptime: 39m 44.73s
registries:
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: localhost:5000
  search:
  - registry.access.redhat.com
  - registry.fedoraproject.org
  - registry.centos.org
  - docker.io
  - localhost:5000
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 5
    paused: 0
    running: 5
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 10
  runRoot: /var/run/containers/storage
  volumePath: /var/lib/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

# rpm -q podman
podman-1.9.1-1.1.el8.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.):
Vsphere VM

@wingzero0

For more detail: there are 4 running containers with dedicated IP addresses (10.88.0.2, 10.88.0.3, 10.88.0.4, 10.88.0.5) on my CentOS 8 machine.

After a reboot (via the reboot command), there are usually 1 or 2 leftover IP files in /var/lib/cni/networks/podman.
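To see which allocations are left over, one can list the per-IP files in the network's state directory: entries named like IPv4 addresses are allocations. This is a sketch against a temp-dir stand-in (on a real host the directory would be `/var/lib/cni/networks/podman`, and the addresses below are hypothetical):

```shell
# List leftover IP allocation files in a simulated network state dir;
# on a real system, substitute /var/lib/cni/networks/podman.
set -eu
netdir="$(mktemp -d)"
touch "$netdir/10.88.0.2" "$netdir/10.88.0.4" "$netdir/lock"

# Per-IP allocation files are named after the address itself, so filter
# directory entries down to names that look like IPv4 addresses.
leftovers="$(ls "$netdir" | grep -E '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$')"
echo "$leftovers"
```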


@mheon
Copy link
Member

mheon commented May 8, 2020

Can you provide the names of the files? Do they prevent you from starting the containers which use those IPs?

snj33v pushed a commit to snj33v/libpod that referenced this issue May 31, 2020
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 23, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023