Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Podman machine start fails with 'no such file or directory', reboot resolves issue on Mac M1 #16163

Closed
holly-cummins opened this issue Oct 13, 2022 · 6 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. remote Problem is in podman-remote

Comments

@holly-cummins
Copy link

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

We are seeing an issue on a self-hosted github runner, on Mac M1. If it's not already running, we start the podman machine at the start of the github job,

podman machine info | grep Running ||podman machine start

Usually this works, but at some point after a few days the machine gets into a state where it won't start. Rebooting fixes the issue, but it would be nice to identify a less drastic workaround.

Starting machine "podman-machine-default"
Error: dial unix /var/folders/mq/dx78dkw10dv2hb0fm0qvchqm0000gp/T/podman/qmp_podman-machine-default.sock: connect: no such file or directory
Error: Process completed with exit code 125.

Steps to reproduce the issue:

Sorry, this isn't super-reproducible since it only happens after a while. We see the issue when the machine has been running for a few days, but then stops (crashes?) and won't start correctly. I don't have logs for why it stops.

Output of podman version:

4.2.1

Output of podman info:

host:
  arch: arm64
  buildahVersion: 1.27.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - pids
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.4-2.fc36.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.4, commit: '
  cpuUtilization:
    idlePercent: 99.44
    systemPercent: 0.36
    userPercent: 0.2
  cpus: 1
  distribution:
    distribution: fedora
    variant: coreos
    version: "36"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.19.12-200.fc36.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 1051627520
  memTotal: 2051641344
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.6-2.fc36.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.6
      commit: 18cf2efbb8feb2b2f20e316520e0fd0b6c41ef4d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.aarch64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 57m 27.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 6
    paused: 0
    running: 0
    stopped: 6
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 106825756672
  graphRootUsed: 7693189120
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 5
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.2.1
  Built: 1662580765
  BuiltTime: Wed Sep  7 15:59:25 2022
  GitCommit: ""
  GoVersion: go1.18.5
  Os: linux
  OsArch: linux/arm64
  Version: 4.2.1

Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):

githubactions@administrators-Mac-mini ~ % brew info podman
==> podman: stable 4.2.1 (bottled), HEAD
Tool for managing OCI containers and pods
https://podman.io/
/opt/homebrew/Cellar/podman/4.2.1 (178 files, 48MB) *
  Poured from bottle on 2022-09-12 at 05:57:08
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/podman.rb
License: Apache-2.0
==> Dependencies
Build: go-md2man ✘, [email protected] ✘
Required: qemu ✔
==> Options
--HEAD
	Install HEAD version
==> Caveats
zsh completions have been installed to:
  /opt/homebrew/share/zsh/site-functions
==> Analytics
install: 20,082 (30 days), 64,183 (90 days), 212,333 (365 days)
install-on-request: 19,372 (30 days), 62,488 (90 days), 210,270 (365 days)
build-error: 0 (30 days)

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

This is a Mac M1.

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Oct 13, 2022
@github-actions github-actions bot added the remote Problem is in podman-remote label Oct 13, 2022
@github-actions
Copy link

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Copy link
Member

rhatdan commented Nov 14, 2022

@baude do you think this is a leak in gvisor or somewhere else?

@github-actions
Copy link

A friendly reminder that this issue had no activity for 30 days.

@github-actions
Copy link

A friendly reminder that this issue had no activity for 30 days.

@cameronmitchellapps
Copy link

+1 I have also been experiencing this issue. For me it seems to happen often after the laptop is restarted

@vrothberg
Copy link
Member

This issue should have been fixed by now. The start+stop sequences have been made much more robust in the meantime.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Jan 18, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 18, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. remote Problem is in podman-remote
Projects
None yet
Development

No branches or pull requests

5 participants