Error: qemu exited unexpectedly with exit code 1, stderr: qemu-system-x86_64: cannot create PID file: Cannot lock pid file: Resource temporarily unavailable #16054

Closed
arcusfelis opened this issue Oct 5, 2022 · 28 comments · Fixed by #19469
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. machine podman-desktop

Comments

@arcusfelis

podman fails to start

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Error: qemu exited unexpectedly with exit code 1, stderr: qemu-system-x86_64: cannot create PID file: Cannot lock pid file: Resource temporarily unavailable

Steps to reproduce the issue:

  1. Stop machine.

  2. Start machine.

  3. The machine starts or fails to start, unpredictably.

Describe the results you received:

Error: qemu exited unexpectedly with exit code 1, stderr: qemu-system-x86_64: cannot create PID file: Cannot lock pid file: Resource temporarily unavailable

Describe the results you expected:
The machine should start reliably, or at least fail with a more descriptive error message.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

podman version
Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman. failed to create sshClient: connection to bastion host (ssh://core@localhost:50422/run/user/502/podman/podman.sock) failed: dial tcp [::1]:50422: connect: connection refused

Output of podman info:

podman info
Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman. failed to create sshClient: connection to bastion host (ssh://core@localhost:50422/run/user/502/podman/podman.sock) failed: dial tcp [::1]:50422: connect: connection refused

Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):

==> podman: stable 4.2.1 (bottled), HEAD
Tool for managing OCI containers and pods
https://podman.io/
/usr/local/Cellar/podman/4.2.1 (178 files, 48.5MB) *
  Poured from bottle on 2022-09-15 at 22:39:56
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/podman.rb
License: Apache-2.0
==> Dependencies
Build: go-md2man ✘, [email protected] ✘
Required: qemu ✔
==> Options
--HEAD
        Install HEAD version
==> Caveats
zsh completions have been installed to:
  /usr/local/share/zsh/site-functions
==> Analytics
install: 24,813 (30 days), 64,356 (90 days), 211,186 (365 days)
install-on-request: 24,012 (30 days), 62,808 (90 days), 209,311 (365 days)
build-error: 1 (30 days)

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

No

Additional environment details (AWS, VirtualBox, physical, etc.): mac

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Oct 5, 2022
@arcusfelis
Author

I've removed the old machine and created a new one. For the new, working machine I have:

podman version
Client:       Podman Engine
Version:      4.2.1
API Version:  4.2.1
Go Version:   go1.18.6
Built:        Tue Sep  6 21:16:02 2022
OS/Arch:      darwin/amd64

Server:       Podman Engine
Version:      4.2.1
API Version:  4.2.1
Go Version:   go1.18.5
Built:        Wed Sep  7 21:58:19 2022
OS/Arch:      linux/amd64

info

host:
  arch: amd64
  buildahVersion: 1.27.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.4-2.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.4, commit: '
  cpuUtilization:
    idlePercent: 89.07
    systemPercent: 6.84
    userPercent: 4.08
  cpus: 1
  distribution:
    distribution: fedora
    variant: coreos
    version: "36"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 502
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 5.19.12-200.fc36.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 1683369984
  memTotal: 2066890752
  networkBackend: netavark
  
  ociRuntime:
    name: crun
    package: crun-1.6-2.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.6
      commit: 18cf2efbb8feb2b2f20e316520e0fd0b6c41ef4d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/502/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 1m 28.00s
  
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106825756672
  graphRootUsed: 2288799744
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/502/containers
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.2.1
  Built: 1662580699
  BuiltTime: Wed Sep  7 21:58:19 2022
  GitCommit: ""
  GoVersion: go1.18.5
  Os: linux
  OsArch: linux/amd64
  Version: 4.2.1

@Luap99 Luap99 added the machine label Oct 5, 2022
@Luap99
Member

Luap99 commented Oct 5, 2022

This error is coming from qemu, likely a bug in qemu and not podman.

@DarcySail

I have hit the same issue, and I wasn't using podman, only qemu.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@Thomasdezeeuw

I recently also hit this issue, so it's still relevant (not stale).

@rhatdan
Member

rhatdan commented Nov 28, 2022

Since this is a qemu bug there is little Podman can do.

@yusufcakmakk

Restarting the computer solved it for me, so you can try that. I have no idea what causes the problem.

@Thomasdezeeuw

> Restarting the computer solved it for me, so you can try that. I have no idea what causes the problem.

For me, waiting a little also solves it. It seems some process (podman or QEMU, I don't know which) takes longer to shut down than the Podman Desktop UI indicates.

@archiewx

archiewx commented Dec 6, 2022

This is very strange~

@JavierValtierra

I had this same issue today. What I did to "solve" it was to kill the qemu process with a kill -9 PID command. Then I ran the podman machine start command and the machine started successfully.

@barnysanchez

> I had this same issue today. What I did to "solve" it was to kill the qemu process with a kill -9 PID command. Then I ran the podman machine start command and the machine started successfully.

This saved my day. I have hit the problem multiple times and ended up having to reboot my Mac, but this did the trick.

@oherrala

oherrala commented Jan 17, 2023

Today I also ran into this problem. I tried to start the podman machine and it gave me exit status 255. The next try was also unsuccessful: the podman machine start process just hung, and I killed it with Ctrl-C. The third time I got the error mentioned in this issue (Error: qemu exited unexpectedly with exit code 1). After killing the /opt/homebrew/bin/qemu-system-aarch64 process on my machine, I could once again start the podman machine.

The session with my shell:

% podman machine start                                                                                                                                            
Starting machine "podman-machine-default"                                                                                                                                                     
Waiting for VM ...                                                                             
Mounting volume... /Users/oherrala:/Users/oherrala                                             
Error: exit status 255

% podman machine start                                             
Starting machine "podman-machine-default"                                                      
Waiting for VM ... 

<I kept waiting and waiting and waiting, then hit CTRL-C to end the waiting>
^C

% podman machine start
Starting machine "podman-machine-default"                                                      
Waiting for VM ...                                                                             
Error: qemu exited unexpectedly with exit code 1, stderr: qemu-system-aarch64: cannot create PID file: Cannot lock pid file: Resource temporarily unavailable

% ps auxwww|grep podman
% kill 63230

% podman machine start 
Starting machine "podman-machine-default"
Waiting for VM ...
Mounting volume... /Users/oherrala:/Users/oherrala

This machine is currently configured in rootless mode. If your containers
require root permissions (e.g. ports < 1024), or if you run into compatibility
issues with non-podman clients, you can switch using the following command: 

        podman machine set --rootful

API forwarding listening on: /var/run/docker.sock
Docker API clients default to this address. You do not need to set DOCKER_HOST.

Machine "podman-machine-default" started successfully

@sirmspencer

It was a failed shutdown for me. Thanks @oherrala! More simply:

% ps auxwww|grep podman
% kill 63230
% podman machine start

@thmang82

I also just ran into this issue on an M1 MacBook. It started with podman not being able to do anything; the virtual machine did not respond. Stopping the machine with "podman machine stop" hung forever. Afterwards I killed the qemu process and started the machine again. Not very user friendly.

Any chance podman can detect a non-responsive VM? This would help a lot. Unfortunately, it happens every x days after resuming from suspend.

@ctrought

> I also just ran into this issue on an M1 MacBook. It started with podman not being able to do anything; the virtual machine did not respond. Stopping the machine with "podman machine stop" hung forever. Afterwards I killed the qemu process and started the machine again. Not very user friendly.
>
> Any chance podman can detect a non-responsive VM? This would help a lot. Unfortunately, it happens every x days after resuming from suspend.

Same here, also on an M1 MacBook. I need to restart the podman machine several times a day because it hangs frequently.

@heyvito

heyvito commented Mar 14, 2023

I don't know how relevant this is (perhaps at the virtualisation level?), but I have noticed Docker for macOS behaving the same way after resuming from suspend on an M1, though not on Intel. Killing the process and starting the VM again (on both Docker and Podman) fixes the issue.

Edit: running macOS 13.2.1 (22D68)

@hundehausen

I have the same problem on the latest macOS.

@hawktang

Same problem on an M1.

@lihuanshuai

If your qemu was upgraded, try removing the podman machine and initializing it again:

podman machine rm
podman machine init

@tscuite

tscuite commented May 11, 2023

Occasionally this situation occurs and the process needs to be killed manually.

@flomar77

I get the same error on an Intel Mac.

@sneko

sneko commented Jun 28, 2023

Unless I'm wrong, your solution, @lihuanshuai (#16054 (comment)), removes all local images.

Any workaround to keep them?

(The initial issue is pretty annoying; it has happened 3 times in a few weeks. I wonder if it's caused by my macOS going inactive or to sleep.)

@vrothberg
Member

@deboer-tim, with #19210 I am unable to see any flake in stop+start. As reported in #18648, when doing the restart in the UI, are stop and start overlapping/running concurrently or do they run in sequence?

@deboer-tim

They are in sequence. Initially I thought it was when I stopped and quickly restarted (because that's what I was doing at the time, and I couldn't always reproduce), but it doesn't appear to be timing related.

When I opened #18648 I thought there were times when I couldn't restart in the UI but it worked from CLI, but my memory is a bit foggy. Today I have a machine that's failing in both cases:

% podman machine start podman-machine-demo
Starting machine "podman-machine-demo"
Waiting for VM ...
Error: qemu exited unexpectedly with exit code 1, stderr: qemu-system-aarch64: cannot create PID file: Cannot lock pid file: Resource temporarily unavailable

If there's anything you want me to check feel free to ping.

@vrothberg
Member

Thanks, @deboer-tim !

Can you try running #19210?

@deboer-tim

As per DMs, I get the same behaviour with one of my machines even with #19210. I did try to start it without it first, but it fails consistently now.

@vrothberg
Member

@deboer-tim, can you try again with the latest main branch?

vrothberg added a commit to vrothberg/libpod that referenced this issue Aug 2, 2023
After a failed start, we can run into (somehow inconsistent) states
where the machine won't start because a previous QEMU process is still
running and the PID file is being used.  Stop didn't resolve the issue
as this state wasn't detected.

Allow to recover from this state by a) detecting it during start and
error out with a more helpful message than the error QEMU would
otherwise spit out, and b) by enabling stop to kill the dangling QEMU
process - even after a failed stop.

With the changes, a recovery may look as follows:
```
_  podman git:(main) _ ./bin/darwin/podman machine start
Starting machine "podman-machine-default"
Error: cannot start VM "podman-machine-default": another instance of "/opt/homebrew/bin/qemu-system-aarch64" is already running with process ID 970: please stop and restart the VM
_  podman git:(main) _ ./bin/darwin/podman machine stop
Machine "podman-machine-default" stopped successfully
_  podman git:(main) _ ./bin/darwin/podman machine start
Starting machine "podman-machine-default"
Waiting for VM ...
```

Please note that this change does not prevent us from running into such
inconsistent states but only allows for recovering from them.

[NO NEW TESTS NEEDED] - there is no reliable reproducer.

Fixes: containers#16054
Signed-off-by: Valentin Rothberg <[email protected]>
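
Editor's sketch, not the code from the commit: the detection step described in the commit message could look roughly like the following, assuming the PID file holds a single numeric process ID on its first line. `pid_file_owner` is a hypothetical helper name, and the PID file path is an argument because this thread does not say where podman keeps it.

```shell
#!/bin/sh
# Sketch: report whether a PID file points at a live process.
# pid_file_owner is a hypothetical helper, not a podman or QEMU command.
# The PID file location is an assumption left to the caller.
pid_file_owner() {
    pid_file=$1

    # A missing or unreadable file means nothing is recorded.
    [ -r "$pid_file" ] || return 1

    # Take the first line and strip all whitespace.
    pid=$(head -n 1 "$pid_file" | tr -d '[:space:]')

    # Reject empty or non-numeric content.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # kill -0 sends no signal; it only tests that the process exists
    # and that the current user is allowed to signal it.
    if kill -0 "$pid" 2>/dev/null; then
        printf '%s\n' "$pid"
        return 0
    fi
    return 1
}
```

Calling `pid_file_owner /path/to/machine.pid` prints the recorded PID when that process is alive and returns non-zero otherwise. A liveness check is weaker than the file lock QEMU tries to take, so treat the result as a hint rather than proof.
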
@chevdor

chevdor commented Aug 9, 2023

I ran into this issue as well today with:

  • podman version 4.6.0
  • QEMU emulator version 8.0.3 (qemu-system-x86_64)

Here is a one-liner to solve the issue:

ps -edf | grep qemu-system | grep -v grep | awk '{print $2}' | xargs -I{} kill -9 {}; podman machine stop

You may then run podman machine start and keep rolling.
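
A gentler variant of the one-liner above, as a sketch only: it sends SIGTERM first and falls back to SIGKILL after a grace period, so QEMU gets a chance to exit cleanly. `terminate_pid` and the 10-second default are assumptions made for illustration, not podman or QEMU behaviour.

```shell
#!/bin/sh
# Sketch: stop a leftover process gently, escalating only if needed.
# terminate_pid is a hypothetical helper name.
terminate_pid() {
    pid=$1
    timeout=${2:-10}   # seconds to wait after SIGTERM (assumed default)

    # Nothing to do if the process is already gone.
    kill -0 "$pid" 2>/dev/null || return 0

    # Ask politely first; if the process vanished meanwhile, we are done.
    kill -TERM "$pid" 2>/dev/null || return 0

    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            # Still alive after the grace period: force it.
            kill -KILL "$pid" 2>/dev/null || true
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage (commented out so that sourcing this file kills nothing):
# for pid in $(pgrep -f qemu-system); do terminate_pid "$pid"; done
# podman machine stop
# podman machine start
```

The helper takes a PID rather than a name pattern so the caller decides which processes qualify; review the `pgrep` output before feeding it in if other QEMU guests run on the same host.
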
