podman cannot pull certain images from docker hub #23822

Closed
waffshappen opened this issue Aug 31, 2024 · 16 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@waffshappen

Issue Description

podman pull docker.io/ollama/ollama:rocm fails consistently with

Error: writing blob: storing blob to file "/var/tmp/container_images_storage791753323/1": pigz: skipping: <stdin>: corrupted -- crc32 mismatch: exit status 1

All :rocm tags of the image are affected, going back as far as I can tell.

The :latest images (smaller; they don't contain the 4 GB rocm binary layer) are not affected.

skopeo is affected with the same issue.

Workaround: pulling the image with docker pull, exporting it to a .tar, and importing it with podman image load works.
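
The workaround above can be sketched as a small shell helper. This is a minimal sketch, not a tested recipe: it assumes `docker save` is what "exporting it to a .tar" refers to, and the names `transfer_image` and `_run`, the `DRY_RUN` switch, and the default tarball name are all hypothetical.

```shell
# Hypothetical helper: pull with docker, save to a tarball, load into podman.
# Usage: transfer_image IMAGE [TARBALL]
# Set DRY_RUN=1 to print the commands instead of running them.
_run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

transfer_image() {
    image=$1
    tarball=${2:-image.tar}   # assumed default file name
    # Each step only runs if the previous one succeeded.
    _run docker pull "$image" &&
        _run docker save -o "$tarball" "$image" &&
        _run podman image load -i "$tarball"
}
```

For example, running `transfer_image docker.io/ollama/ollama:0.3.8-rocm ollama-rocm.tar` with `DRY_RUN=1` set prints the three commands without executing them.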

Steps to reproduce the issue

  1. podman pull docker.io/ollama/ollama:0.3.8-rocm

Describe the results you received

Error: writing blob: storing blob to file "/var/tmp/container_images_storage791753323/1": pigz: skipping: <stdin>: corrupted -- crc32 mismatch: exit status 1

Describe the results you expected

podman pull should successfully pull the image, like docker pull can

podman info output

host:
  arch: amd64
  buildahVersion: 1.37.2
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: '
  cpuUtilization:
    idlePercent: 87.8
    systemPercent: 2.67
    userPercent: 9.52
  cpus: 16
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: workstation
    version: "40"
  eventLogger: journald
  freeLocks: 2043
  hostname: framework
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.10.6-200.fc40.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 4563808256
  memTotal: 62984425472
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.1-1.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.1
    package: netavark-1.12.1-1.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.1
  ociRuntime:
    name: crun
    package: crun-1.15-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240821.g1d6142f-1.fc40.x86_64
    version: |
      pasta 0^20240821.g1d6142f-1.fc40.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 19h 30m 49.00s (Approximately 0.79 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
store:
  configFile: /home/$user/.config/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 3
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/$user/.local/share/containers/storage
  graphRootAllocated: 504641880064
  graphRootUsed: 355768459264
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 11
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/$user/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.2
  Built: 1724198400
  BuiltTime: Wed Aug 21 02:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.6
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.2

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

No response

Additional information

No response

@waffshappen waffshappen added the kind/bug Categorizes issue or PR as related to a bug. label Aug 31, 2024
@Luap99
Member

Luap99 commented Sep 2, 2024

@giuseppe PTAL There is also a fedora bz reporting the same thing with a different image, https://bugzilla.redhat.com/show_bug.cgi?id=2307237

@p5

p5 commented Sep 2, 2024

Facing the same with vllm/vllm-openai:latest, however it's occurring on both Docker and Podman.

Users on the Docker side have said it's something to do with containerd preferring unpigz for extraction when it's installed, and removing the pigz package resolves this.
Obviously things may be different on Podman, and removing the Pigz package isn't as easy to do on Atomic Desktop systems.

Edit:
I can see Podman also has the logic to prefer using Pigz for compressing/decompressing gzip images
https://github.com/containers/storage/blob/main/pkg/archive/archive.go#L200-L212
Although removing this logic should fix this, Pigz was chosen for a reason unknown to me, so figuring out why Pigz isn't working would be ideal.
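
The preference described above amounts to a lookup on PATH. The real logic is Go code in containers/storage and is not reproduced here. As a rough shell illustration only (the function name `pick_gunzip` is made up, and the `gzip -d` branch merely stands in for whatever non-pigz decoder is used instead):

```shell
# Illustrative only: "use pigz if it is on PATH, otherwise fall back".
# This is NOT the containers/storage code, just the same idea in shell.
pick_gunzip() {
    if command -v pigz >/dev/null 2>&1; then
        echo "pigz -d"
    else
        echo "gzip -d"   # stand-in for the fallback decoder
    fi
}
```

Under this model, which decompressor runs depends only on whether something named pigz can be found on PATH.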

@Luap99
Member

Luap99 commented Sep 2, 2024

It was added in containers/storage#1964 with podman 5.2 to make it faster.

However, I can pull the mentioned images fine, so it is not a consistent error either.

@p5

p5 commented Sep 2, 2024

Will try to gather some additional information.
Is there anything I can provide to help identify the issue?

Currently pulling an image to see if there are any system logs.
There are no system logs (journalctl) other than the error shared in this issue description.

$ rpm -qf /usr/bin/pigz
pigz-2.8-4.fc40.x86_64

@Luap99
Member

Luap99 commented Sep 2, 2024

What is your pull speed? It may be very well related to the buffer/read size.

@p5

p5 commented Sep 2, 2024

What is your pull speed? It may be very well related to the buffer/read size.

It fluctuates quite a lot, but looks to be about 40MiB/s total

After switching to Ethernet, I'm getting about 100Mib/s total and the error is still occurring.


@Luap99
Member

Luap99 commented Sep 2, 2024

Yeah, I think slower is better; I get about 7 MiB/s and it pulls fine.

@giuseppe
Member

giuseppe commented Sep 2, 2024

I am able to reproduce locally.

@Luap99 could you try against a local registry? I've used the following commands:

# podman run -d -p5000:5000 registry
# skopeo copy docker://docker.io/ollama/ollama:0.3.8-rocm docker://localhost:5000/ollama:0.3.8-rocm
# podman pull localhost:5000/ollama:0.3.8-rocm

@Luap99
Member

Luap99 commented Sep 2, 2024

@giuseppe Sure, a local registry seems to work fine for me as well. I used vllm/vllm-openai:latest for the test.

I am still on F39, in case it matters, and use podman from main right now. pigz is pigz-2.8-2.fc39.x86_64, and I can confirm podman uses it by checking the process list.

@giuseppe
Member

giuseppe commented Sep 2, 2024

It seems to be a pigz issue:

$ skopeo copy docker://docker.io/ollama/ollama:0.3.8-rocm oci:/var/tmp/ollama
$ pigz -d < /var/tmp/ollama/blobs/sha256/1db71b1d7c67607172266cd839e3e429bf523aab5b0df4761fc0d05ec55dc727 > /dev/null
pigz: skipping: <stdin>: corrupted -- crc32 mismatch

$ gzip --test /var/tmp/ollama/blobs/sha256/1db71b1d7c67607172266cd839e3e429bf523aab5b0df4761fc0d05ec55dc727 && echo ok
ok
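
The same comparison can be run over every layer of the copied image rather than one blob. The following is a sketch under assumptions: `check_blobs` is a hypothetical helper name, it expects the OCI layout directory produced by the `skopeo copy ... oci:` command above, and it treats any file starting with the gzip magic bytes as a layer.

```shell
# Hypothetical helper: compare gzip and pigz on every gzip blob of an OCI layout.
# Usage: check_blobs /var/tmp/ollama
check_blobs() {
    dir=$1
    for blob in "$dir"/blobs/sha256/*; do
        [ -f "$blob" ] || continue
        # Only look at gzip streams (magic bytes 1f 8b); skip JSON manifests and configs.
        magic=$(head -c 2 "$blob" | od -An -tx1 | tr -d ' \n')
        [ "$magic" = "1f8b" ] || continue

        # Integrity check with plain gzip.
        if gzip -t < "$blob" 2>/dev/null; then g=ok; else g=FAIL; fi

        # Same stream through pigz, if it is installed at all.
        if ! command -v pigz >/dev/null 2>&1; then
            p=n/a
        elif pigz -d < "$blob" >/dev/null 2>&1; then
            p=ok
        else
            p=FAIL
        fi

        echo "$(basename "$blob") gzip=$g pigz=$p"
    done
}
```

A line showing `gzip=ok pigz=FAIL` would correspond to the mismatch reported in this comment.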

@giuseppe
Member

giuseppe commented Sep 2, 2024

opened an issue for pigz: madler/pigz#123

@andersensam

Also seeing the same with vllm/vllm-openai:v0.5.5. I was able to pull the vllm/vllm-openai:latest image a few weeks back without any issues. Also on FC40.

$ rpm -qf /usr/bin/pigz
pigz-2.8-4.fc40.x86_64

Exact same error:

Error: writing blob: storing blob to file "/var/tmp/container_images_storage2414770981/2": pigz: skipping: <stdin>: corrupted -- crc32 mismatch: exit status 1

@waffshappen
Author

Given that this might be a bigger issue, you can work around it on Fedora with:

sudo ln -s /usr/bin/gzip /usr/local/bin/pigz

This makes pigz resolve to regular, non-parallel gzip, which works flawlessly.

Once a fix is out you can
sudo unlink /usr/local/bin/pigz
to revert this.
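
To check which pigz a PATH lookup currently finds, and whether it is the symlink from this workaround, something like the following may help. It is a sketch only: `pigz_status` is a hypothetical name, and it assumes `readlink` is available on the system.

```shell
# Hypothetical helper: report what "pigz" resolves to on the current PATH.
pigz_status() {
    p=$(command -v pigz) || { echo "pigz not found"; return 0; }
    if [ -L "$p" ]; then
        # A symlink, e.g. the /usr/local/bin/pigz -> /usr/bin/gzip workaround.
        echo "$p -> $(readlink "$p")"
    else
        echo "$p (not a symlink)"
    fi
}
```

With the workaround in place, and /usr/local/bin ahead of /usr/bin on PATH, this should report the symlink; after `sudo unlink /usr/local/bin/pigz` it should report the packaged binary again.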

@Colonial-Dev

sudo ln -s /usr/bin/gzip /usr/local/bin/pigz

For users on ostree systems like Silverblue, you can use sudo ostree admin unlock to temporarily make the root filesystem writable and apply this fix. (Add the --hotfix flag if you want the changes to persist across reboots.)

brianmcarey added a commit to brianmcarey/project-infra that referenced this issue Sep 11, 2024
There are a number of jobs that are failing to pull images due to
corruption and a crc32mismatch[1]

There is an open issue for this problem in the podman repo[2] that
suggests this change as a workaround.

This workaround should be removed when the related issues are resolved.

[1] https://prow.ci.kubevirt.io/view/gs/kubevirt-prow/pr-logs/pull/kubevirt_kubevirt/12677/pull-kubevirt-e2e-k8s-1.31-sig-compute/1833657514537783296#1:build-log.txt%3A262
[2] containers/podman#23822
[3] zlib-ng/zlib-ng#1772

Signed-off-by: Brian Carey <[email protected]>
kubevirt-bot pushed a commit to kubevirt/project-infra that referenced this issue Sep 11, 2024
@giuseppe
Member

The issue was fixed upstream in zlib-ng: zlib-ng/zlib-ng#1773

So closing, as there is nothing more to do on our side.

@p5

p5 commented Sep 16, 2024

FYI - today's rounds of updates in Silverblue 40 brought zlib-ng 2.1.7-1.fc40 -> 2.1.7-2.fc40. There's a changelog reference in Koji pointing to rhbz#2307237 being fixed, so I'm hopeful

@stale-locking-app stale-locking-app bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Dec 16, 2024
@stale-locking-app stale-locking-app bot locked as resolved and limited conversation to collaborators Dec 16, 2024

6 participants