kind load docker-image fails, but nodes believe image exists #2402

Closed
JLHasson opened this issue Aug 2, 2021 · 25 comments
Labels: kind/bug, triage/not-reproducible

Comments

@JLHasson commented Aug 2, 2021

What happened:
I was following the quickstart and attempting to load an image I built locally into my Kind cluster. When I run kind load docker-image my-image:tag the command fails with

kind load docker-image api:test-kind
Image: "api:test-kind" with ID "sha256:721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317" not yet present on node "kind-control-plane", loading...
ERROR: failed to load image: command "docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import -" failed with error: exit status 1
Command Output: unpacking docker.io/library/api:test-kind (sha256:dd7e3f38c29dacc07dff7d561e0f894ab8e7bbc3200b1f6c374ade47e19586b5)...ctr: content digest sha256:87c8a1d8f54f3aa4e05569e8919397b65056aa71cdf48b7f061432c98475eee9: not found

however, subsequent runs of kind load docker-image my-image:tag result in the image being "present on all nodes":

kind load docker-image api:test-kind
Image: "api:test-kind" with ID "sha256:721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317" found to be already present on all nodes.

Attempting to apply a manifest using this image results in CreateContainerError:

$ kubectl get pods
NAME                   READY   STATUS                 RESTARTS   AGE
api-56876d68b6-7mp6t   0/1     CreateContainerError   0          3s

$ kubectl describe pod api-56876d68b6-7mp6t
...
Warning  Failed     12s (x2 over 13s)  kubelet            Error: failed to create containerd container: error unpacking image: failed to resolve rootfs: content digest sha256:721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317: not found

What you expected to happen:
Kind should load my local image successfully.

How to reproduce it (as minimally and precisely as possible):

$ cat repro.sh
#!/bin/bash

kind delete cluster
kind create cluster
kubectl config get-contexts

cat > Dockerfile <<EOF
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.8-slim

COPY ./app /app
EOF

mkdir -p app
cat > app/main.py <<EOF
from fastapi import FastAPI

app = FastAPI()


@app.get("/")
def read_root():
    return {"Hello": "World!"}
EOF

docker build -t fastapi-test:latest .

kind load docker-image fastapi-test:latest  # fails
kind load docker-image fastapi-test:latest  # succeeds

# cleanup
rm -r app
rm Dockerfile
docker image rm fastapi-test

$ ./repro.sh
...
# first call
Image: "fastapi-test:latest" with ID "sha256:08454bdfe6f40f8c1cfd5e1234de319aa0e1b4e6b1b4ac13f183320fc27b7120" not yet present on node "kind-control-plane", loading...
ERROR: failed to load image: command "docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import -" failed with error: exit status 1
Command Output: unpacking docker.io/library/fastapi-test:latest (sha256:a6b359e4e43b14667f91079f2bbc102c9f228f376f37de627415de287b1890b5)...ctr: content digest sha256:87c8a1d8f54f3aa4e05569e8919397b65056aa71cdf48b7f061432c98475eee9: not found

# second call
Image: "fastapi-test:latest" with ID "sha256:08454bdfe6f40f8c1cfd5e1234de319aa0e1b4e6b1b4ac13f183320fc27b7120" found to be already present on all nodes.
...

Anything else we need to know?:
I was also trying to use skaffold with Kind and it was producing the same error. Running skaffold dev the first time would fail at

$ skaffold dev
...
Starting deploy...
Loading images into kind cluster nodes...
 - api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9 -> Failed
loading images into kind nodes: unable to load image "api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9" into cluster: running [kind load docker-image --name kind api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9]

and then the next run would succeed at getting past this step but would fail during container creation:

Loading images into kind cluster nodes...
 - api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9 -> Found
Images loaded in 48.927667ms
 - deployment.apps/api configured
 - service/api configured
Waiting for deployments to stabilize...
 - deployment/api: Failed: Error: failed to create containerd container: error unpacking image: failed to resolve rootfs: content digest sha256:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9: not found

Environment:

  • kind version: (use kind version): kind v0.11.1 go1.16.4 darwin/arm64
  • Kubernetes version: (use kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-21T23:06:30Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/arm64"}
  • Docker version: (use docker info):
$ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
  compose: Docker Compose (Docker Inc., v2.0.0-beta.6)
  scan: Docker Scan (Docker Inc., v0.8.0)

Server:
 Containers: 5
  Running: 2
  Paused: 0
  Stopped: 3
 Images: 22
 Server Version: 20.10.7
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc version: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.10.25-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 7.765GiB
 Name: docker-desktop
 ID: B3O3:X434:CIJJ:RGWM:KTQL:DI5A:LYAN:THWY:IHDJ:MFLM:BTTA:OQGD
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
  • OS (e.g. from /etc/os-release): macOS 11.4
@JLHasson added the kind/bug label on Aug 2, 2021
@BenTheElder (Member) commented

So basically what kind does is:

For each image:
1. Inspect the image with docker to see what the digest is.
2. Check if that digest is present in containerd/CRI on the node(s)

For the images not present:
3. docker save image(s) from the host
4. ctr images import the tarball into each node's runtime.

All of this depends on the fact that docker images are identifiable by the hash/digest of the json config that contains the entrypoint, image names, and the hashes of the layers in the image. https://github.com/moby/moby/blob/master/image/spec/v1.2.md#image-json-description

If docker saves an image without the layer present or containerd fails to import the layer but still creates the image entry, I don't know if kind can do much about this. This seems likely to be a bug in either docker or containerd.

loading images into kind nodes: unable to load image "api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9" into cluster: running [kind load docker-image --name kind api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9]
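
For illustration, the rough manual equivalent of those steps looks like this (a sketch only, using hypothetical names my-image:tag and kind-control-plane; this is not the exact code path kind uses):

# 1. The image ID is the digest of the image's config JSON, as reported by the host's docker:
docker image inspect --format '{{.Id}}' my-image:tag
# 2. Compare against what containerd on the node already has:
docker exec --privileged kind-control-plane ctr --namespace=k8s.io images ls
# 3+4. For an image that is missing, stream a docker save tarball into the node's runtime:
docker save my-image:tag | docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import -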

@BenTheElder (Member) commented

I've not seen this issue so far -- since we don't have your /app we can't fully reproduce this example. Do you see this with any other images, or with a more minimal reproducer?

I suspect we're going to have to track this down to a bug in docker or containerd. The containerd version, at least, is something kind is responsible for, and we upgrade it regularly.

We could start by mimicking the docker save of this same image and inspecting the result to see whether the layer reported missing by the image load is in fact present. You can just untar the saved image and look at the contents in your local file browser; there should be a 721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317/layer.tar IIRC; if not, then docker save is the culprit.

@JLHasson (Author) commented Aug 2, 2021

Thanks for the reply. Did you try the repro.sh script I put under the "How to reproduce it" section? It should create an app/ that reproduces (at least on my machine):

mkdir -p app
cat > app/main.py <<EOF
from fastapi import FastAPI

app = FastAPI()


@app.get("/")
def read_root():
    return {"Hello": "World!"}
EOF

@JLHasson (Author) commented Aug 2, 2021

FWIW I don't get the issue with this example:

$ cat Dockerfile
FROM debian:buster-slim

CMD ["sleep", "9999"]

$ docker build -t sleepy:latest .

$ kind load docker-image sleepy:latest
Image: "sleepy:latest" with ID "sha256:5cdc87a60cc668c0d9a48f70539ae21fc96259575a58ff76dff411e34931bdf8" not yet present on node "kind-control-plane", loading...

I've tried several different versions of the tiangolo/uvicorn-gunicorn-fastapi images and they all lead to the same error. Is it possible there's some issue with the base image's layers?

@JLHasson (Author) commented Aug 2, 2021

You can just untar the saved image and look at the contents in your local file browser; there should be a 721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317/layer.tar IIRC; if not, then docker save is the culprit.

Alright, I did the following:

$ docker save fastapi-test:latest > fastapi-test-latest.tar
$ kind load docker-image fastapi-test:latest -v 1000

ERROR: failed to load image: command "docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import -" failed with error: exit status 1
Command Output: unpacking docker.io/library/fastapi-test:latest (sha256:a6b359e4e43b14667f91079f2bbc102c9f228f376f37de627415de287b1890b5)...ctr: content digest sha256:87c8a1d8f54f3aa4e05569e8919397b65056aa71cdf48b7f061432c98475eee9: not found

$ tar -xvzf fastapi-test-latest.tar

$ ls | grep 87c8a1d8f54f3aa4e05569e8919397b65056aa71cdf48b7f061432c98475eee9
# returns nothing

So I guess the resulting image is missing some layer that Kind expects?

Full expanded tar if you're interested:
tar -xvzf fastapi-test-latest.tar
x 08454bdfe6f40f8c1cfd5e1234de319aa0e1b4e6b1b4ac13f183320fc27b7120.json
x 0b57f58e7ae17029a3c2311b3074c11b1ee404ac72dd05ce4d2666c98f9889b9/
x 0b57f58e7ae17029a3c2311b3074c11b1ee404ac72dd05ce4d2666c98f9889b9/VERSION
x 0b57f58e7ae17029a3c2311b3074c11b1ee404ac72dd05ce4d2666c98f9889b9/json
x 0b57f58e7ae17029a3c2311b3074c11b1ee404ac72dd05ce4d2666c98f9889b9/layer.tar
x 0dbc882ee757142340d098e2990632d4497e3b3bdab6a7b03960d236799d48c1/
x 0dbc882ee757142340d098e2990632d4497e3b3bdab6a7b03960d236799d48c1/VERSION
x 0dbc882ee757142340d098e2990632d4497e3b3bdab6a7b03960d236799d48c1/json
x 0dbc882ee757142340d098e2990632d4497e3b3bdab6a7b03960d236799d48c1/layer.tar
x 12f575862babad73b7f45128213869da793e8f343cd08c9b2fe702a5d3ec7468/
x 12f575862babad73b7f45128213869da793e8f343cd08c9b2fe702a5d3ec7468/VERSION
x 12f575862babad73b7f45128213869da793e8f343cd08c9b2fe702a5d3ec7468/json
x 12f575862babad73b7f45128213869da793e8f343cd08c9b2fe702a5d3ec7468/layer.tar
x 27f8b10532fb50306c5ef617e73d839339e1fdd0c5f78c67039e2e91cf8fcff7/
x 27f8b10532fb50306c5ef617e73d839339e1fdd0c5f78c67039e2e91cf8fcff7/VERSION
x 27f8b10532fb50306c5ef617e73d839339e1fdd0c5f78c67039e2e91cf8fcff7/json
x 27f8b10532fb50306c5ef617e73d839339e1fdd0c5f78c67039e2e91cf8fcff7/layer.tar
x 43a5c27a897878502ffd2530aae0e73b66a0eaaebbdf41a9575ffaf4bd968565/
x 43a5c27a897878502ffd2530aae0e73b66a0eaaebbdf41a9575ffaf4bd968565/VERSION
x 43a5c27a897878502ffd2530aae0e73b66a0eaaebbdf41a9575ffaf4bd968565/json
x 43a5c27a897878502ffd2530aae0e73b66a0eaaebbdf41a9575ffaf4bd968565/layer.tar
x 72bc90bf2e1e685c7bb4f816292a7d7f0ca4109354d8614bf53edcbf2a1a0b42/
x 72bc90bf2e1e685c7bb4f816292a7d7f0ca4109354d8614bf53edcbf2a1a0b42/VERSION
x 72bc90bf2e1e685c7bb4f816292a7d7f0ca4109354d8614bf53edcbf2a1a0b42/json
x 72bc90bf2e1e685c7bb4f816292a7d7f0ca4109354d8614bf53edcbf2a1a0b42/layer.tar
x 7611940348b8c4920e8c891eaff00680e1a3f9087f77b9cfeeceee33a922f685/
x 7611940348b8c4920e8c891eaff00680e1a3f9087f77b9cfeeceee33a922f685/VERSION
x 7611940348b8c4920e8c891eaff00680e1a3f9087f77b9cfeeceee33a922f685/json
x 7611940348b8c4920e8c891eaff00680e1a3f9087f77b9cfeeceee33a922f685/layer.tar
x 9c0807e94b4e84ce26d75036f9b416f4a5cb62a9c58e5399997988549115cf8a/
x 9c0807e94b4e84ce26d75036f9b416f4a5cb62a9c58e5399997988549115cf8a/VERSION
x 9c0807e94b4e84ce26d75036f9b416f4a5cb62a9c58e5399997988549115cf8a/json
x 9c0807e94b4e84ce26d75036f9b416f4a5cb62a9c58e5399997988549115cf8a/layer.tar
x a6c4b57a4f8a36ce2587a43a5b522291987c5e985a01316e238519fe9021ac54/
x a6c4b57a4f8a36ce2587a43a5b522291987c5e985a01316e238519fe9021ac54/VERSION
x a6c4b57a4f8a36ce2587a43a5b522291987c5e985a01316e238519fe9021ac54/json
x a6c4b57a4f8a36ce2587a43a5b522291987c5e985a01316e238519fe9021ac54/layer.tar
x b1ef77b50ae641c3d11080bdd86053f73b6397a31c0b47d4fc04bb94e7315824/
x b1ef77b50ae641c3d11080bdd86053f73b6397a31c0b47d4fc04bb94e7315824/VERSION
x b1ef77b50ae641c3d11080bdd86053f73b6397a31c0b47d4fc04bb94e7315824/json
x b1ef77b50ae641c3d11080bdd86053f73b6397a31c0b47d4fc04bb94e7315824/layer.tar
x b22fc732c022e173edc776531699169dff7a1108bb9a246e81bd16b9498aabe7/
x b22fc732c022e173edc776531699169dff7a1108bb9a246e81bd16b9498aabe7/VERSION
x b22fc732c022e173edc776531699169dff7a1108bb9a246e81bd16b9498aabe7/json
x b22fc732c022e173edc776531699169dff7a1108bb9a246e81bd16b9498aabe7/layer.tar
x c9fce6fa119065d0826e3a6cb315ada2ef80f1b630e47af2f2d3cd904d90159e/
x c9fce6fa119065d0826e3a6cb315ada2ef80f1b630e47af2f2d3cd904d90159e/VERSION
x c9fce6fa119065d0826e3a6cb315ada2ef80f1b630e47af2f2d3cd904d90159e/json
x c9fce6fa119065d0826e3a6cb315ada2ef80f1b630e47af2f2d3cd904d90159e/layer.tar
x d45aba65c8c47d3916160194832dc6528479d2f34619e12255d65b33fd4147d8/
x d45aba65c8c47d3916160194832dc6528479d2f34619e12255d65b33fd4147d8/VERSION
x d45aba65c8c47d3916160194832dc6528479d2f34619e12255d65b33fd4147d8/json
x d45aba65c8c47d3916160194832dc6528479d2f34619e12255d65b33fd4147d8/layer.tar
x d975bf1d9c6058403e6fb1e0a9c979c173b519d6b34605dd91b62cb6fde008bf/
x d975bf1d9c6058403e6fb1e0a9c979c173b519d6b34605dd91b62cb6fde008bf/VERSION
x d975bf1d9c6058403e6fb1e0a9c979c173b519d6b34605dd91b62cb6fde008bf/json
x d975bf1d9c6058403e6fb1e0a9c979c173b519d6b34605dd91b62cb6fde008bf/layer.tar
x ee515fead54710f8cab971606fbcc6ce90e538494b2076bb725f572f5f5b05f1/
x ee515fead54710f8cab971606fbcc6ce90e538494b2076bb725f572f5f5b05f1/VERSION
x ee515fead54710f8cab971606fbcc6ce90e538494b2076bb725f572f5f5b05f1/json
x ee515fead54710f8cab971606fbcc6ce90e538494b2076bb725f572f5f5b05f1/layer.tar
x manifest.json
x repositories

@BenTheElder (Member) commented

Thanks for the reply. Did you try the repro.sh script I put under the "How to reproduce it" section? It should create an app/ that reproduces (at least on my machine):

Sorry I had not yet and missed that bit. I've tested it now and sadly at least initially it does not repro:

+ kind load docker-image fastapi-test:latest
Image: "fastapi-test:latest" with ID "sha256:6ecf0858e5af96b455496e303381cc2198dc58a636236a05a99953920294a3ec" not yet present on node "kind-control-plane", loading...
+ kind load docker-image fastapi-test:latest
Image: "fastapi-test:latest" with ID "sha256:6ecf0858e5af96b455496e303381cc2198dc58a636236a05a99953920294a3ec" found to be already present on all nodes.
+ rm -r app
+ rm Dockerfile
+ docker image rm fastapi-test
Untagged: fastapi-test:latest
Deleted: sha256:6ecf0858e5af96b455496e303381cc2198dc58a636236a05a99953920294a3ec

So I guess the resulting image is missing some layer that Kind expects?

Well not kind per se, but containerd. Tentatively this smells like a bug in docker.

This may not repro for me due to the docker version or some other reason, currently I have 20.10.6.

@JLHasson (Author) commented Aug 4, 2021

This may not repro for me due to the docker version or some other reason, currently I have 20.10.6.

Interestingly I downgraded to 20.10.6 and still get the same error 🤯

$ ./repro.sh
...
ERROR: failed to load image: command "docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import -" failed with error: exit status 1
Command Output: unpacking docker.io/library/fastapi-test:0.1 (sha256:2eb2f799bb7dfdeedcddfe4399f7b12a2abda8725226446e4a8a00209bd38a8c)...ctr: content digest sha256:1b3ee35aacca9866b01dd96e870136266bde18006ac2f0d6eb706c798d1fa3c3: not found
Image: "fastapi-test:0.1" with ID "sha256:77a653dee0c46bb404f2796ded986b4956206b00efce2698f0b06e55fbb3b481" found to be already present on all nodes.

$ docker --version
Docker version 20.10.6, build 370c289

@JLHasson (Author) commented Aug 4, 2021

Also, are there two issues here?

  1. My Docker isn't saving/loading the image correctly
  2. After a failed load, kind believes that the image exists and will proceed as if things are OK?

@BenTheElder (Member) commented Aug 4, 2021

kind believes the image exists because containerd reports that it exists. I think that does seem like a containerd bug, but I don't think there's much we can reasonably do about it here; we'd need to fix that upstream.

I'd like to repro either issue before filing bugs against those projects myself, but I haven't been able to yet, and I'll be on some other tasks for a bit (like the Kubernetes release today).

If we can pin this down to something reproducible for those projects' developers, we can then upgrade containerd in kind when it is patched, though we continually keep up to date anyhow.

@BenTheElder added the triage/not-reproducible label on Sep 10, 2021
@LittleChimera commented Sep 16, 2021

Had the same issue using an M1 Mac. I'm guessing that building from a base image that's not built for arm corrupts something. After I built my image from a locally built base image, the load went through fine.

@AaronME commented Oct 6, 2021

We've seen this error loading images on M1 Mac as well. Our colleagues on amd64 hardware are not able to reproduce using the same apps/configurations.

@BenTheElder (Member) commented

Do you maybe have an amd64 layer in the image? Containerd validates images more strictly than docker has, AIUI.

@gastonqiu commented Oct 20, 2021

I have the same issue on my M1 Mac. I use Rosetta 2 to open my terminal; however, the kind node I create with kind create cluster is still arm64. You can check a node's architecture with docker exec -it ${kind-worker-node} uname -m. When I tried to load an amd64 image into the arm64 node, exactly this error happened. I solved the problem by adding the --platform flag to build an arm64 image (docker build -f ./Dockerfile --platform arm64), and then I could finally load images into my kind cluster successfully.
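
A condensed sketch of that check-and-rebuild flow (node and image names here are examples; linux/arm64 is the long form of the platform value):

# Check the node's architecture:
docker exec -it kind-control-plane uname -m          # prints aarch64 on an M1 host
# Check what the local image was built for:
docker image inspect --format '{{.Architecture}}' my-image:tag
# If they differ, rebuild for the node's platform and load again:
docker build -f ./Dockerfile --platform linux/arm64 -t my-image:tag .
kind load docker-image my-image:tag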

@BenTheElder (Member) commented Nov 10, 2021

kind could perhaps inspect images for the correct architecture itself to warn when there is a bad layer.

We already have something related to work around problems with older Kubernetes builds when creating the node images, but this is also something I'd hope will clear up with time, as the issue isn't limited to kind (images with this problem should also affect real containerd-on-arm64 clusters pulling them, e.g. EKS on Graviton).

@vanpelt commented Mar 2, 2022

I'm running into the same problem. I'm building an image for amd64 on my M1 Mac and I'm unable to import the image. This same image runs fine in the kind cluster when it's pulled from a registry. For anyone looking for a quick workaround: follow the instructions here to run kind with a local docker registry, and just tag and push to that as you do local dev.

Really curious what the underlying bug is here. I imagine it's in the ctr images import logic. I found this thread in the containerd repo; I imagine kind is running ctr > 1.4.12?
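
For reference, the registry workaround mentioned above boils down to something like the following (a sketch assuming the local registry from the kind docs is published on localhost:5001; image names are examples):

# Tag the locally built image against the local registry and push it:
docker tag my-image:tag localhost:5001/my-image:tag
docker push localhost:5001/my-image:tag
# Then reference localhost:5001/my-image:tag in the pod spec; the nodes pull it from that registry.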

@zaunist (Contributor) commented Mar 2, 2022

I have the same error when using an M1 Mac. I solved this problem by building my own image for the arm64 architecture.

@Haegi commented Mar 2, 2022

As an alternative, you could also import the image manually using:

docker save <image> | docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import --all-platforms -

This worked fine on my M1 Mac machine.

@chang48 commented Jun 30, 2022

I have the same issue on my M1 Mac. I use Rosetta 2 to open my terminal; however, the kind node I create with kind create cluster is still arm64. You can check a node's architecture with docker exec -it ${kind-worker-node} uname -m. When I tried to load an amd64 image into the arm64 node, exactly this error happened. I solved the problem by adding the --platform flag to build an arm64 image (docker build -f ./Dockerfile --platform arm64), and then I could finally load images into my kind cluster successfully.

My Mac mini M1 is having the same issue. I followed your tips and found that the cluster is indeed built as aarch64. I rebuilt my docker image for the same platform (aarch64) and, lo and behold, I was able to load the image successfully.
Thanks @gastonqiu!

(screenshot attached)

@NicoK commented Nov 15, 2022

I can consistently reproduce this locally on x86_64 with kind 0.17.0 (also on 0.14.0) and the kindest/node:v1.22.15 node image when passing multiple images to the command (they all share the same base image, but I'm not sure that's relevant/required to reproduce), like:

> kind load docker-image --name test imagea:latest imageb:latest imagec:latest imaged:latest
Image: "" with ID "sha256:a123..." not yet present on node "test-control-plane", loading...
Image: "" with ID "sha256:b123..." not yet present on node "test-control-plane", loading...
Image: "" with ID "sha256:c123..." not yet present on node "test-control-plane", loading...
Image: "" with ID "sha256:d123..." not yet present on node "test-control-plane", loading...
ERROR: failed to load image: command "docker exec --privileged -i test-control-plane ctr --namespace=k8s.io images import --all-platforms --digests --snapshotter=overlayfs -" failed with error: exit status 1
Command Output: ctr: image "imagea:latest": already exists

-> could it be a race condition importing these images somewhere?


I found two workarounds:

  • using the underlying commands as proposed above:
    docker save imagea:latest imageb:latest imagec:latest imaged:latest | docker exec --privileged -i test-control-plane ctr --namespace=k8s.io images import --all-platforms --digests --snapshotter=overlayfs -
    
  • importing images one after another (a simple loop is sketched below), which would (needlessly) copy the base image over and over again instead of bundling everything in a single tar (but I'm not sure whether kind even does that optimisation - is it?)
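
A minimal sketch of the one-at-a-time variant (image and cluster names are the examples from above):

for img in imagea:latest imageb:latest imagec:latest imaged:latest; do
  kind load docker-image --name test "$img"
done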

@cpuguy83 commented Jan 3, 2023

This is due to containerd garbage collecting layers it doesn't think it needs since the platform is non-native to the system.
By default ctr is only retaining layers (and config) for the native platform. ctr's --all-platforms on import retains all layers.

Either kind should just assume we want to keep all layers and handle that accordingly or provide a --platform flag for loading non-native platforms.

@BenTheElder (Member) commented

We use --all-platforms in kind load ... since v0.17.0

@cpuguy83 commented Jan 4, 2023

I can take a look, but I just fired up 0.17 yesterday and ran into the issue. I knew immediately what the problem was, but not whether there was some way to deal with it in kind, and wound up here.

As a workaround, I am manually exec'ing into the node container and running ctr myself.

@cpuguy83 commented Jan 4, 2023

I can see what's tagged in v0.17 should be using --all-platforms: 3f6f231

@cpuguy83 commented Jan 4, 2023

Oh, I am absolutely silly... I was fetching a local version of kind to make sure it was the version I wanted for my pipeline, but then was just using the kind from $PATH, which is older...

This should be totally fixed in v0.17.

@BenTheElder (Member) commented

Thanks for investigating this and confirming!
