kind load docker-image fails, but nodes believe image exists #2402
So basically what kind does is:

- For each image: check whether an image with that ID is already present on the node.
- For the images not present: `docker save` them and import the archive into containerd on the node (the `docker exec ... ctr images import` call visible in the logs in this thread).

All of this depends on the fact that docker images are identifiable by the hash/digest of the JSON config that contains the entrypoint, image names, and the hashes of the layers in the image: https://github.com/moby/moby/blob/master/image/spec/v1.2.md#image-json-description

If docker saves an image without the layer present, or containerd fails to import the layer but still creates the image entry, I don't know if kind can do much about this. This seems likely to be a bug in either docker or containerd.
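As a rough illustration of that identification scheme (a sketch, not part of the original comment; it assumes the `fastapi-test:latest` image discussed later in this thread and the classic docker image store — with the newer containerd-backed store, `docker save` writes an OCI layout instead):

```
# The image ID docker reports is the sha256 digest of the image's JSON config:
docker images --no-trunc --quiet fastapi-test:latest
# -> sha256:6ecf0858e5af...

# docker save writes that config, plus the layers it references, into the tar:
docker save fastapi-test:latest -o fastapi-test.tar
tar -tf fastapi-test.tar | grep '\.json$'
# -> <image-id>.json and manifest.json
```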
I've not seen this issue so far -- since we don't have your I suspect we're going to have to track this to a bug in docker or containerd. The containerd version, at least, kind is responsible for, and we upgrade it regularly. We could start by mimicking the
Thanks for the reply. Did you try the
FWIW I don't get the issue with this example:
I've tried several different versions of the
Alright, I did the following:

```
$ docker save fastapi-test:latest > fastapi-test-latest.tar
$ kind load docker-image fastapi-test:latest -v 1000
ERROR: failed to load image: command "docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import -" failed with error: exit status 1
Command Output: unpacking docker.io/library/fastapi-test:latest (sha256:a6b359e4e43b14667f91079f2bbc102c9f228f376f37de627415de287b1890b5)...ctr: content digest sha256:87c8a1d8f54f3aa4e05569e8919397b65056aa71cdf48b7f061432c98475eee9: not found
$ tar -xvzf fastapi-test-latest.tar
$ ls | grep 87c8a1d8f54f3aa4e05569e8919397b65056aa71cdf48b7f061432c98475eee9
# returns nothing
```

So I guess the resulting image is missing some layer that kind expects? Full expanded tar if you're interested:
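One hedged way to dig further into the saved archive (editor's sketch): in the classic `docker save` layout, layer directories are not named after the content digests containerd reports, so a plain `ls | grep <digest>` can come up empty even when the layer data is present. Inspecting `manifest.json` shows what the archive actually references:

```
# List what the archive claims to contain (config file + layer tarballs):
tar -xOf fastapi-test-latest.tar manifest.json | python3 -m json.tool

# Full archive listing, to cross-check against the manifest entries:
tar -tf fastapi-test-latest.tar
```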
Sorry, I had not yet and missed that bit. I've tested it now and sadly, at least initially, it does not repro:

```
+ kind load docker-image fastapi-test:latest
Image: "fastapi-test:latest" with ID "sha256:6ecf0858e5af96b455496e303381cc2198dc58a636236a05a99953920294a3ec" not yet present on node "kind-control-plane", loading...
+ kind load docker-image fastapi-test:latest
Image: "fastapi-test:latest" with ID "sha256:6ecf0858e5af96b455496e303381cc2198dc58a636236a05a99953920294a3ec" found to be already present on all nodes.
+ rm -r app
+ rm Dockerfile
+ docker image rm fastapi-test
Untagged: fastapi-test:latest
Deleted: sha256:6ecf0858e5af96b455496e303381cc2198dc58a636236a05a99953920294a3ec
```
Well, not kind per se, but containerd. Tentatively this smells like a bug in docker. This may not repro for me due to the docker version or some other reason; currently I have
Interestingly I downgraded to 20.10.6 and still get the same error 🤯
Also, are there two issues here?
kind believes the image exists because containerd reports that it exists. I think that does seem like a containerd bug, but I don't think there's much we can reasonably do about it here; we'd need to fix that upstream. I'd like to repro either issue before filing bugs against those projects myself, but I haven't been able to yet, and I'll be on some other tasks for a bit (like the Kubernetes release today). If we can identify reproductions to share with their developers, we can then upgrade containerd in kind when it is patched, though we continually keep it up to date anyhow.
Had the same issue using an M1 Mac. I'm guessing that building from a base image that's not built for arm corrupts something. After I built my image from the locally built base image, the load went through fine.
We've seen this error loading images on M1 Macs as well. Our colleagues on amd64 hardware are not able to reproduce it using the same apps/configurations.
Do you maybe have an amd64 layer in the image? Containerd validates images more strictly than docker does, AIUI.
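A quick, hedged way to check what platform a local image was built for (the image name here is the one from earlier in this thread; substitute your own):

```
docker image inspect --format '{{.Os}}/{{.Architecture}}' fastapi-test:latest
# e.g. "linux/amd64" on an M1 Mac means the image contains foreign-architecture layers
```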
I have the same issue on my M1 Mac. I use Rosetta 2 to open my terminal. However, the kind node I create uses
kind could perhaps inspect images for the correct architecture itself and warn when there is a bad layer. We already have something related, to work around problems with older Kubernetes builds when creating the node images, but this is also something I'd hope would clear up with time, as the issue isn't limited to kind (images with this problem should also affect real containerd-on-arm64 clusters pulling them, e.g. in EKS on Graviton).
I'm running into the same problem. I'm building an image for amd64 on my M1 Mac and I'm unable to import the image. The same image runs fine in the kind cluster when it's pulled from a registry. For anyone looking for a quick workaround: follow the instructions here to run kind with a local docker registry and just tag and push to that while you're doing local dev (see the sketch below). Really curious what the underlying bug is here. I imagine it's in the
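A minimal sketch of that registry workaround, assuming the local registry from the kind docs is already running (the `localhost:5001` address and the image name are examples and may differ in your setup):

```
# Tag and push the locally built image to the local registry:
docker tag fastapi-test:latest localhost:5001/fastapi-test:latest
docker push localhost:5001/fastapi-test:latest
# Then reference localhost:5001/fastapi-test:latest in the pod spec instead of
# relying on `kind load docker-image`.
```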
I got the same error when using an M1 Mac; I solved the problem by building my own image for the arm64 architecture.
As an alternative, you could also import the image manually using:
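The exact command wasn't captured here; a plausible form (an assumption based on the `ctr` invocation kind itself runs, shown in the error output earlier in this thread — node and image names are examples) would be:

```
docker save fastapi-test:latest | \
  docker exec --privileged -i kind-control-plane \
    ctr --namespace=k8s.io images import --all-platforms -
```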
This worked fine on my M1 Mac.
My Mac mini M1 is having the same issue. I followed your tips and found out that the cluster built is indeed
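For anyone else wanting to verify, a couple of hedged ways to check which architecture the kind node is actually running (the node name is the default from `kind create cluster`):

```
docker exec kind-control-plane uname -m   # e.g. aarch64 on an M1 Mac
kubectl get nodes -L kubernetes.io/arch   # shows the architecture label per node
```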
I can consistently reproduce this locally on x86_64 with kind 0.17.0 (also on 0.14.0) and
-> could it be a race condition importing these images somewhere? I found two workarounds:
This is due to containerd garbage-collecting layers it doesn't think it needs, since the platform is non-native to the system. Either kind should just assume we want to keep all layers and handle that accordingly, or provide a
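One hedged way to check for that symptom from inside the node (editor's sketch; `ctr images check` reports whether each image's content blobs are all present in the content store):

```
docker exec kind-control-plane ctr --namespace=k8s.io images check | grep fastapi-test
# An "incomplete" status here would be consistent with layers having been
# garbage-collected after import.
```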
We use
I can take a look, but I just fired up 0.17 yesterday and ran into the issue. I knew immediately what the problem was, but not whether there was some way to deal with it in kind, and wound up here. As a workaround I am manually exec'ing into the node container and running ctr myself.
I can see what's tagged in v0.17 should be using
Oh, I am absolutely silly... I was fetching a local version of kind to make sure it was the version I want for my pipeline, but then was just using the

This should be totally fixed in v0.17.
Thanks for investigating this and confirming!
What happened:
I was following the quickstart and attempting to load an image I built locally into my Kind cluster. When I run `kind load docker-image my-image:tag` the command fails; however, subsequent runs of `kind load docker-image my-image:tag` report the image as being "present on all nodes". Attempting to apply a manifest that uses this image then results in `CreateContainerError`.
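A hedged way to check what actually landed on the node after the load (editor's sketch; the node name is the default control-plane node and `my-image` is the placeholder used above):

```
docker exec kind-control-plane crictl images | grep my-image
docker exec kind-control-plane ctr --namespace=k8s.io images ls | grep my-image
```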
What you expected to happen:
Kind should load my local image successfully.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
I was also trying to use `skaffold` with Kind and it was producing the same error. Running `skaffold dev` the first time would fail at the same step, and then the next run would succeed in getting past this step but would fail during container creation.
Environment:
- kind version (use `kind version`): kind v0.11.1 go1.16.4 darwin/arm64
- Kubernetes version (use `kubectl version`):
- Docker version (use `docker info`):
- OS (e.g. from `/etc/os-release`): macOS 11.4