error: failed to solve: failed to read dockerfile: failed to mount lhzivmrs3pheot21kx3b24aix: snapshot lhzivmrs3pheot21kx3b24aix does not exist: not found #2288

Open
Qoooooooooooo opened this issue Jul 30, 2021 · 6 comments

Comments

@Qoooooooooooo

Qoooooooooooo commented Jul 30, 2021

service start:

Jul 30 18:06:13 hbase systemd[1]: Starting buildkitd.service...
Jul 30 18:06:13 hbase buildkitd[11991]: time="2021-07-30T18:06:13+08:00" level=warning msg="using host network as the default"
Jul 30 18:06:13 hbase buildkitd[11991]: time="2021-07-30T18:06:13+08:00" level=info msg="found worker "2shqc0nug0fd56wyrj8ylc9z7", labels=map[foo:bar org.mobyproject.buildkit.worker.containerd.namespace:k8s.io org.mobyproject.bui...
Jul 30 18:06:13 hbase buildkitd[11991]: time="2021-07-30T18:06:13+08:00" level=warning msg="platform linux/arm64 cannot pass the validation, kernel support for miscellaneous binary may have not enabled."
Jul 30 18:06:13 hbase buildkitd[11991]: time="2021-07-30T18:06:13+08:00" level=info msg="found 1 workers, default="2shqc0nug0fd56wyrj8ylc9z7""
Jul 30 18:06:13 hbase buildkitd[11991]: time="2021-07-30T18:06:13+08:00" level=warning msg="currently, only the default worker can be used."
Jul 30 18:06:13 hbase buildkitd[11991]: time="2021-07-30T18:06:13+08:00" level=info msg="running server on /run/buildkit/buildkitd.sock"
Jul 30 18:06:13 hbase systemd[1]: Started buildkitd.service.
Hint: Some lines were ellipsized, use -l to show in full.

buildctl build \
--frontend=dockerfile.v0 \
--local context=. \
--local dockerfile=. \
--output type=image,name=docker.io/username/image:tag

[+] Building 0.0s (1/2)
ERROR [internal] load build definition from Dockerfile 0.0s

[internal] load build definition from Dockerfile:

error: failed to solve: failed to read dockerfile: failed to mount lhzivmrs3pheot21kx3b24aix: snapshot lhzivmrs3pheot21kx3b24aix does not exist: not found

buildkitd is started by systemd with:
ExecStart=/usr/local/bin/buildkitd --oci-worker=false --containerd-worker=true

@jamescook

Seeing the same or a similar error occasionally on BuildKit 0.9.0:

#1 [internal] load build definition from Dockerfile
#1 sha256:22f92f0378c90a3920b1d29bf5901009012d76f28a9ddc6cb7e61669c2d15904
#1 DONE 0.0s
 
#2 [internal] load .dockerignore
#2 sha256:4e6d76c4707252898ade1a212bddb23426d826774aa9fba7c2f18b878a6d2ace
#2 DONE 0.0s
 
#2 [internal] load .dockerignore
#2 sha256:4e6d76c4707252898ade1a212bddb23426d826774aa9fba7c2f18b878a6d2ace
#2 ...
 
#1 [internal] load build definition from Dockerfile
#1 sha256:22f92f0378c90a3920b1d29bf5901009012d76f28a9ddc6cb7e61669c2d15904
#1 transferring dockerfile: 2.73kB done
#1 DONE 0.7s
 
#2 [internal] load .dockerignore
#2 sha256:4e6d76c4707252898ade1a212bddb23426d826774aa9fba7c2f18b878a6d2ace
#2 transferring context: 2B done
#2 DONE 1.0s
 
#3 [internal] load metadata for docker.io/library/node:14
#3 sha256:14632244d97b4bf292e1cd4fe957f41b9fab16d8b3cda342bc4c906db56791f6
#3 DONE 0.3s
 
#5 [base 1/7] FROM docker.io/library/node:14@sha256:cd98882c1093f758d09cf6821dc8f96b241073b38e8ed294ca1f9e484743858f
#5 sha256:dbefa5453b815ffdd5af215f1b20105f038add85eac8265964301efc909575f8
#5 resolve docker.io/library/node:14@sha256:cd98882c1093f758d09cf6821dc8f96b241073b38e8ed294ca1f9e484743858f
#5 ...
 
#7 [internal] load build context
#7 sha256:682626fd47d224a5e24945dcf398b0f5d0be6dc5ec397846d9cb4bf78c436a9a
#7 DONE 0.0s
 
#4 importing cache manifest from zzzz/cache:production-zzzz-yyyy-xxxx-master
#4 sha256:d76773a140cae22f8d7e7de82cfa97cde906e7e56f3a126af7bccaeeeea7d50c
#4 DONE 0.2s
 
#5 [base 1/7] FROM docker.io/library/node:14@sha256:cd98882c1093f758d09cf6821dc8f96b241073b38e8ed294ca1f9e484743858f
#5 sha256:dbefa5453b815ffdd5af215f1b20105f038add85eac8265964301efc909575f8
#5 resolve docker.io/library/node:14@sha256:cd98882c1093f758d09cf6821dc8f96b241073b38e8ed294ca1f9e484743858f 1.6s done
#5 DONE 1.7s
 
#7 [internal] load build context
#7 sha256:682626fd47d224a5e24945dcf398b0f5d0be6dc5ec397846d9cb4bf78c436a9a
#7 transferring context: 23B
#7 transferring context: 9.93MB 3.6s done
#7 DONE 10.2s
error: failed to solve: rpc error: code = Unknown desc = failed to mount m6q7l7jeufrz1nokxmqi29148: snapshot m6q7l7jeufrz1nokxmqi29148 does not exist: not found

@tonistiigi
Member

@sipsma Any ideas?

@sipsma
Collaborator

sipsma commented Aug 11, 2021

@sundong1982 @jamescook can you share any code that reproduces the error? Or at least more details on when it happens?

The mount error is coming from here. The Dockerfile error is happening here.

I looked into whether the change in solver/llbsolver/bridge.go to no longer call Finalize could be related, since that code path is hit when the Dockerfile is being read. The call to Finalize was itself a no-op because of a bug in the previous code, so I don't believe it could be a direct cause of these new errors. I do wonder whether it had a side effect, though: it waited for the cache record mutex to be locked, which may have inadvertently prevented a different race condition that has now been revealed. This is just speculation; I can't say for sure until the error is reproducible.

@jamescook

@sipsma I haven't noticed this particular error in the past week, but we get either buildkit errors like this issue and #2088 or crashes (#2296 and #2303) almost daily. We're building many images in parallel as part of our CI pipeline. The pipeline is building 4 docker images per architecture (amd64 and arm64) via buildctl over two buildkitd workers (1 for each architecture). Problems occur on either architecture.

My general observation is that the errors typically happen when there are multiple branches being built at once in the CI pipeline. I'm not sure what code I can/should share - can you clarify what you're looking for? Thanks for reaching out.
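To approximate the "multiple branches built at once" condition outside CI, a minimal sketch along these lines may help. `run_parallel` is a hypothetical helper, not part of buildctl or BuildKit. It only starts several copies of one command at the same time, keeps each job's log in a temporary directory, and prints how many copies failed. Whether this actually triggers the snapshot error is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a BuildKit tool):
# start N copies of a command concurrently and print the number that failed.
# Usage: run_parallel N command [args...]
run_parallel() {
    count=$1
    shift
    log_dir=$(mktemp -d)   # one log file per job is written here
    pids=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # Run the command in the background and capture its output in a file.
        "$@" >"$log_dir/job-$i.log" 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    failures=0
    for pid in $pids; do
        # wait returns the exit status of that background job.
        wait "$pid" || failures=$((failures + 1))
    done
    echo "logs in $log_dir" >&2
    echo "$failures"
}
```

For example, `run_parallel 4 buildctl build --frontend=dockerfile.v0 --local context=. --local dockerfile=.` would launch four builds of the current directory at once. The logs can then be searched for "does not exist".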

@markmandel

Just here to say that I'm seeing a very similar issue. We are also building amd64, arm64 and Windows images concurrently, and we get the following error on Docker 20.x.x (it doesn't seem to be an issue on Docker 19.x.x):

DOCKER_CLI_EXPERIMENTAL=enabled docker buildx build --platform windows/amd64 -f /home/mark/workspace/agones/cmd/sdk-server/Dockerfile.windows --tag=us-docker.pkg.dev/agones-mark-dev/images/agones-sdk:1.22.0-3f00aa6-windows_amd64-ltsc2019 --build-arg WINDOWS_VERSION=ltsc2019  /home/mark/workspace/agones/cmd/sdk-server/
WARNING: No output specified for docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
[+] Building 10.2s (2/3)
 => [internal] booting buildkit                                                                                                                9.7s
 => => pulling image moby/buildkit:buildx-stable-1                                                                                             8.8s
 => => creating container buildx_buildkit_windows-builder0                                                                                     0.9s
 => ERROR [internal] load build definition from Dockerfile.windows                                                                             0.0s
------
 > [internal] load build definition from Dockerfile.windows:
------
error: failed to solve: failed to read dockerfile: snapshot  does not exist: not found
make: *** [Makefile:442: build-agones-sdk-image-windows-ltsc2019] Error 1

If you want to try to reproduce it, grab a copy of https://github.com/googleforgames/agones, go to the build folder, and run make -j 4 build-images. It should fail for you (though it passes once the images are cached).

Happy to share more info if useful.
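Since the failure is reported as intermittent and a later run can pass, one possible stopgap is to retry only when the output contains the snapshot error. The sketch below is a workaround under that assumption, not a fix for the underlying bug. `is_snapshot_error`, `retry_build` and the `RETRY_DELAY` variable are hypothetical names introduced here.

```shell
#!/bin/sh
# Hypothetical helpers (assumptions for illustration only).

# Succeeds if the given text contains the "snapshot ... does not exist" error.
is_snapshot_error() {
    printf '%s\n' "$1" | grep -Eq 'snapshot .* does not exist'
}

# Usage: retry_build MAX_ATTEMPTS command [args...]
# Re-runs the command only when it fails with the snapshot error.
retry_build() {
    max=$1
    shift
    attempt=1
    while :; do
        if output=$("$@" 2>&1); then
            status=0
        else
            status=$?
        fi
        printf '%s\n' "$output"
        if [ "$status" -eq 0 ]; then
            return 0
        fi
        # Give up when out of attempts or when the failure is a different error.
        if [ "$attempt" -ge "$max" ] || ! is_snapshot_error "$output"; then
            return "$status"
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-2}"
    done
}
```

A call such as `retry_build 3 make build-images` would try up to three times. Any other failure is returned immediately, so real build errors are not hidden by the retries.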

@rcmelendez

Sharing my solution in case someone else runs into the same error.

I noticed that my buildkit image was a bit outdated, so I just pulled the latest version (docker pull moby/buildkit:buildx-stable-1). Then I reran my build process and it worked!
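For builders that use the docker-container driver, the refresh could be sketched as below. This assumes the existing builder container has to be recreated before it picks up the newly pulled image, which the comment above does not confirm. `refresh_builder` and the `DOCKER` override are hypothetical names. The builder name is whatever `docker buildx ls` reports for your setup.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration only).
# Usage: refresh_builder BUILDER_NAME [IMAGE]
# Set DOCKER=echo to print the commands instead of running them.
refresh_builder() {
    name=$1
    image=${2:-moby/buildkit:buildx-stable-1}
    docker_cmd=${DOCKER:-docker}
    # Pull the current BuildKit image.
    "$docker_cmd" pull "$image" || return 1
    # Remove the old builder; ignore the error if it does not exist.
    "$docker_cmd" buildx rm "$name" 2>/dev/null || true
    # Recreate the builder on the pulled image and select it.
    "$docker_cmd" buildx create --name "$name" --driver docker-container \
        --driver-opt "image=$image" --use
}
```

Running `DOCKER=echo refresh_builder windows-builder` first shows the three commands without touching any containers.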
