mount=type=cache more in-depth explanation? #1673
Comments
And is it affected by the |
If I may hijack this, I'd like to ask if |
is confusing for me too.
|
I have quite a few cache examples here: https://github.com/FernandoMiguel/Buildkit. I suspect your issue is GC, whose default limit is very small. |
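If garbage collection is indeed what evicts the cache, one way to raise the GC budget is to pass a buildkitd config file to the builder. A rough sketch, assuming the docker-container driver; the exact units and key names for `gckeepstorage` vary between BuildKit versions, so check the docs for yours:

```bash
# buildkitd config with a larger cache budget (value shown as an assumption, in MB)
cat > /tmp/buildkitd.toml <<'EOF'
[worker.oci]
  gc = true
  gckeepstorage = 10000
EOF

# create a builder that uses this config
docker buildx create --use --name bigcache --config /tmp/buildkitd.toml
```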
@FernandoMiguel but on some CI systems like Travis or GitHub Workflows, the buildkit instance is always created fresh. |
Yes, for any ephemeral host the cache is lost. You can use cache-from to pull layers from an image repo in that case. |
Closed docker/buildx#399 and am discussing it here. Any suggestion in this case? buildx does not provide a way to mount a host path for |
No idea what your Dockerfile looks like or what CI you are using. |
Here is an example workflow.yml: https://github.com/querycap/istio/blob/master/.github/workflows/istio-pilot.yml
I have no idea how to cache files in the buildkit container. |
Type cache obviously won't work on GitHub Actions since every run is done on a new host. |
@FernandoMiguel |
What |
docker buildx create --use --name localbuild
docker buildx inspect localbuild --bootstrap
# recreate buildkit with host path
docker rm -f buildx_buildkit_localbuild0
docker run --privileged -d --name=buildx_buildkit_localbuild0 -v=/tmp/buildkit:/var/lib/buildkit moby/buildkit:buildx-stable-1
Now @FernandoMiguel, it doesn't work: there are lots of permission errors for the whole path. Hope to either add an option to or to replace the snapshot id of the dirname with the |
/tmp is a special folder and may have weird permissions |
GitHub Actions uses a non-root user, but and |
Since this thread has been hijacked for a different issue, @peci1's questions are still unanswered. What exactly does |
It's notable that the docs at https://docs.docker.com/engine/reference/commandline/buildx_build/ don't mention the cache either. |
I just spent some time testing cache sharing modes (private, locked, shared) in buildkitd/buildctl - I do assume the behavior is the same with buildx. Here is my understanding:
Note that the cache object is the same even if you change the mount path... only the ID + mode matters. So, with the same ID and mode, the cache object should not "disappear" on its own.
Furthermore, in the OP question, it looks to me like the sharing mode is shared (which is the default behavior). Of course, if you are running a lot (a few?) of concurrent builds on that buildkitd,
Whether or not locked (or private) is enough in the apt world... if you have multiple different apt versions in different images using the same mount (id + sharing mode) for your apt, you might be in for trouble.
So, given the OP used "shared":
Furthermore, the OP mounts
Finally, mounting and reusing
In a nutshell, if you want to use the cache properly for
Assuming you get the above right, the only remaining reasons for the cache to disappear are a., b., or c. from above (pretty much explicit prune, cache bust, or garbage collection). Hope that helps. |
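As an illustration of the "only ID + sharing mode matters" point described above, here is a minimal sketch (the id `demo` and the paths are arbitrary): the second stage mounts a different path but sees the same cache object.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine AS writer
# the cache object is keyed on id ("demo") + sharing mode, not on the mount path
RUN --mount=type=cache,id=demo,sharing=locked,target=/var/cache/demo \
    date > /var/cache/demo/stamp

FROM writer AS reader
# different mount path, same id + sharing mode => the same cache object is visible here
RUN --mount=type=cache,id=demo,sharing=locked,target=/mnt/demo \
    cat /mnt/demo/stamp
```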
Thought I would clarify what happens when you do NOT use an id: it defaults to the target (i.e. the mount path).
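A small sketch of that default, with pip used purely as an example: no id is given, so the cache key falls back to the target path, and two stages mounting the same path share one cache.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.11-slim AS a
# no id given: the cache id defaults to the target path, /root/.cache/pip
RUN --mount=type=cache,target=/root/.cache/pip pip install requests

FROM python:3.11-slim AS b
# same target path and still no id => this reuses the same cache object as stage "a"
RUN --mount=type=cache,target=/root/.cache/pip pip install flask

# exercise both stages, e.g.:
#   docker buildx build --target a .
#   docker buildx build --target b .
```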
|
Thanks for the analysis.
I have this at the beginning of the Dockerfile:
So this should not interfere with the cache. |
And I do not build concurrently, so private/shared is not a problem for me. |
Thanks @spacedub! That is really good to know that:
I think it would be interesting to experiment to see if sharing
Why would you not cache |
I believe this is also circumvented by the one liner that @peci1 shared |
UID/GID I have not tested, so I do not know (I will check later today).
About the "lists" part: if your purpose is to avoid having to rm, a tmpfs should give you the same benefits without the downsides of locking/waiting.
About the package cache: what happens if a different package with that same name has been downloaded in that location? Will apt verify the checksum of that (and then what happens next), or is apt verifying the checksum before, at download time? Will different Debian-based distributions (or different versions of the same distribution) use different package names and versions for the same things, or can they conflict? |
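For the apt case specifically, the pattern below is roughly what the Docker/BuildKit documentation suggests; the keep-cache config is needed because Debian/Ubuntu base images ship a docker-clean hook that would otherwise empty /var/cache/apt. Treat it as a sketch rather than a drop-in.

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu:22.04
# the base image deletes downloaded .debs by default; keep them so the cache mount is useful
RUN rm -f /etc/apt/apt.conf.d/docker-clean && \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
# locked sharing serializes concurrent builds that use the same cache, which apt needs
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends build-essential
```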
Thanks, but this doesn't help me. I do want to use the cache for better performance -- specifically for the local Maven repository. Here is what I am trying to do: use a multi-stage Dockerfile to first build my
Without any optimization, I would be rebuilding all of the images and downloading all of my Maven dependencies on every build.
The first optimization, which is not relevant to this discussion, is to publish the build image to a container registry and build using cache-from. This will cache the layers; however, if I change my pom.xml (Maven's dependency file, for those who don't know), the layer will be invalidated, it will have to be rebuilt, and all the dependencies will be redownloaded from Maven Central instead of just one. The solution for this is using --mount=type=cache,target=/root/.m2.
Unfortunately, since the host is ephemeral, the cache will be destroyed. I want to persist the cache between builds by saving it to S3, something that CodeBuild supports by asking only which directory on the host to store between builds. So which directory do I give it? |
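For the Maven setup described here, a minimal sketch of the multi-stage + cache-mount arrangement looks like the following (image tags and paths are illustrative, not prescribed by the thread); the question of where that cache lives on the host is what the next replies address.

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /src
COPY pom.xml .
COPY src ./src
# the local repository lives in the cache mount, so editing pom.xml only
# downloads the new/changed artifacts instead of everything
RUN --mount=type=cache,target=/root/.m2 mvn -B package -DskipTests

FROM eclipse-temurin:17-jre
COPY --from=build /src/target/*.jar /app/app.jar
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```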
Thanks for clarifying. Pretty much gives you what you are asking for, and then you can just save that specific folder any way you want. |
Because it's for reading from the host, not writing. Even if you make it read-write, the writes are discarded. |
Yep, you are right. Mixing things up with LLB... To your question: in my case here, using buildkit + the containerd worker without docker, the state is probably kept in
Location will clearly depend on how you are launching buildkitd, and obviously which worker you are using (e.g. you might want to look into the buildkitd root argument:
Hope that helps. |
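For reference, buildkitd keeps its state (including cache mount contents) under its root directory, which can be relocated. A sketch, assuming a standalone buildkitd rather than the buildx docker-container driver:

```bash
# default root is /var/lib/buildkit; point it somewhere persistent if needed
buildkitd --root /mnt/persistent/buildkit

# with the buildx docker-container driver, the same directory lives inside the
# buildx_buildkit_<name> container, which is why the earlier example bind-mounted
# /var/lib/buildkit from the host
```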
For which duration is a (cache) mount valid and available in a Dockerfile? Is it possible to have it mounted just for one RUN command, |
A cache mount is only mounted for the RUN instruction that declares it. |
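A tiny sketch of that behavior: the mount exists only while the declaring RUN executes, and is not present in later steps or in the final image.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine
# the cache is visible only inside this RUN
RUN --mount=type=cache,target=/cache sh -c 'date > /cache/stamp && ls /cache'
# here the mount is gone again; this prints nothing or fails, depending on whether
# the empty mountpoint directory was left behind in the layer
RUN ls /cache 2>/dev/null || echo "/cache is not mounted in this step"
```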
@Aposhian Do you happen to know to what extent the contents of the mount |
I don't think so. I don't think that checksums for the contents of cache mounts are computed. They are more like volume mounts for building, for caching common things like package/asset downloads or compiler caches, in my experience. |
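As an example of the compiler-caching use mentioned here, a sketch with ccache (it assumes the build context has a Makefile; the paths are illustrative). BuildKit itself does not checksum the cache contents; ccache alone decides what is a hit or a miss.

```dockerfile
# syntax=docker/dockerfile:1
FROM gcc:13
RUN apt-get update && apt-get install -y --no-install-recommends ccache
# Debian's ccache package provides compiler shims in /usr/lib/ccache
ENV CCACHE_DIR=/ccache PATH="/usr/lib/ccache:$PATH"
WORKDIR /src
COPY . .
# object files persist across builds in the cache mount; print stats afterwards
RUN --mount=type=cache,target=/ccache make && ccache -s
```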
Downside: this doesn't play well with BuildKit's cache-to/cache-from, affecting our ability to push the cache to GH. moby/buildkit#1673
Why isn't it possible to have a build bind mount that persists and to keep the |
+1. Layer caching does not work very well for optimizing small changes to a build process. For example, if one new package must be installed, layer caching is entirely invalidated; mount=type=cache allows persisting the package store so the incremental install is much cheaper. Similarly, with Next.js there is a build artifact ".next/cache" which is supposed to be persisted across future runs to reduce client bundle thrash and improve build speed. It sounds like most people expected --cache-from type=registry to allow loading these mounts from a cache image. I'm not sure why that wouldn't be supported. |
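For the Next.js case, a sketch of what persisting .next/cache via cache mounts might look like (the package manager, image tag, and paths are assumptions about the project); note that, as discussed above, these mounts are not exported by --cache-to/--cache-from, so they only help on a long-lived builder.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package.json package-lock.json ./
# npm's download cache
RUN --mount=type=cache,target=/root/.npm npm ci
COPY . .
# Next.js incremental build cache survives between builds on the same builder
RUN --mount=type=cache,target=/app/.next/cache npm run build
```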
Yes, but then you can't keep that cache between purges; it's maintained by Docker, and you can't back it up or move it... |
Switching from |
It looks like there's at least one more dimension included in the "true" cache ID: the
The value given appears to effectively be translated to the underlying stage/image's hash before being used in the ID tuple. This means that if you give different names pointing to the same stage/image, the cache will be shared.
I've attached my playground directory, which demonstrates the result. One could probably tweak this to dig around and reverse engineer more relevant dimensions.
It would be great if this was documented rather than requiring reverse engineering the behavior. It would feel less dodgy to depend on documented behavior rather than discovered behavior. @spacedub
Gripe: |
Mounting folders from the host into the builder (just like using
Being able to use a persistent cache that could be stored on EFS or similar would be so cool! I mean, most people build their production images on a CI service. Unfortunately the only slightly comparable option is to add a cache image for the build stages, which makes things overly complex and which also needs to be rebuilt from time to time 🤷♂️
Does anyone have experience using a dedicated builder container running on ECS or similar that could be used for caching the
Really looking for some build-file caching option in addition to the regular layer cache option! |
How does |
@FeryET yup, that's correct, the cache is not part of any layer in the final image - it's just a local helper to speed up builds. |
FYI about
And, to understand the above specific behaviors, I just tested three cache sharing modes (shared, locked, private).
I hope those are useful to someone. |
I think this short answer from SO explains it clearly: |
On Docker for Desktop on macOS, I found
|
I've just tested changing the |
Hi, I'm trying to use cache mounts to speed up things like apt install, pip install or ccache when rebuilding my containers. For each of these, I prefix each RUN command using apt/pip/ccache... with things like:

That seems to work sometimes, but I found out I can't understand the cache invalidation rules. The cache just disappears from time to time and I have to download all the cached packages again. I made an explicit effort to remove all apt clean commands from the Dockerfile, so the cache is not deleted programmatically. And it also happens with ccache, which is not auto-deleted.

Could somebody please explain how this works: when exactly can the cache be reused, when (if at all) is it discarded, and where is it saved? What is the exact effect of the id parameter?

I also found out that if I want to create a cache directory for a non-root user, I can create it with the cache mount option uid=1000. But if I use uid=1000,gid=1000, the folder gets ownership root:root, which is really weird. Could it be connected with me not using id and setting some caches with uid=0 and some with uid=1000? Does this create some kind of conflict?

And is there actually a good reason for setting the mode to anything other than 777? If it's only used during the build, I don't see why anybody would care about permissions...

Thank you for helping.
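The kind of RUN prefix being described is roughly the following sketch (not necessarily the OP's exact lines; the requirements.txt and Makefile are assumed to exist in the build context, and the base image and paths are illustrative):

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu:22.04
# apt package cache
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends build-essential python3-pip ccache
# pip download/wheel cache
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip pip3 install -r requirements.txt
# compiler cache; CCACHE_DIR is set explicitly so the mount path and ccache agree
COPY . /src
WORKDIR /src
RUN --mount=type=cache,target=/root/.ccache \
    CCACHE_DIR=/root/.ccache make -j"$(nproc)"
```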
Maybe some of the answers could be added to https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/experimental.md ?