mount=type=cache more in-depth explanation? #1673

Open
peci1 opened this issue Sep 5, 2020 · 47 comments

Comments

@peci1

peci1 commented Sep 5, 2020

Hi, I'm trying to use cache mounts to speed up things like apt install, pip install or ccache when rebuilding my containers.

For each of these, I prefix each RUN command using apt/pip/ccache... with things like:

RUN --mount=type=cache,target=/var/cache/apt,rw --mount=type=cache,target=/var/lib/apt,rw  \

That seems to work sometimes, but I found that I don't understand the cache invalidation rules. From time to time the cache just disappears and I have to download all the cached packages again. I made an explicit effort to remove all apt clean commands from the Dockerfile, so the cache is not deleted programmatically. It also happens with ccache, which is not auto-deleted.

Could somebody please explain how it works, when exactly the cache can be reused, when (if at all) it is discarded, and where it is saved? What is the exact effect of the id parameter?

I also found out that if I want to create a cache directory for a non-root user, I can create it with cache mount option uid=1000. But if I use uid=1000,gid=1000, the folder gets ownership root:root, which is really weird. Can it be connected with me not using id and setting some caches with uid=0 and some with uid=1000? Does this create some kind of conflict?

And is there actually a good reason for setting the mode to anything else than 777? If it's only used during build, I don't get why anybody would care about permissions...

Thank you for helping.

Maybe some of the answers could be added to https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/experimental.md ?

@peci1
Author

peci1 commented Sep 5, 2020

And is it affected by the --no-cache build option?

@YesYouKenSpace

If I may hijack this, I'd like to ask whether RUN --mount=type=cache can use the cache image given in --cache-from?

@morlay
Contributor

morlay commented Sep 24, 2020

This is confusing for me too.

--cache-from and --cache-to do nothing for --mount=type=cache (see docker/buildx#399).

With --mount=type=cache, the cached directory lives inside the BuildKit container under /var/lib/buildkit/ until GC cleans it up.

@FernandoMiguel
Collaborator

I have quite a few cache examples here https://github.com/FernandoMiguel/Buildkit

I suspect your issue is GC, whose default storage limit is very small.
Also, --no-cache will not use the cache for that run.

@morlay
Contributor

morlay commented Sep 24, 2020

@FernandoMiguel
Using a remote BuildKit instance and increasing its GC limits could resolve this. I set that up for my private projects.

But on some CI services, like Travis or GitHub workflows, the BuildKit container is always created from scratch, so the files cached by --mount=type=cache inside the container are lost.

@FernandoMiguel
Collaborator

Yes, for any ephemeral host, the cache is lost.
That is expected and the correct behaviour.

You can use cache-from to pull layers from an image repo in that case.

@morlay
Contributor

morlay commented Sep 24, 2020

Closing docker/buildx#399 and continuing the discussion here.

@FernandoMiguel

Any suggestions for this case?
I need to cache /go/pkg/mod across multiple workflows in the build stage.

But --cache-to can't expose the cached files (/var/lib/buildkit) to the host,
and BuildKit is recreated when each workflow starts, so all cached files in /var/lib/buildkit are lost.

buildx doesn't provide a way to mount a host path at /var/lib/buildkit when creating the BuildKit container.
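
For reference, the kind of Dockerfile this comment is about might look like the sketch below. It assumes a Go module with a main package at the repository root; the golang:1.22 tag, the ids gomod and gobuild, and the /out/app output path are illustrative choices, not taken from the thread. The mounts only speed up rebuilds while the same BuildKit instance (and its state under /var/lib/buildkit) survives, which is exactly what is missing on ephemeral CI runners.

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.22 AS build
WORKDIR /src

# Download modules first so this step is only re-run when go.mod/go.sum change.
COPY go.mod go.sum ./
RUN --mount=type=cache,id=gomod,target=/go/pkg/mod \
    go mod download

# Build with the module cache and the compiler cache mounted.
# Neither directory ends up in an image layer.
COPY . .
RUN --mount=type=cache,id=gomod,target=/go/pkg/mod \
    --mount=type=cache,id=gobuild,target=/root/.cache/go-build \
    go build -o /out/app .
```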

@FernandoMiguel
Collaborator

No idea what your Dockerfile looks like or what CI you are using.
That makes all the difference.
Some CI services will allow you to cache host data.

@morlay
Contributor

morlay commented Sep 24, 2020

@FernandoMiguel
Collaborator

Obviously, type=cache won't work on GitHub Actions, since every run is done on a new host.
You can run your own host and control that, or hack a way to store the cache (GitHub's limits are too low for any real use).

@morlay
Contributor

morlay commented Sep 24, 2020

@FernandoMiguel
What if I mount a host path like /tmp/buildkit into the BuildKit container at /var/lib/buildkit, and then add /tmp/buildkit to the Actions cache?
Could this be possible?

@FernandoMiguel
Collaborator

What

@morlay
Contributor

morlay commented Sep 24, 2020

docker buildx create --use --name localbuild
docker buildx inspect localbuild --bootstrap

# recreate buildkit with host path
docker rm -f buildx_buildkit_localbuild0
docker run --privileged -d --name=buildx_buildkit_localbuild0 -v=/tmp/buildkit:/var/lib/buildkit moby/buildkit:buildx-stable-1         

Now /tmp/buildkit on the host contains the BuildKit files from /var/lib/buildkit.
I think I could cache this folder, /tmp/buildkit.

@FernandoMiguel It doesn't work: there are lots of permission errors for the whole path.

I hope --cache-from and --cache-to could support --mount=type=cache.

Or add a path option to assign the snapshot's host path: --mount=type=cache,path=/tmp/gomod,target=/go/pkg/mod

Or replace the snapshot ID in the directory name with the id from --mount=type=cache,id=gomod,target=/go/pkg/mod, so that
/var/lib/buildkit/runc-overlayfs/snapshots/snapshots/2/go/pkg/mod
becomes
/var/lib/buildkit/runc-overlayfs/snapshots/snapshots/gomod/go/pkg/mod

@FernandoMiguel
Collaborator

/tmp is a special folder and may have weird permissions

@morlay
Contributor

morlay commented Sep 25, 2020

GitHub Actions uses a non-root user, but /var/lib/buildkit/runc-overlayfs is accessible only to root.

Also, /var/lib/buildkit/runc-overlayfs is always changing. I give up on this approach.
I need to find other hacks.

@Aposhian

Since this thread has been hijacked for a different issue, @peci1's questions are still unanswered. What exactly does id do? What are the cache invalidation rules?

@ringerc

ringerc commented May 25, 2022

It's notable that the docs in https://docs.docker.com/engine/reference/commandline/buildx_build/ don't mention the cache, too.

@spacedub

spacedub commented Oct 1, 2022

@Aposhian

I just spent some time testing cache sharing modes (private, locked, shared) in buildkitd/buildctl - I do assume the behavior is the same with buildx.

Here is my understanding:

  • different ID or different mode -> different cache object - if you change the id (or the mode), you get a new empty cache mount
  • same ID and same mode -> by default, same cache object (including across unrelated, different Dockerfiles), EXCEPT in the following circumstances:
    • you are building with --no-cache, in which case you get a new clean mount, that will then be used for that id moving forward
    • you are using mode private and another build is already running using that exact mount, in which case you also get a new one (that similarly will be used by subsequent builds using that mount ID) - that last part is especially confusing

Note that the cache object is the same even if you change the mount path... only the ID+mode matters...
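
A minimal sketch of that observation, assuming it holds on your BuildKit version: both steps use the hypothetical id demo-cache with the same sharing mode but different target paths, so per the description above the second step should see the file written by the first. Later comments in this thread report that other options (from, uid, gid, mode) also affect which cache object you get, so they are left at their defaults here.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine

# Write a marker file into the cache, mounted at /cache-a.
RUN --mount=type=cache,id=demo-cache,target=/cache-a,sharing=shared \
    echo "written via /cache-a" > /cache-a/marker

# Same id and sharing mode, different mount path: list what the cache contains.
RUN --mount=type=cache,id=demo-cache,target=/cache-b,sharing=shared \
    ls -la /cache-b
```

Building with --progress=plain makes the ls output of the second step visible.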

So, same ID and mode, the cache object should not "disappear" on its own...
... that is, unless:
a. you do a buildctl prune against your buildkitd, which destroys the cache objects
b. or you run your build with --no-cache, in which case all cache objects used by that build will be reset (including for other concurrent builds that are still in-flight)
c. or garbage collection decided to evict that cache entry
d. or you are using a "private" mount and starting your build while another build is already accessing the mount
e. OR... the apt-get configuration itself, inside your build is purposefully deleting the cache entries after a run

Furthermore, in the OP question, it looks to me like the sharing mode is shared (which is the default behavior).
For use with apt, this is probably wrong.
According to Docker documentation "apt needs exclusive access to its data" and their documentation suggests using locked instead https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/reference.md#run---mounttypecache

Of course, if you are running a lot of (a few?) concurrent builds on that buildkitd, locked will definitely slow them all down and make you lose some / most of the caching benefits in the first place...

Whether or not locked (or private) is enough in the apt world... If you have multiple different apt versions in different images using the same mount (id+sharingmode) for your apt, you might be in for trouble...

So, given OP used "shared":
f. maybe concurrent apt access to the cache mount makes apt drop all the data in some cases?

Furthermore, the OP mounts /var/cache/apt.
But if you look at cat /etc/apt/apt.conf.d/docker-clean inside the Debian official image, it is clear that this actually prevents any caching of the packages.
So, solely mounting into /var/cache/apt is useless.

Finally, mounting and reusing /var/lib/apt also seems useless to me.
The cache benefit is about 3 seconds on Debian (3.7s on empty cold start, vs. 0.7s once /var/lib/apt is populated), in the rare case where the build instruction itself is not cached.
And since you cannot trust the content of the cache folder... you cannot save yourself the apt-get update operation anyway in further instructions...

In a nutshell, if you want to use cache properly for apt:

  • use at least locked, possibly private but do NOT expect it to be actually private to this specific image you are building
  • use an ID that is truly unique to your Dockerfile or build process, unless you fully understand which other images you build with the same ID and are sure they will not mess it up for your usage (a different apt version, for example)
  • do not cache /var/lib/apt - the unlikely cache benefit does not make much / any sense - use a tmpfs instead, if the objective is to minimize the amount of stuff that gets baked into your final image
  • do cache /var/cache/apt if you want (you get the best bang for the buck here, by caching the actual package downloads), but be sure to ALSO configure apt to use it, as it is disabled in apt.conf.d in the official images (e.g. via option Dir::Cache)

Assuming you get the above right, the only remaining reasons for cache to disappear are a, b, or c from above (pretty much explicit prune or cache bust, or garbage collection).
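
Put together, the advice above could be sketched as follows. The id myproject-apt-cache and the installed package are placeholders, and the keep-cache line is the one-liner from the BuildKit documentation. Mounting the tmpfs on /var/lib/apt/lists rather than on all of /var/lib/apt is an assumption made for this sketch, so that apt's other state files still land in the image.

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm-slim

# The official images ship a docker-clean config that deletes downloaded packages;
# remove it and tell apt to keep them, otherwise the cache mount stays empty.
RUN rm -f /etc/apt/apt.conf.d/docker-clean \
 && echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache

# Package downloads go to a locked cache mount with a project-specific id;
# package lists go to a tmpfs so they are never baked into a layer.
RUN --mount=type=cache,id=myproject-apt-cache,target=/var/cache/apt,sharing=locked \
    --mount=type=tmpfs,target=/var/lib/apt/lists \
    apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates
```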

Hope that helps.

@spacedub

spacedub commented Oct 1, 2022

Thought I would clarify what happens when you do NOT use an id.

It will default to the target (eg: mount path).
So, in OP case, the cache for apt is shared, meaning:

  • it can be accessed concurrently by many different processes
  • any other mount binding into /var/cache/apt with the same sharing mode will reuse that SAME cache object
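
As an illustration of those defaults (a sketch, not verified against every BuildKit version): the two steps below are expected to address the same cache object, because the first one relies on id defaulting to the target and sharing defaulting to shared, while the second spells both out.

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm-slim

# No id and no sharing given: id falls back to the target path, sharing to "shared".
RUN --mount=type=cache,target=/var/cache/apt \
    ls -la /var/cache/apt

# The same options written explicitly.
RUN --mount=type=cache,id=/var/cache/apt,target=/var/cache/apt,sharing=shared \
    ls -la /var/cache/apt
```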

@peci1
Author

peci1 commented Oct 2, 2022

Thanks for the analysis.

> But if you look at cat /etc/apt/apt.conf.d/docker-clean inside the Debian official image, it is clear that this actually prevents any caching of the packages.

I have this at the beginning of the dockerfile:

RUN sudo rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' | sudo tee /etc/apt/apt.conf.d/keep-cache

So this should not interfere with the cache.

@peci1
Author

peci1 commented Oct 2, 2022

And I do not build concurrently, so private/shared is not a problem for me.

@Aposhian

Aposhian commented Oct 3, 2022

Thanks @spacedub!

That is really good to know that:

  • cache mounts are identified by id and mode only (not uid/gid?)
  • --no-cache creates a new cache mount that is used going forward (I had assumed it used no cache mount at all). That seems like something that would be good to document...

I think it would be interesting to experiment to see if sharing /var/cache/apt mounts between OS versions would cause problems. My first guess is no, since the package downloads are specified by version and codename, for example: docker-ce_5%3a20.10.18~3-0~ubuntu-focal_amd64.deb

Why would you not cache /var/lib/apt, even if the gain is smaller relative to /var/cache/apt? Specifically /var/lib/apt/lists? Isn't that mainly caching work from apt-get update, fetching package lists from repositories? I personally like just making it a cache mount, and then I can skip the step of doing something like rm -rf /var/lib/apt/lists in my Dockerfile.

> but be sure to ALSO configure apt to use it, as it is disabled in apt-conf.d in the official images (eg: that is option Dir::Cache)

I believe this is also circumvented by the one liner that @peci1 shared

@spacedub

spacedub commented Oct 3, 2022

UID/GID I have not tested, so I do not know (I will check later today).
That being said, I would be wary of using the same id and sharing mode with different uids in different places - seems quite error prone / confusing to me.

About the "lists" part, if your purpose is to avoid having to rm, a tmpfs should give you the same benefits without the downsides of locking / waiting.
But then, if you are mostly building one thing at a time and mostly the same thing, "shared" is probably acceptable.

About the package cache - what happens if a different package with that same name has been downloaded in that location? Will apt verify the checksum of that (and then what happens next) or is apt verifying the checksum before, at download time? Will different Debian based distributions (or different versions of the same distribution) use different package names and versions for the same things, or can they conflict?
Short of having clarity on these... I would be careful using the same share across unrelated Dockerfiles...

@fkhantsi

fkhantsi commented Jan 20, 2023

Thanks, but this doesn't help me. I do want to use cache for better performance -- specifically for local maven repository.

Here is what I am trying to do: use a multi stage Dockerfile to first build my .war in a maven container, and then deploy it to a tomcat container. The build process should be identical on my dev machine, and in my CodeBuild CI/CD environment.

Without any optimization, I would be rebuilding all of the images, and downloading all of my maven dependencies on every build.

First optimization, which is not relevant to this discussion, is to publish the build image to a container registry, and build using cache-from. This will cache the layers, however if I change my pom.xml (maven's dependency file for those who don't know), the layer will be invalidated, and it will have to rebuild it, and redownload all the dependencies from maven central, instead of just one.

The solution for this is using --mount=type=cache,target=/root/.m2. Unfortunately, since the host is ephemeral, the cache will be destroyed. I want to persist the cache between builds by saving it to S3, something that CodeBuild supports; it only asks which directory on the host to store between builds.

So which directory do I give it? /var/lib/buildkit? /var/lib/docker/buildkit/cache.db or something else?
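
The build described in this comment could be sketched roughly as below. The image tags, the /src working directory and the webapps destination are assumptions chosen for illustration; only the /root/.m2 cache mount comes from the comment itself.

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3-eclipse-temurin-17 AS build
WORKDIR /src
COPY pom.xml .
COPY src ./src

# The local Maven repository lives in a cache mount, so a pom.xml change
# re-runs this step but only downloads dependencies that are not cached yet.
RUN --mount=type=cache,target=/root/.m2 \
    mvn -B package

FROM tomcat:10
# Deploy the .war produced by the build stage.
COPY --from=build /src/target/*.war /usr/local/tomcat/webapps/
```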

@spacedub

Thanks for clarifying.
Why not use --mount=type=bind then (instead of --mount=type=cache)?

Pretty much gives you what you are asking for, and then you can just save that specific folder anyway you want.

@fkhantsi

fkhantsi commented Jan 20, 2023

Because it's for reading from the host, not writing. Even if you make it read-write, the writes are discarded.

@spacedub

spacedub commented Jan 20, 2023

Yep, you are right. Mixing things up with LLB...

To your question, in my case here using buildkit + the containerd worker without docker, the state is probably kept in /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs or more likely I would have to copy the entire /var/lib/containerd and /var/lib/buildkit folder...

Location will clearly depend on how you are launching buildkitd, and obviously which worker you are using (eg: you might want to look into buildkitd root argument: --root value path to state directory (default: "/var/lib/buildkit")).

Hope that helps.

@creative-resort

For which duration is a (cache) mount valid and available in a Dockerfile?

Is it possible to have it mounted just for one RUN command,
or to unmount it a couple of layers later?

@Aposhian

> For which duration is a (cache) mount valid and available in a Dockerfile?

A cache mount is only mounted for the RUN command that you specify it for. Subsequent RUN commands without the cache mount specified will just have an empty folder where it was mounted, if I remember correctly.
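
A small sketch to check this yourself (the /scratch path is arbitrary): the first step mounts a cache and writes into it, the second step has no mount and simply lists the same path. Comparing the two listings with --progress=plain shows what is left behind once the mount is gone.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine

# The cache is mounted only for the duration of this RUN instruction.
RUN --mount=type=cache,target=/scratch \
    echo hello > /scratch/file && ls -la /scratch

# No mount here: this lists whatever the image layer itself holds at /scratch.
RUN ls -la /scratch
```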

@codethief

codethief commented Apr 14, 2023

@Aposhian Do you happen to know to what extent the contents of the cache mount affect the caching of the RUN command's layer? Are they used as a cache "key", i.e. whenever the cache mount changes, the image layer cache will be busted and the RUN command will be executed anew?

@Aposhian

I don't think so. I don't think that checksums for the contents of cache mounts are computed. They are more just like volume mounts for building, for caching common things like package/asset downloading or compiler caching, in my experience.

miraclx added a commit to near/mpc that referenced this issue Apr 17, 2023
downside is this doesn't play well with

BuildKit's cache-to, cache-from. affecting our ability to push cache to GH

moby/buildkit#1673
miraclx added a commit to near/mpc that referenced this issue Apr 18, 2023
downside is this doesn't play well with

BuildKit's cache-to, cache-from. affecting our ability to push cache to GH

moby/buildkit#1673
@ye7iaserag

Why isn't it possible to have a build bind mount that persists, with persistence defaulting to false like the rw option?
This would be very helpful, especially for pip packages like torch that can reach a couple of GB with CUDA. It would also solve a long-standing issue for WSL2 users: the virtual disk can only grow, so at some point they are forced to purge and rebuild images, which then find no caches to use.

@alex-statsig

+1, Layer caching does not work very well for optimizing small changes to a build process. Ex. if one new package must be installed, layer caching is entirely invalidated; mount=type=cache allows persisting the package store so the incremental install is much cheaper. Or similarly, with nextjs there is a build artifact ".next/cache" which is supposed to be persisted in future runs to reduce client bundle thrash and improve build speed.

It sounds like most people expected --cache-from type=registry to allow loading these mounts from a cache image. I'm not sure why that wouldn't be supported.
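
The two cases mentioned at the start of this comment could be sketched like this for a Next.js project. The node:20 tag, the /app working directory and the use of npm (with its default cache under /root/.npm) are assumptions; other package managers keep their stores elsewhere.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20 AS build
WORKDIR /app

# Adding one dependency invalidates this layer, but the npm download cache
# survives in the mount, so only the new package has to be fetched.
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci

# Persist the Next.js build cache between builds of this step.
COPY . .
RUN --mount=type=cache,target=/app/.next/cache \
    npm run build
```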

@ye7iaserag

ye7iaserag commented Oct 28, 2023

Yes, but then you can't keep that cache between purges, it's maintained by docker, you can't back it up or move it...
I explained things more in here moby/moby#15771 (comment)

@homerjam

homerjam commented Nov 1, 2023

Switching from docker build to docker buildx build, it seems like caching is completely absent now 🤷 i.e. every time I build, it downloads the FROM image and executes every RUN. Previously a build would be cached and finish in seconds.

@brianbraunstein

It looks like there's at least 1 more dimension included in the "true" cache ID, the from field.

The value given appears to effectively be translated to the underlying stage/image's hash before being used in the ID tuple. This means if you give different names pointing to the same stage/image, the cache will be shared.

I've attached my playground directory which demonstrates the result. One could probably tweak this to dig around and reverse engineer more relevant dimensions.
true_cache_id.tar.gz

It would be great if this was documented rather than requiring reverse engineering the behavior. It would feel less dodgy depending on documented behavior rather than discovered behavior.
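
For readers unfamiliar with the from option discussed at the top of this comment, here is a minimal sketch of how it is used (this is not the attached playground; the stage name seed and the id demo-seeded are made up). The cache starts from the contents of a directory in another stage instead of an empty directory. Per the observation above, pointing from at a stage with different contents would be expected to select a different cache object even with the same id.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine AS seed
RUN mkdir /seed && echo "seeded" > /seed/marker

FROM alpine
# from= names the stage used as the base of the cache mount,
# source= picks the subdirectory of that stage to start from.
RUN --mount=type=cache,id=demo-seeded,from=seed,source=/seed,target=/cache \
    ls -la /cache
```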

@spacedub
Did you get around to checking uid and gid?

Gripe:
I find this behavior a little bit of a pity. If you want a different cache, then just provide a different ID, right? This behavior is more confusing, particularly when not documented. Plus I already have a use case where 2 separate images could properly seed the same cache. It's easy enough to work around though.

@mfittko

mfittko commented Nov 30, 2023

Mounting folders from the host to the builder (just like using --volume when running a container) would be the killer feature, no? Guess that's at least what some people here expected to find but were disappointed that RUN --mount doesn't exactly provide this.

Being able to use a persistent cache that could be stored on EFS or similar would be so cool! I mean most people build their production images on a CI service. Unfortunately the only slightly comparable option is to add a cache image for the build stages which makes things overly complex and which also needs to be re-built from time to time 🤷‍♂️

Does anyone have experience using a dedicated builder container running on ECS or similar, so that the RUN --mount caches are kept for a longer period of time and can be reused by subsequent builds in ephemeral CI environments?

Really looking for some build file caching option in addition to the regular layer cache option!

@FeryET

FeryET commented Dec 14, 2023

How does --mount=type=cache affect the final image size? Will the mounted cache target get removed from the created container once the image is built?

@jedevc
Member

jedevc commented Dec 19, 2023

> How does --mount=type=cache affect the final image size? Will the mounted cache target get removed from the created container once the image is built?

@FeryET yup, that's correct, the cache is not part of any layer in the final image - it's just a local helper to speed up builds.

@yukinakanaka

FYI about sharing

The buildkit document says

> One of shared, private, or locked. Defaults to shared. A shared cache mount can be used concurrently by multiple writers. private creates a new mount if there are multiple writers. locked pauses the second writer until the first one releases the mount.

And, to understand the above specific behaviors, I just tested three cache sharing modes (shared, locked, private).

I hope those are useful to someone.
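
For orientation, this is how the three modes quoted above are written in a Dockerfile. It is only a syntax sketch with made-up ids; the differences between the modes show up only when two builds hit the same cache at the same time.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine

# shared (default): concurrent builds may write to the same cache at once.
RUN --mount=type=cache,id=demo-shared,target=/cache,sharing=shared \
    date >> /cache/log

# locked: a second concurrent writer waits until the first releases the mount.
RUN --mount=type=cache,id=demo-locked,target=/cache,sharing=locked \
    date >> /cache/log

# private: a second concurrent writer gets a new mount instead of waiting.
RUN --mount=type=cache,id=demo-private,target=/cache,sharing=private \
    date >> /cache/log
```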

@kuchaguangjie

I think this short answer from SO explains it clearly:
https://stackoverflow.com/a/76351422/1568658

@ryanotella

ryanotella commented Aug 8, 2024

On Docker for Desktop on macOS, I found id and mode ineffective, but uid worked correctly.

# pip example
RUN useradd -u 1001 -m app -d /app
...
USER app
RUN --mount=type=cache,target=/app/.cache/pip,uid=1001 pip install -r requirements.txt

@rkarp

rkarp commented Aug 13, 2024

I've just tested changing the uid, gid and mode parameters to --mount=type=cache, and modifying any of them causes the cache to be empty, indicating that you get a new cache. So they do seem to be part of the effective ID.
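
A sketch that sets all three options explicitly, assuming a Debian-based Python image; the user name, the numeric ids and the id pip-app are illustrative. Given the finding above, changing uid, gid or mode later would be expected to start from an empty cache.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim

# Create the group and user with fixed numeric ids so they match the mount options below.
RUN groupadd -g 1001 app && useradd -u 1001 -g 1001 -m app
USER app

# Request a cache directory owned by 1001:1001 with mode 0755.
RUN --mount=type=cache,id=pip-app,target=/home/app/.cache/pip,uid=1001,gid=1001,mode=0755 \
    pip install --user requests
```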
