"Save our build times" aka Add support for mounts during docker build #31499

Closed
wants to merge 2 commits into from


@rdsubhas rdsubhas commented Mar 2, 2017

What this does

  • Allows --mount to be specified during docker build
  • Example: docker build --mount type=bind,source=/hostpath,target=/containerpath ...
  • Various different use cases have been documented in build time only -v / --volume option #14080
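To make the flag format in the example above concrete, here is a minimal sketch of how a comma-separated `key=value` mount spec could be split into fields. `parse_mount_spec` is a hypothetical helper written for illustration; it is not the parser used by this PR or by Docker, and it assumes every field is a `key=value` pair with `type`, `source` and `target` all required.

```python
def parse_mount_spec(spec):
    """Parse 'type=bind,source=/a,target=/b' into a dict (hypothetical helper)."""
    fields = {}
    for part in spec.split(","):
        key, sep, value = part.partition("=")
        if not sep or not key.strip():
            raise ValueError("invalid mount field: %r" % part)
        fields[key.strip()] = value.strip()
    # Assumption: the three keys used in the PR example are all required.
    for required in ("type", "source", "target"):
        if required not in fields:
            raise ValueError("missing mount field: %s" % required)
    return fields


mount = parse_mount_spec("type=bind,source=/hostpath,target=/containerpath")
print(mount["source"], "->", mount["target"])
```

The sketch deliberately rejects anything that is not a `key=value` pair, so flag-only fields would need extra handling.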

Checklist

  • Fixes build time only -v / --volume option #14080 (eventually)
  • Initial barebones commit, adds and passes through mount flags in client & server
  • Dockerfile is able to ls the mount only while building; the mount does not exist in image metadata or when running
  • Use --mount instead of -v
  • Existing test suite passes, no breakage in any existing behavior
  • Design review and feedback, /build api, cli, etc
  • TBD: Add test cases
  • TBD: Add build cache invalidation (equivalent to --no-cache) if mount definitions are changed
  • TBD: Add docs, mount options are less documented in general, difference between docker run -v and docker build --mount, general warnings and disclaimers, how to invalidate cache, etc
  • TBD: Edge cases, file permissions when copying from volume, post-build leftovers (ideally none except an empty folder is present), running inside a container with docker.sock bind-mounted, volume drivers, mounts overlapping with ADD/COPY inside build context/workdir, etc
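The cache-invalidation item in the checklist is still marked TBD, so the following is only a sketch of one way it could work: fold the mount definitions into each step's cache key, so that changing a mount produces a different key and therefore a cache miss. `step_cache_key` and its inputs are hypothetical and do not reflect how Docker's builder actually computes cache keys.

```python
import hashlib
import json
from typing import Dict, List


def step_cache_key(parent_key: str, instruction: str,
                   mounts: List[Dict[str, str]]) -> str:
    """Derive a cache key for one build step (hypothetical scheme)."""
    # Serialize each mount with sorted keys, then sort the list, so the key
    # does not depend on the order in which the mounts were specified.
    canonical_mounts = sorted(json.dumps(m, sort_keys=True) for m in mounts)
    payload = json.dumps([parent_key, instruction, canonical_mounts])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


cache = {"type": "bind", "source": "/hostpath", "target": "/containerpath"}
with_mount = step_cache_key("parent", "RUN make", [cache])
without_mount = step_cache_key("parent", "RUN make", [])
print(with_mount != without_mount)
```

Note that this covers only changes to the mount definitions, as the checklist item describes. It says nothing about changes to the mounted contents, which is the harder problem.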

Signed-off-by: Subhas Dandapani [email protected]

Obligatory picture of how it feels in the beginning. Slowly warming up. Great developer docs 👍 It was really shocking how simple this was to get working though. All these years of orchestrating docker build as much as (or sometimes more than) docker run for the sake of performance and developer productivity feel like a lie. Hope we can move forward...

that works?

@rdsubhas rdsubhas changed the title "Save our networks" aka Add support for build-time volumes "Save our build times" aka Add support for build-time volumes Mar 2, 2017
@rdsubhas rdsubhas changed the title "Save our build times" aka Add support for build-time volumes "Save our build times" aka Add support for volumes during docker build Mar 2, 2017
@rdsubhas rdsubhas force-pushed the docker-build-volumes branch 2 times, most recently from c9fea15 to 66c8a12 Compare March 2, 2017 22:52
@thaJeztah
Member

ping @tonistiigi @cpuguy83 PTAL

@rdsubhas
Author

rdsubhas commented Mar 2, 2017

@thaJeztah this is not ready yet, I opened only for running the specs. Also I believe I misunderstood @cpuguy83's comments on #14080, will change from using volume to mounts. I'm on a different timezone so responses could come delayed.

Changed -v to --mount, ready for design review.

@AkihiroSuda
Member

also, consideration for build cache is needed

@GordonTheTurtle GordonTheTurtle added the dco/no Automatically set by a bot when one of the commits lacks proper signature label Mar 3, 2017
@rdsubhas rdsubhas force-pushed the docker-build-volumes branch from 780daa0 to 0df7664 Compare March 3, 2017 11:25
@GordonTheTurtle GordonTheTurtle removed the dco/no Automatically set by a bot when one of the commits lacks proper signature label Mar 3, 2017
@rdsubhas rdsubhas changed the title "Save our build times" aka Add support for volumes during docker build "Save our build times" aka Add support for mounts during docker build Mar 3, 2017
@rdsubhas rdsubhas force-pushed the docker-build-volumes branch 2 times, most recently from 446f717 to f872798 Compare March 3, 2017 11:55
@rdsubhas
Author

rdsubhas commented Mar 3, 2017

@thaJeztah @cpuguy83 @AkihiroSuda @tonistiigi this is now ready for design review, please. Before I go deep into covering test cases and code conventions, I'd like to make sure I'm on the right track 👍 This was the quickest path to opening up for feedback. Not sure that's how PRs to docker work; if it needs to be more feature complete before design review, please let me know, I can do that as well. (btw - complete noob to docker codebase here)

@thaJeztah
Member

Thanks! I added this for the upcoming maintainers meeting, but perhaps @tonistiigi has time to look before that. He's working on some other improvements to the builder, so we should verify this addition doesn't "conflict" with the other enhancements

@pyrossh

pyrossh commented Mar 7, 2017

Don't know if this is useful, but it seems these guys have already solved it: https://github.com/grammarly/rocker

@GordonTheTurtle GordonTheTurtle added the dco/no Automatically set by a bot when one of the commits lacks proper signature label Mar 7, 2017
@rdsubhas rdsubhas force-pushed the docker-build-volumes branch from 9de69a6 to f4b7de5 Compare March 7, 2017 13:42
@GordonTheTurtle GordonTheTurtle removed the dco/no Automatically set by a bot when one of the commits lacks proper signature label Mar 7, 2017
@rdsubhas rdsubhas force-pushed the docker-build-volumes branch from f4b7de5 to 62a3984 Compare March 7, 2017 13:44
rdsubhas added 2 commits March 8, 2017 01:14
Signed-off-by: Subhas Dandapani <[email protected]>
Signed-off-by: Subhas Dandapani <[email protected]>
@rdsubhas rdsubhas force-pushed the docker-build-volumes branch from 62a3984 to f29e2ef Compare March 8, 2017 00:14
@alexellis
Contributor

Rocker could be a good option for build-time mounts. Have you had a chance to check it out @rdsubhas ?

@tonistiigi
Member

Hi @rdsubhas,

As you have probably already read, this has been discussed many times previously. There are many reasons why I would feel hesitant about adding such a feature. It makes builds not self-contained, encourages creating broken Dockerfiles, means builds can't be traced to immutable sources, invalidates caching, breaks remote usage of Docker (as bind mounts are not part of the API), etc.

Let's look at the actual use cases behind this solution:

Performance, mounting is faster than copying context when there are lots of files

I'm working on a prototype of incremental context sends and can share more details soon. This means that repeated builds should not perform worse when the size of the context increases.
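The prototype itself is not described in this thread, so the following is only a sketch of the general idea behind an incremental send: hash the files in the context, compare against the previous snapshot, and transfer only what changed. `snapshot` and `diff_snapshots` are hypothetical names, and a real implementation would likely avoid reading every file in full on every build.

```python
import hashlib
import os


def snapshot(root):
    """Map each file path (relative to root) to a SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def diff_snapshots(previous, current):
    """Return (changed_or_added, removed) paths between two snapshots."""
    changed = sorted(p for p, d in current.items() if previous.get(p) != d)
    removed = sorted(p for p in previous if p not in current)
    return changed, removed
```

With this shape, only the paths in `changed` need to be sent again, and the paths in `removed` need only a deletion notice, so the cost of a repeated build tracks the size of the change rather than the size of the context.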

Once this is in place and we have an immutable reference to the input data in daemon, we could think about mounting it to commands read-only if that makes more sense than ADD/COPY in some cases. This wouldn't create any of the problems raised earlier.

There are many other ways build performance could be improved. If you are interested in helping out with some of them, we should discuss more.

In an earlier issue, there was also a comment about this being a workaround for smaller images. I've proposed #31257 for that.

Need to share secret data to a running process

There is an open PR #30637. The benefit of that PR over this one is that it clearly limits the use case to secret data, which makes it harder to have a completely broken Dockerfile or to misuse it in another way. Secrets are clearly defined, and we could combine them with a Dockerfile directive in the future so that required secrets could be declared in a similar way to how ARGs are defined atm. With this method, secrets are also immutable.

Another way to fix the secret problem would be to use an external service that provides trusted information (from swarm cluster, client, etc).

Need to include directory outside of build context

While this also encourages broken Dockerfiles, sometimes, especially while developing and overriding other projects, it seems helpful not to require everything to be in the same folder. A solution for that has been discussed in #30101 (comment). It is quite similar to this PR but "mounts" the data to the context instead of the container.

@tianon
Member

tianon commented Mar 9, 2017

The main use case for something like this that sticks out to me (and was mentioned in #14080 but not explicitly here) is the case of Gentoo-based images.

For Gentoo, the "packaging" files are stored in /usr/portage, take up a very large amount of space, change often, and mirrors are unfriendly to (and will sometimes actively ban) folks who update too often. Being able to mount that directory from the host or from another image would be really useful for making images based on Gentoo actually feasible (and it's prohibitively large, so putting it in the context somehow is going to be painful too -- on my current host, /usr/portage is ~21GB, but if I exclude distfiles it's ~735MB).

Relatedly, while building packages, the source of the packages being built gets downloaded to /usr/portage/distfiles by default.

So in summary, the main use cases I see for "outside content" (based mostly on @tonistiigi's list above) are:

  1. cache data (gem/npm/pip cache files, Gentoo's distfiles, etc)

  2. secrets (SSH keys for fetching privileged code, etc)

  3. extra directories outside the build context (which I'd put into the following two sub-use-cases, which is the main divergence from @tonistiigi's list)

    1. more files needed by the build, such as shared common files (could be transmitted via context somehow)

    2. things like Gentoo's /usr/portage, which shouldn't necessarily be included in the final image, but are needed while building (and can be prohibitively large)

@pyrossh

pyrossh commented Mar 10, 2017

@tonistiigi I agree that making docker build mutable and inconsistent is not right, but no one in their right mind would do such a build for a production image. In our projects we have a ton of node_modules, and each time someone on the team changes one of our private packages we need to do a clean npm install again in our development docker container. That takes almost 30 minutes, and we also get rate limited by npm many times for doing too many updates in a day. I tried all sorts of ways to cache the node_modules, and even tried yarn to save the packages offline so that they would be available in the build context, but to no avail. I even tried rocker, and that didn't work either.

The only other way I see of solving this is running the npm install command in our existing container to update the packages, then committing the container as the new latest image. That would solve our developer woes. So it's okay if it's not immutable, since it's development. But can we have a command that does this, or do I need to write a Python script to do this from now on?
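For reference, the run-then-commit workaround described above could be scripted roughly as follows. This is a sketch only: the image name `myteam/dev:latest` and the container name `deps-refresh` are placeholders, and the function just builds the `docker run`, `docker commit` and `docker rm` command lines without executing them.

```python
import shlex


def refresh_image_commands(image, container="deps-refresh",
                           install_cmd="npm install"):
    """Build the docker CLI calls for a run-then-commit refresh (not executed)."""
    return [
        # Run the install step in a named container created from the image.
        ["docker", "run", "--name", container, image, "sh", "-c", install_cmd],
        # Save the resulting filesystem back under the same image name.
        ["docker", "commit", container, image],
        # Remove the stopped helper container.
        ["docker", "rm", container],
    ]


for cmd in refresh_image_commands("myteam/dev:latest"):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to execute it. As the comment itself concedes, an image produced this way is no longer reproducible from its Dockerfile.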

Docker has eased our devops work but substantially increased the developers' work, and we get pounded by the developers all the time because of this.

@rdsubhas
Author

rdsubhas commented Mar 10, 2017

Sorry for responding late, different timezones.

@alexellis Yes indeed we have checked out rocker, but wherever possible we'd like to be compliant with normal upstream docker, especially since it's moved to shorter release cycles.

@tonistiigi I think @tianon's comment and others in #14080 have really collated a long list of issues. Some of us work in the corporate world where we don't really have the simplicity of doing short and sweet npm install/run or simple mvn package and java -jar. Some of us work with huge distributions. Some on low bandwidth internet connections, or low powered laptops and so on. There is a wealth of problems out here, and while we all appreciate idempotent builds, it doesn't seem to be a reality. Even normal docker builds are not really idempotent. Incremental build contexts are really nice, but again that's only one part.

Maybe I don't have the right words to summarize this, but for us, this problem has grown beyond categorizing and into the "free-form-tagging" zone (most in #14080)

What we see is simple: "a clean Dockerfile builds image from sources". i.e. Docker Hub compatible is our benchmark. All the hacks, like a build orchestrator to do multi-container docker builds, rocker, docker-compose orchestration, etc - have all already broken the Dockerfile - i.e. one just can't do docker build on them anymore.

At least docker with mounts gives us a more "predictable" stateless docker build process where we can improvise if a folder is empty (whether it's bind-mounted or not), and we can have one Dockerfile that works idempotently whether a folder was mounted or not (i.e. trust developers with the last mile idempotency).
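The "improvise if a folder is empty" idea can be sketched as a check that a build step runs before deciding where to fetch dependencies from. `choose_dependency_source` is a hypothetical helper and the two labels it returns are placeholders; the point is only that the same step behaves sensibly whether or not a cache directory was mounted.

```python
import os


def choose_dependency_source(cache_dir):
    """Return 'cache' if cache_dir exists and has entries, else 'network'."""
    # A missing directory and an empty one are treated the same way, so the
    # build works whether or not anything was mounted at cache_dir.
    if os.path.isdir(cache_dir) and os.listdir(cache_dir):
        return "cache"
    return "network"
```

A build script could call this once and then either install from the populated directory or fall back to a normal download.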

@thaJeztah
Member

There's a proposal for --mount and secrets in multi-stage builds. For those following this issue, please participate in the discussions there: #32507, #33343

@rdsubhas
Author

@thaJeztah thanks! will close this...

@rdsubhas rdsubhas closed this Jun 15, 2017
@thaJeztah
Member

@rdsubhas thanks! I didn't want to close because no decision was made yet, but would definitely appreciate your input on those proposals 👍

9 participants