[Documentation]: Recommended way to handle yarn and docker #749

Closed
neophob opened this issue Oct 11, 2016 · 24 comments

Comments

@neophob

neophob commented Oct 11, 2016

The goal of this issue is to update the README to explain how to use Yarn with Docker.

I want a Docker image to build my Node projects, using Yarn to install the packages because installation is much faster (and more deterministic) than with npm.

One reason Yarn is fast is, of course, the local Yarn cache, so the Docker image needs to mount the Yarn cache directory when building the projects. Any other hints on how to use Docker and Yarn together?

@Daniel15
Member

For actually installing Yarn in a Docker or Vagrant image, you can use the Debian package repo (assuming you're using an Ubuntu or Debian Docker image). Enabling the package repo and then running apt-get install yarn will also install Node.js as a dependency.
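A minimal sketch of the package-repo installation described above. The base image tag and the keyring file name are assumptions chosen for illustration; the repository and key URLs are the ones published for the Yarn Debian repo.

```dockerfile
# Sketch only: the base image tag is an assumption, not from this thread.
FROM debian:bookworm-slim

# Tools needed to download and convert the repository signing key.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl gnupg \
 && rm -rf /var/lib/apt/lists/*

# Enable the Yarn package repo, then install yarn.
# Recommended packages are left enabled here so that Node.js is pulled in too.
RUN curl -fsSL https://dl.yarnpkg.com/debian/pubkey.gpg \
      | gpg --dearmor -o /usr/share/keyrings/yarn-archive-keyring.gpg \
 && echo "deb [signed-by=/usr/share/keyrings/yarn-archive-keyring.gpg] https://dl.yarnpkg.com/debian/ stable main" \
      > /etc/apt/sources.list.d/yarn.list \
 && apt-get update \
 && apt-get install -y yarn \
 && rm -rf /var/lib/apt/lists/*
```

The Node.js version installed this way is whatever the distribution ships, so an official node base image may be preferable when a specific Node.js version matters.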

As for mounting the cache, that's a pretty good idea, though I'm not sure how to do it (I'm not very familiar with Docker myself).

@kyteague

You wouldn't mount the Yarn cache directory. Instead, you should make sure you take advantage of Docker's image layer caching.

These are the commands I am using:

COPY package.json yarn.lock ./
RUN yarn --pure-lockfile
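For context, here is a sketch of how those two commands might sit in a complete Dockerfile so that layer caching pays off. The base image tag, working directory, and start command are assumptions borrowed from other comments in this thread, not part of the original suggestion.

```dockerfile
# Sketch only: base image tag, paths, and start command are assumptions.
FROM node:8.9.1-alpine
WORKDIR /usr/src/app

# Copy only the dependency manifests first, so the install layer below
# is reused as long as package.json and yarn.lock are unchanged.
COPY package.json yarn.lock ./
RUN yarn --pure-lockfile

# Changes to the source code only invalidate the layers from here on.
COPY . .

CMD ["node", "./server"]
```

With this ordering, a .dockerignore entry for node_modules keeps the final COPY from overwriting the modules installed inside the image.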

@ChristopherBiscardi

It would depend on the environment and approach you want to take to building your assets.

repeated docker build

@kyteague's approach is one you could take if you don't want to use a global cache and instead just cache the project's dependencies in a higher Docker layer (i.e., if you will be running docker build over and over without changing dependencies). If you change package.json, you lose the cache at that layer and have to do a full reinstall.

manual docker run

A more sophisticated approach for development is to run a development container (node:6 or similar, with Yarn installed) and mount the cache in to do the install. Note that the following uses a separate .docker-yarn-cache directory intended only for use with Docker, because libraries like node-sass can have C library issues if you install them on OS X and then try to use them on Debian, etc. Something like:

docker run -it -v ~/.docker-yarn-cache:/root/.yarn-cache -v "$(pwd)":/opt/project --workdir /opt/project node:6 yarn

Typically I combine the docker run approach with some Bash scripting and mount the caches in as volumes. At the end, I copy my assets out of the container and ship them to S3, etc., or COPY them into my production Docker image.

yarn on host

You could also run yarn on the host if its OS is similar to your container's (Debian on both, for example) and then write your Dockerfile to COPY the node_modules folder in with the rest of the project. This gives you access to your host's .yarn-cache for speed, and you don't have to deal with installing during the docker build.
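A sketch of what the "yarn on host" Dockerfile could look like. It assumes yarn has already been run on a host whose OS matches the image, and that .dockerignore does not exclude node_modules; the start command is a hypothetical placeholder.

```dockerfile
# Sketch only: assumes `yarn` already ran on a Debian-like host, so that
# native modules in node_modules match the Debian-based node:6 image.
FROM node:6
WORKDIR /opt/project

# Copies the project together with the host-installed node_modules.
COPY . .

# Hypothetical start command; replace with the project's own entry point.
CMD ["node", "./server"]
```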

docker-compose (development)

If your project has a "watch mode" script, you can use a docker-compose file to alleviate some of the concerns of the "repeated docker build" approach by running something along the lines of yarn && yarn run watch as the command with the same volume mounts as the "manual docker run" approach.


So really it depends on your goals and build environment ("watch" development with Docker for Mac/CI from a dev image to an alpine-based prod image/etc).

@Daniel15
Member

@kyteague and @ChristopherBiscardi, great comments! I think it would be valuable to add a page to the documentation around best practices for using Yarn in Docker. Would you like to write a page about "Using Yarn in Docker" for our documentation? The website is in a separate repo: https://github.com/yarnpkg/website

@daveisfera

Ideally, Docker could use an external cache, but there's some resistance to that (moby/moby#17745), so adding a --no-cache option would be good, so that the Docker image isn't made needlessly large by a cache that will never be used.

@rstuven

rstuven commented Oct 13, 2016

This is the fastest setup I've found so far, as yarn-cache can be reused by many containers:

Dockerfile

FROM node:6.7.0

RUN curl -o- -L https://yarnpkg.com/install.sh | bash

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

ARG NODE_ENV
ENV NODE_ENV=$NODE_ENV

COPY . /usr/src/app

EXPOSE 8000

ENTRYPOINT ["sh", "./entrypoint.sh"]

CMD ["node", "./server"]

entrypoint.sh

#!/bin/sh
set -e  # abort if the install fails
"$HOME/.yarn/bin/yarn" install --pure-lockfile
exec "$@"

docker-compose.yml

app:
  build: .
  volumes_from:
    - yarn-cache

yarn-cache:
  image: busybox
  volumes:
    - /root/.yarn-cache

@daveisfera

@rstuven That will do the yarn install at run time rather than at build time, which means the Docker image is not self-contained or fully reproducible (which is the reason we, and I believe many others, use Docker).

@rstuven

rstuven commented Oct 13, 2016

@daveisfera Yes, that's why I stressed its "fastest" quality. I neglected to point out that this is meant for a development workflow, where fast iterations matter most. On the other hand, the yarn.lock file should guarantee reproducibility, but yes, it's not enough.

@rstuven

rstuven commented Oct 14, 2016

@jeffijoe

Am I wrong to assume that Yarn's global cache stores the package archives (which contain cross-platform code, by which I mean it is compiled at install time)?

If the global cache contains only the downloaded archives and no build artifacts, could we not at least mount the host's global cache so that the Docker container doesn't have to download them?

@elaijuh

elaijuh commented Mar 6, 2017

@kyteague --pure-lockfile will not generate a yarn.lock file. Does that mean that if I change package.json and rebuild the image, the old yarn.lock will be copied into the image and be out of sync with the modified package.json?
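One way to surface such a mismatch at build time is the --frozen-lockfile flag that appears later in this thread, which does not write yarn.lock and fails if the lockfile would need an update. A minimal sketch, with the base image tag and working directory as assumptions:

```dockerfile
# Sketch only: base image tag and working directory are assumptions.
FROM node:8.9.1-alpine
WORKDIR /usr/src/app

COPY package.json yarn.lock ./

# Fails the build if yarn.lock would have to change to satisfy package.json,
# instead of silently installing from a stale lockfile.
RUN yarn install --frozen-lockfile
```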

@zoidyzoidzoid

We use rocker which allows build-time mounting of a yarn-cache with a custom MOUNT command that doesn't commit the cache to the final Docker image, while using the smart Docker build layer caching as well.

@bestander
Member

Looks like there are plenty of ways now.
If anyone wants to submit a good way to do this, feel free to send a PR for the docs website https://github.com/yarnpkg/website

@komlevv

komlevv commented Aug 27, 2017

Another way is to MITM Yarn's traffic with a caching proxy and a self-signed certificate, using the cafile option.
Here's a crude example: https://github.com/komlevv/docker-squid-cache
It has two services: a caching proxy and a root certificate server.

  • this works during build stage - you don't lose the cache when package.json changes
  • other services can also take advantage of the cache, not limited to yarn
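A rough sketch of how a build could be pointed at such a proxy. This is not taken from the linked repository: the proxy URL, the certificate file name, and the way the certificate reaches the build context are all hypothetical placeholders.

```dockerfile
# Sketch only: proxy URL and certificate file are hypothetical placeholders.
FROM node:8.9.1-alpine
WORKDIR /usr/src/app

# Address of the caching proxy, overridable with --build-arg PROXY_URL=...
ARG PROXY_URL=http://proxy.example:3128

# Root certificate of the proxy, assumed to be present in the build context.
COPY proxy-ca.crt /usr/local/share/proxy-ca.crt

COPY package.json yarn.lock ./

# Trust the proxy's certificate and route registry traffic through it.
RUN yarn config set cafile /usr/local/share/proxy-ca.crt \
 && yarn config set proxy "$PROXY_URL" \
 && yarn config set https-proxy "$PROXY_URL" \
 && yarn install --frozen-lockfile
```

Note that yarn config set persists these settings in the image, which is usually only acceptable for build or CI images.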

@koistya

koistya commented Nov 16, 2017

For production

COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile --no-cache --production

Note: You don't want dev dependencies in a production image. You also need to make sure that Yarn's cache folder is not bundled into the image.

For test and CI

COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile

Note: In a test / CI environment you still want to install npm modules via the Docker builder in order to use Docker layer caching. The next time your image is built on a CI server, these two steps will be skipped in favor of an existing layer, unless either package.json or yarn.lock has changed.

For local development

COPY package.json yarn.lock ./

Note: In development mode (locally) it is faster to install npm modules at run time; this way you can attach a volume with the Yarn cache to your container.


The approach above can be implemented by using a single Dockerfile:

FROM node:8.9.1-alpine

ARG NODE_ENV=production
ENV NODE_ENV=$NODE_ENV

# Set a working directory
WORKDIR /usr/src/app

# Install native dependencies
# RUN set -ex; \
#   apk add --no-cache ...

# Install Node.js dependencies
COPY package.json yarn.lock ./
RUN set -ex; \
  if [ "$NODE_ENV" = "production" ]; then \
    yarn install --no-cache --frozen-lockfile --production; \
  elif [ "$NODE_ENV" = "test" ]; then \
    yarn install --no-cache --frozen-lockfile; \
  fi;

...

Note: It's better to install native dependencies, if any, via a separate RUN command coming before yarn install.

docker-compose.yml:
version: '3'

volumes:
  yarn:

services:
  api:
    image: api
    build:
      context: ./
      args:
        NODE_ENV: "development"
    volumes:
      - yarn:/home/node/.cache/yarn
      - ./src:/usr/src/app/src
      - ./package.json:/usr/src/app/package.json
      - ./yarn.lock:/usr/src/app/yarn.lock
    ...

Source: Node.js API Starter Kit - Node.js ❤ GraphQL

@zoidyzoidzoid

You can also do it in a multi stage Dockerfile for production, like the following for something that runs as a static front end and doesn't need node/yarn at runtime:

FROM node:alpine
WORKDIR /usr/src/app
COPY . /usr/src/app/

# We don't need to do this cache clean, I guess it wastes time / saves space: https://github.com/yarnpkg/rfcs/pull/53
RUN set -ex; \
  yarn install --frozen-lockfile --production; \
  yarn cache clean; \
  yarn run build

FROM nginx:alpine
WORKDIR /usr/share/nginx/html
COPY --from=0 /usr/src/app/build/ /usr/share/nginx/html

Note: If --no-cache has been added by now, we could use it and skip the cache clean.

Source

@johnculviner

The only way I can find to avoid an extra 100MB of cache is to do this on the latest version of Yarn (1.5.1):

RUN yarn install --frozen-lockfile --production && yarn cache clean

@alecmev

alecmev commented Jun 4, 2018

Just in case: there's no --no-cache option yet, so use yarn cache clean for now.

@ezpuzz

ezpuzz commented Aug 2, 2018

yarnpkg/rfcs#53 (comment)

From this comment, I like the concise approach of using /dev/shm as volatile storage for the cache.
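A sketch of the /dev/shm idea, assuming it is wired up through Yarn's --cache-folder option. The base image tag and the cache directory name are illustrative assumptions.

```dockerfile
# Sketch only: base image tag and cache directory name are assumptions.
FROM node:8.9.1-alpine
WORKDIR /usr/src/app

COPY package.json yarn.lock ./

# /dev/shm is an in-memory filesystem inside the build container, so the
# cache written there is not committed to the resulting image layer.
RUN yarn install --frozen-lockfile --production --cache-folder /dev/shm/yarn-cache
```

Docker limits /dev/shm to 64 MB by default, so larger dependency trees may need docker build --shm-size to raise the limit.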

chrisroos added a commit to Crown-Commercial-Service/crown-marketplace that referenced this issue Oct 8, 2018
This is a modified version of the changes in PR #2.

I copied the installation arguments from this GitHub issue[1].

[1]: yarnpkg/yarn#749 (comment)
@jedwards1211

jedwards1211 commented Jan 14, 2019

@jeremejevs to prevent the Yarn cache from winding up in Docker layers, we would need to run both steps in a single RUN yarn install && yarn cache clean command in our Dockerfile, right?
I don't know for sure, but I assume a RUN yarn cache clean command on its own would just mark the cache dir as deleted in a new Docker layer, while the earlier RUN yarn install layer would still contain the entire cache.

@alecmev

alecmev commented Jan 26, 2019

@jedwards1211 That is correct, yes.
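The difference between the two layouts can be sketched as follows. The base image tag and working directory are assumptions; the install command is the one used earlier in this thread.

```dockerfile
# Sketch only: base image tag and working directory are assumptions.
FROM node:8.9.1-alpine
WORKDIR /usr/src/app

COPY package.json yarn.lock ./

# Install and clean in one RUN, so the cache is gone before the layer
# is committed.
RUN yarn install --frozen-lockfile --production && yarn cache clean

# By contrast, splitting the steps leaves the cache in the first layer:
#   RUN yarn install --frozen-lockfile --production
#   RUN yarn cache clean
```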

@worldspawn

A tad off topic but I'm doing RUN yarn install --frozen-lockfile --production --no-cache && yarn build and I get a big delay before it transitions to the next layer. If I appended && rm -rf ./node_modules to the command (as I have no need of them after build) would that reduce the delay/produce a leaner layer?

@jedwards1211

AFAICT yarn still doesn't have a --no-cache option

@daveisfera

Yes, yarn cache clean is the only way to avoid putting the cache in your layer. There's also an experimental feature that allows you to mount a cache directory during build (but I haven't had much success with making it work effectively yet). Multi-stage builds also have a lot of promise but are still difficult to use.
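The comment does not name the experimental feature; assuming it refers to BuildKit cache mounts, a build-time cache mount might look like the sketch below. The cache path is an arbitrary choice, and BuildKit must be enabled for the RUN --mount syntax to be accepted.

```dockerfile
# syntax=docker/dockerfile:1
# Sketch only: assumes BuildKit is enabled; the cache path is arbitrary.
FROM node:8.9.1-alpine
WORKDIR /usr/src/app

COPY package.json yarn.lock ./

# The cache mount persists between builds on the same builder but is not
# part of the image, so no separate `yarn cache clean` is needed.
RUN --mount=type=cache,target=/tmp/yarn-cache \
    yarn install --frozen-lockfile --production --cache-folder /tmp/yarn-cache
```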

alee added a commit to virtualcommons/port-of-mars that referenced this issue Apr 28, 2020
we may want to add --frozen-lockfile to our yarn installs as well as run
yarn cache clean to reduce image size

e.g., in production:

`yarn install --frozen-lockfile --production && yarn cache clean`

https://stackoverflow.com/questions/44552348/should-i-commit-yarn-lock-and-package-lock-json-files

yarnpkg/yarn#749