Pull the image given in CACHE_FROM argument #885

Closed

sgibson91
Member

sgibson91 commented Apr 28, 2020

Summary

Write a function that takes a list of Docker images to use as
cache sources for the build phase. Check each image name for a
tag and default to 'latest' if none is given. Pull each image
from Docker Hub so that it is available locally to be used as a
cache during the build phase.
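
A minimal sketch of the steps described above, not the code from this PR. The names `with_default_tag`, `docker_pull` and `pull_cache_images` are hypothetical, and the sketch assumes the `docker` CLI is on the PATH instead of using a Docker client library. The pull step is passed in as an argument so the tag handling can be exercised without a Docker daemon.

```python
import subprocess


def with_default_tag(image, default_tag="latest"):
    """Return `image` unchanged if it has a tag or digest, else add a default tag."""
    # A digest reference (name@sha256:...) already pins an exact image.
    if "@" in image:
        return image
    # Only look for a tag after the last "/", so that a registry port such
    # as "localhost:5000/foo" is not mistaken for a tag.
    last_component = image.rsplit("/", 1)[-1]
    if ":" in last_component:
        return image
    return f"{image}:{default_tag}"


def docker_pull(image):
    """Pull one image with the docker CLI (assumes `docker` is on the PATH)."""
    subprocess.run(["docker", "pull", image], check=True)


def pull_cache_images(images, pull=docker_pull):
    """Pull every image in `images` and return the fully tagged names."""
    pulled = []
    for image in images:
        ref = with_default_tag(image)
        pull(ref)
        pulled.append(ref)
    return pulled
```

Passing a stand-in for `pull`, for example `pull_cache_images(names, pull=print)`, shows which references would be pulled without contacting a registry.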

This was the behaviour I expected to see when using the CACHE_FROM flag, but it appears that this is not the case. See issue #130.

I have written a function that consumes the list of images, checks for tags and then pulls them. I'm not sure where to put this within the repo2docker codebase though, so any guidance on that would be super appreciated ✨

It may also be useful to combine with the find_image function so we don't pull what we already have? https://github.com/jupyter/repo2docker/blob/bbc3ee02c0755b15ea456f9ae18dd76b904568e7/repo2docker/app.py#L613-L625
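
One way the "don't pull what we already have" check could look, as a rough sketch only. The helper names are hypothetical, and `client` is assumed to behave like `docker.APIClient`, whose `images()` method returns dicts with a `"RepoTags"` entry that is a list or `None`. The image names passed in are assumed to already carry a tag.

```python
def local_image_tags(client):
    """Collect every tag known to the local Docker daemon.

    `client` is assumed to expose images() returning dicts whose
    "RepoTags" entry is a list of "name:tag" strings or None.
    """
    tags = set()
    for image in client.images():
        # Untagged images report None (or no entry) for RepoTags.
        tags.update(image.get("RepoTags") or [])
    return tags


def images_missing_locally(client, wanted):
    """Return the entries of `wanted` (fully tagged names) not present locally."""
    present = local_image_tags(client)
    return [ref for ref in wanted if ref not in present]
```

Only the names returned by `images_missing_locally` would then need to be pulled.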

Outstanding TODOs

  • Move the code snippet to an appropriate place within the repo2docker codebase
  • Swap print statements for the custom logger
  • We may also want to do some handling of the pull output, like in push_image

https://github.com/jupyter/repo2docker/blob/bbc3ee02c0755b15ea456f9ae18dd76b904568e7/repo2docker/app.py#L458-L499
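
For the last TODO, a hedged sketch of what handling the pull output might involve. It is not taken from `push_image`. It assumes the pull output arrives as an iterable of already decoded dicts, with `"status"` and optional `"id"` keys for progress and an `"error"` key on failure, which is the shape Docker's streaming responses usually have. The function name and logger name are hypothetical.

```python
import logging

log = logging.getLogger("pull-sketch")  # hypothetical logger name


def report_pull_progress(events, log=log):
    """Log status messages from a decoded pull stream and fail on errors.

    Returns the last status message seen, or None if there was none.
    """
    last_status = None
    for event in events:
        # Surface a failed pull as an exception instead of a log line.
        if "error" in event:
            raise RuntimeError(f"docker pull failed: {event['error']}")
        status = event.get("status")
        if status:
            layer = event.get("id")
            message = f"{layer}: {status}" if layer else status
            log.info(message)
            last_status = message
    return last_status
```

Raising on the first `"error"` event keeps a failed pull from being mistaken for a successful one.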

@betatim
Member

betatim commented Apr 29, 2020

This looks nice!

Not super sure where we could put things. Like you say it is related to find_image and push_image. Adding the new method to the app class is probably easiest. However that file (and class) is huge already. Maybe it is time to move these things out to a "docker utilities" file. From a quick read these three methods all create a docker client, take a (bunch of) strings as inputs and produce some log output.

If we can make them three separate functions that take their inputs as arguments and then do their thing, that would be nice I think. It would also make them easier to test because you don't need to set up the whole app first.

@manics
Member

manics commented Apr 29, 2020

@betatim would #848 help since it moves the Docker API calls to a separate class?

@betatim
Member

betatim commented Apr 29, 2020

I'll take a look at #848; I had lost track of that PR.

@sgibson91
Member Author

Just following up on this and/or #848 :)

manics marked this pull request as draft on January 26, 2022, 19:21
consideRatio changed the title from "[WIP] Pull the image given in CACHE_FROM argument" to "Pull the image given in CACHE_FROM argument" on Oct 30, 2022
@consideRatio
Member

consideRatio commented Oct 30, 2022

I have written a function that consumes the list of images, checks for tags and then pulls them. I'm not sure where to put this within the repo2docker codebase though, so any guidance on that would be super appreciated ✨

I agree it's not so intuitive that --cache-from doesn't pull for you, I've run into this as well!

It is a passthrough option to the docker build command though (or to whatever --engine is in use, if it accepts the option). I think that adding functionality on top of a passthrough option would add more complexity than we can sustainably maintain in repo2docker atm.

I'll go for a close on this for now to help triage PRs in this project, please don't see that as a final decision or similar!

Update - Did we all misunderstand --cache-from?

Oh actually, I think if --cache-from is specified, it means something entirely different than we all have been thinking. Check out https://docs.docker.com/engine/reference/commandline/build/#specifying-external-cache-sources.

I think it can be relevant if you have built an image like this, where you have a --mount=type=cache, and have built the image with BUILDKIT_INLINE_CACHE=1.

FROM python:3.9-bullseye

# install wheels built in the build-stage
# (build-stage refers to an earlier stage that is not shown in this excerpt)
COPY requirements.txt /tmp/requirements.txt
ARG PIP_CACHE_DIR=/tmp/pip-cache
RUN --mount=type=cache,target=${PIP_CACHE_DIR} \
    --mount=type=cache,from=build-stage,source=/tmp/wheels,target=/tmp/wheels \
    pip install \
        --find-links=/tmp/wheels/ \
        -r /tmp/requirements.txt
