AWS Private ECR Pull, Build, Push - not pulling in CI #566

Closed
ShedPlant opened this issue Mar 24, 2023 · 7 comments · Fixed by #601
Labels: awaiting-feedback (Blocked on input from the author), kind/bug (Some behavior is incorrect or out of spec), resolution/fixed (This issue was fixed)

Comments

ShedPlant commented Mar 24, 2023

What happened?

I have:

  • Application with multi-stage Dockerfile, takes 5-10m to build from scratch
  • Pulumi resource for AWS Private ECR, and Docker image
  • GitHub Actions workflow to call Pulumi up

I upgraded to pulumi-docker 4.0.0 because I saw it enables BuildKit by default, and I was hoping it would fix my caching problem 😁.

But in the GHA CI, even though v4 seems to be quicker than v3, the build still always takes a few minutes and I can see logs (thanks for improved logging!) showing it's compiling. I expected that the 2nd time CI ran, this would be cached.

On my local machine, running pulumi up a second time is very quick to build the image 👍 .

I think the image isn't being pulled first in CI.
I see resolve docker.io/amazon/aws-lambda-python:3.9@sha256:96385e8762b8ef280957fdc33963f18a6af3b38a61530dc5b58ffd0e46afbb3d
but nothing about pulling my_app.

Do I have to do the pull explicitly first with RemoteImage? I tried this but it wasn't obvious how to make that work with a private ECR 😕 .

Am I doing it wrong 🙂 ?

Expected Behavior

I want:

  1. pull from private ECR
  2. build fast
  3. push to private ECR

Steps to reproduce

My Dockerfile looks something like:

FROM amazon/aws-lambda-python:3.9 as base_image
# yum install some dependency required for both build and runtime

FROM base_image as installed_deps
# yum install some build dependencies
# make install on a source dependency

FROM installed_deps as installed_python_deps
# pip install

FROM base_image as prod
COPY --from=installed_python_deps $VENV_PATH $VENV_PATH
COPY --from=installed_deps $LD_LIBRARY_PATH $LD_LIBRARY_PATH
COPY my_app my_app

My pulumi resource looks something like:

import pulumi
import pulumi_aws as aws
import pulumi_docker as docker

repo = aws.ecr.Repository(
    resource_name="my_app",
)

# Copied _get_registry_info from
# https://github.com/pulumi/pulumi-docker/blob/master/examples/aws-container-registry/py/__main__.py
registry = repo.registry_id.apply(_get_registry_info)

image = docker.Image(
    "my_app:latest",
    build=docker.DockerBuildArgs(
        context="/full/path/to/my_app",
        dockerfile="Dockerfile",
        args={"BUILDKIT_INLINE_CACHE": "1"},  # may not be necessary?
        # Use the previously pushed image as the build cache source.
        cache_from=docker.CacheFromArgs(
            images=[
                pulumi.Output.concat(repo.repository_url, ":", "latest")
            ]
        ),
        platform="linux/amd64",
    ),
    registry=registry,
    image_name=pulumi.Output.concat(repo.repository_url, ":", "latest"),
)
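
The `_get_registry_info` helper referenced above is not shown in this issue. As a rough sketch of the part that does not depend on Pulumi: an ECR authorization token is a base64-encoded `user:password` string (the user is `AWS`), so a helper like that has to decode and split the token before the values can be used as registry credentials. The function name `split_ecr_token` and the sample token below are illustrative assumptions, not code taken from the linked example.

```python
import base64
from typing import Tuple


def split_ecr_token(authorization_token: str) -> Tuple[str, str]:
    """Decode an ECR authorization token into (username, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    # Split on the first colon only; everything after it is the password.
    username, sep, password = decoded.partition(":")
    if not sep or not username or not password:
        raise ValueError("unexpected ECR authorization token format")
    return username, password


# Illustrative token only; a real one comes from ECR's GetAuthorizationToken API.
sample_token = base64.b64encode(b"AWS:example-password").decode("ascii")
username, password = split_ecr_token(sample_token)
```

In a Pulumi program, the two values would typically be combined with the registry endpoint and passed to the `registry` argument of `docker.Image`; the exact argument type depends on the pulumi-docker version in use.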

Output of pulumi about

I don't want to copy/paste this from a work project. I'll try to provide specific info as requested if relevant.

Additional context

No response


@ShedPlant ShedPlant added kind/bug Some behavior is incorrect or out of spec needs-triage Needs attention from the triage team labels Mar 24, 2023
@ShedPlant ShedPlant changed the title AWS Private ECR Pull, Build, Push AWS Private ECR Pull, Build, Push - not pulling in CI Mar 24, 2023
guineveresaenger (Contributor) commented

hi @ShedPlant - thank you for filing this issue.

To answer your first question: you should not need to pull anything using RemoteImage, especially given that your code works locally as expected 👍 .

This does bring up the difficulty in reproducing this bug, because we do not know how your CI is set up, and this could be an authentication issue.

However, we did in fact fix a few issues around inline caching in #537, so we would encourage you to give v4.1.0 a try!

@guineveresaenger guineveresaenger added awaiting-feedback Blocked on input from the author and removed needs-triage Needs attention from the triage team labels Mar 27, 2023
ShedPlant (Author) commented Mar 28, 2023

I've upgraded from pulumi-docker 4.0.0 to pulumi-docker 4.1.0 (and raised a separate bug #572 ).

To clarify, this isn't a CI-only problem; I see the same locally.
The pulumi docker build uses the local cache if present, but doesn't pull the image first.

On first build:

  • docker system prune -a (because in CI I know this will be the starting point)
  • changed a single line of code in the app, but all the dependencies from the cached build would still be valid
  • ran pulumi up --yes --non-interactive (to get better view of logs)

Expected behaviour:

  • pull the app image
  • build, caching most early layers
  • build only the later layer

Actual behaviour:

  • pull the app's base image
  • build all layers from scratch 👎

If I then revert that single-line change and run pulumi up a second time, the log doesn't print 'caching' anywhere, but I can see that the slow early compile steps are not being run 👍 .

Of course, in CI, the environment is freshly created each time so it's always the first run.

ShedPlant (Author) commented

If I pull the docker image manually as a prior step, the docker build caches as expected:

  • docker system prune -a
  • aws ecr get-login-password --region=my_aws_region | docker login --username AWS --password-stdin my_aws_account_number.dkr.ecr.my_aws_region.amazonaws.com
  • docker pull my_aws_account_number.dkr.ecr.my_aws_region.amazonaws.com/my-app:latest
  • change a single line of code
  • pulumi up --yes --non-interactive only builds the later layer
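
For a CI job, the manual steps above could be scripted as a pre-step that runs before `pulumi up`. The following is only a minimal sketch of that workaround, assuming the `aws` and `docker` CLIs are on the PATH and already have AWS credentials available; the function names are hypothetical, and the account ID, region and repository are placeholders to be supplied by the caller.

```python
import subprocess
from typing import List


def ecr_registry_host(account_id: str, region: str) -> str:
    """Build the private ECR registry hostname for an account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def login_password_command(region: str) -> List[str]:
    """Command that prints a temporary ECR login password to stdout."""
    return ["aws", "ecr", "get-login-password", f"--region={region}"]


def docker_login_command(host: str) -> List[str]:
    """Command that logs Docker in to the registry, reading the password from stdin."""
    return ["docker", "login", "--username", "AWS", "--password-stdin", host]


def docker_pull_command(host: str, repository: str, tag: str = "latest") -> List[str]:
    """Command that pulls the previously pushed image so its layers are available locally."""
    return ["docker", "pull", f"{host}/{repository}:{tag}"]


def pre_pull(account_id: str, region: str, repository: str, tag: str = "latest") -> None:
    """Log in to ECR and pull the image, mirroring the manual steps in this comment."""
    host = ecr_registry_host(account_id, region)
    # Equivalent of: aws ecr get-login-password | docker login --password-stdin
    password = subprocess.run(
        login_password_command(region), check=True, capture_output=True, text=True
    ).stdout
    subprocess.run(docker_login_command(host), input=password, check=True, text=True)
    subprocess.run(docker_pull_command(host, repository, tag), check=True)
```

Calling `pre_pull("my_aws_account_number", "my_aws_region", "my-app")` before `pulumi up --yes --non-interactive` would reproduce the sequence above. On the very first deployment the pull fails because no image exists yet, so a real pipeline would probably want to tolerate that case.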

ShedPlant (Author) commented

Btw I'm not alleging it worked differently in v3. Maybe this is a feature request, or how it's supposed to work 🤷 .

My use case is I want my builds to happen as fast as possible. I realise that caching apt or yum will potentially keep old versions but I'm fine with that.

DominicRoyStang commented

I'm also seeing this when using cacheFrom with an image in ECR.

Could it have something to do with this? aws/containers-roadmap#876

From the resource description:

"A list of image names to use as build cache. Images provided must have a cache manifest. Must provide authentication to cache registry."

Side note: these details are missing from the official docs https://www.pulumi.com/registry/packages/docker/api-docs/image/#cachefrom

guineveresaenger (Contributor) commented

Thank you for the link @DominicRoyStang. ❤️

@ShedPlant - in order for us to help you out better, since this runs in Actions, can you please give us a link to a minimal repro in a github repository? We, too, would like everyone's builds to be as fast as possible.

ShedPlant (Author) commented

https://github.com/curoo/pulumi-docker-investigations/tree/master/issues/issue_566

https://github.com/curoo/pulumi-docker-investigations/actions/runs/4577136784/jobs/8082179426

   ~  docker:index:Image issue_566:image updating (10s) [diff: ~build]; a long running process starts...
  
   ~  docker:index:Image issue_566:image updating (10s) [diff: ~build]; digest: sha256:65645c4e766bd4a5ff86ac23c70d03a44632905030e902df45b750e85b07b282
  
   ~  docker:index:Image issue_566:image updating (10s) [diff: ~build]; digest: sha256:c5cf2507900accae0fb5f682d95c04bdf275ea71c01a8cfefe086b32f3a75df6
  
  @ Updating....
  .
  .
  .
  .
  .
  .
  .
  .
  .
  .
  
   ~  docker:index:Image issue_566:image updating (20s) [diff: ~build]; digest: sha256:c5cf2507900accae0fb5f682d95c04bdf275ea71c01a8cfefe086b32f3a75df6
  
   ~  docker:index:Image issue_566:image updating (20s) [diff: ~build]; digest: sha256:116328b5c42ad802e1e8d440ab3660a14dd99c07bd9226b6b8ab5b5af0cae947
  
   ~  docker:index:Image issue_566:image updating (20s) [diff: ~build]; a long running process ends
