Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-stage COPY --from uses wrong cache #1262

Closed
ronnn opened this issue May 11, 2020 · 15 comments
Closed

Multi-stage COPY --from uses wrong cache #1262

ronnn opened this issue May 11, 2020 · 15 comments

Comments

@ronnn
Copy link

ronnn commented May 11, 2020

Actual behavior

We're using kaniko to build our images in a GitLab runner within OpenShift. When the first COPY expression in a Dockerfile is a COPY --from expression, the cache is used, even though a previous stage produced a layer which is different from the one in the cache.

I created https://github.com/ronnn/kaniko-caching as a reference to reproduce the bug.

Also see: False-positive cache-hit on COPY --from #870

Expected behavior
We expect the date that is echoed to always be the date of the corresponding build image.

To Reproduce
Steps to reproduce the behavior:

  1. Set some variables
export KANIKO_IMAGE="kaniko-fixed:latest"
export REGISTRY_TOKEN="FOOBAR"
export REGISTRY="url.to.registry.org"
export NAMESPACE="mynamespace"
# only needed on Windows machines
export MSYS_NO_PATHCONV=1 
  1. Start an interactive kaniko container. This builds a new kaniko image to fix uclibc BusyBox in Distroless debug images fails with "ls: Value too large for defined data type" distroless#225 (comment).
./run-kaniko.sh
  1. Execute the build
./app/build.sh

This produces the following output. See how the date that should be copied from the second build image (Mon May 11 08:53:46 UTC 2020) is the one from the first build image instead (Mon May 11 08:52:10 UTC 2020). The third run without the cache works as expected.

If the COPY expressions in Dockerfile_release are switched (see commented lines) the cache is working as expected. We'll try to uses this as a workaround for now.

Kaniko version :  v0.22.0
INFO[0001] Retrieving image manifest busybox:latest
INFO[0035] Retrieving image manifest busybox:latest
INFO[0042] Built cross stage deps: map[]
INFO[0042] Retrieving image manifest busybox:latest
INFO[0053] Retrieving image manifest busybox:latest
INFO[0066] Executing 0 build triggers
INFO[0066] Checking for cached layer url.to.registry.org/mynamespace/cache:9e7fbbeb8c17d40dbb7dc6cc44d1d7b158c145bbe10eb7906bf11b59b2981c95...
INFO[0067] No cached layer found for cmd COPY ./date.txt /tmp/date.txt
INFO[0067] Unpacking rootfs as cmd COPY ./date.txt /tmp/date.txt requires it.
INFO[0069] COPY ./date.txt /tmp/date.txt
INFO[0069] Resolving 1 paths
INFO[0069] Taking snapshot of files...
INFO[0069] RUN cat /tmp/date.txt
INFO[0069] Taking snapshot of full filesystem...
INFO[0069] Pushing layer url.to.registry.org/mynamespace/cache:9e7fbbeb8c17d40dbb7dc6cc44d1d7b158c145bbe10eb7906bf11b59b2981c95 to cache now
INFO[0069] Resolving 421 paths
INFO[0069] cmd: /bin/sh
INFO[0069] args: [-c cat /tmp/date.txt]
INFO[0069] Running: [/bin/sh -c cat /tmp/date.txt]
Mon May 11 08:52:10 UTC 2020
INFO[0069] Taking snapshot of full filesystem...
INFO[0069] Resolving 421 paths
INFO[0070] No files were changed, appending empty layer to config. No layer added to image.
INFO[0070] Pushing layer url.to.registry.org/mynamespace/cache:1e9b48d0ce22cc721bb6bc8320bc0c7829b8de1edf9d91fd5f1e7fd8898e279a to cache now
INFO[0072] Deleting filesystem...
INFO[0001] Resolved base name url.to.registry.org/mynamespace/cache-debug-build to build
INFO[0001] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0002] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0003] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0004] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0005] Built cross stage deps: map[0:[/tmp/date.txt]]
INFO[0005] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0005] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0006] Executing 0 build triggers
INFO[0008] Saving file tmp/date.txt for later use
INFO[0008] Deleting filesystem...
INFO[0008] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0009] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0010] Executing 0 build triggers
INFO[0010] Checking for cached layer url.to.registry.org/mynamespace/cache:b870ad7d7ca4ac6f65ee602878be784609d8c08d0581b936a63afb6c38da0b23...
INFO[0010] No cached layer found for cmd COPY --from=build /tmp/date.txt /tmp/date.txt
INFO[0010] Unpacking rootfs as cmd COPY --from=build /tmp/date.txt /tmp/date.txt requires it.
INFO[0014] COPY --from=build /tmp/date.txt /tmp/date.txt
INFO[0014] Resolving 1 paths
INFO[0014] Taking snapshot of files...
INFO[0014] Pushing layer url.to.registry.org/mynamespace/cache:b870ad7d7ca4ac6f65ee602878be784609d8c08d0581b936a63afb6c38da0b23 to cache now
INFO[0014] COPY ./date.txt /tmp/date-direct.txt
INFO[0014] Resolving 1 paths
INFO[0014] Taking snapshot of files...
INFO[0014] RUN cat /tmp/date-direct.txt
INFO[0014] Taking snapshot of full filesystem...
INFO[0014] Pushing layer url.to.registry.org/mynamespace/cache:54e55c1b901e4ac0fd885f94a708961a1f76d633a600c846c1659fb0568523a8 to cache now
INFO[0014] Resolving 484 paths
INFO[0014] cmd: /bin/sh
INFO[0014] args: [-c cat /tmp/date-direct.txt]
INFO[0014] Running: [/bin/sh -c cat /tmp/date-direct.txt]
Mon May 11 08:52:10 UTC 2020
INFO[0014] Taking snapshot of full filesystem...
INFO[0014] Resolving 484 paths
INFO[0014] No files were changed, appending empty layer to config. No layer added to image.
INFO[0014] RUN cat /tmp/date.txt
INFO[0014] cmd: /bin/sh
INFO[0014] args: [-c cat /tmp/date.txt]
INFO[0014] Running: [/bin/sh -c cat /tmp/date.txt]
INFO[0014] Pushing layer url.to.registry.org/mynamespace/cache:efe4fad5f5de6c7ce71d85b86c116b494a519c4db721893dd139036c2b48970c to cache now
Mon May 11 08:52:10 UTC 2020
INFO[0014] Taking snapshot of full filesystem...
INFO[0014] Resolving 484 paths
INFO[0014] No files were changed, appending empty layer to config. No layer added to image.
INFO[0014] Pushing layer url.to.registry.org/mynamespace/cache:7cac43ba931e4c0124bf82c06acb0d1f2613658bb935429034f1cb8f8b82ea6f to cache now
INFO[0017] Deleting filesystem...
INFO[0001] Retrieving image manifest busybox:latest
INFO[0018] Retrieving image manifest busybox:latest
INFO[0041] Built cross stage deps: map[]
INFO[0041] Retrieving image manifest busybox:latest
INFO[0053] Retrieving image manifest busybox:latest
INFO[0057] Executing 0 build triggers
INFO[0057] Checking for cached layer url.to.registry.org/mynamespace/cache:b5bde338858da0340a654d585b9d68bba909b416e780fcbb911dad73a27d706a...
INFO[0058] No cached layer found for cmd COPY ./date.txt /tmp/date.txt
INFO[0058] Unpacking rootfs as cmd COPY ./date.txt /tmp/date.txt requires it.
INFO[0059] COPY ./date.txt /tmp/date.txt
INFO[0059] Resolving 1 paths
INFO[0059] Taking snapshot of files...
INFO[0059] RUN cat /tmp/date.txt
INFO[0059] Taking snapshot of full filesystem...
INFO[0059] Pushing layer url.to.registry.org/mynamespace/cache:b5bde338858da0340a654d585b9d68bba909b416e780fcbb911dad73a27d706a to cache now
INFO[0059] Resolving 421 paths
INFO[0060] cmd: /bin/sh
INFO[0060] args: [-c cat /tmp/date.txt]
INFO[0060] Running: [/bin/sh -c cat /tmp/date.txt]
Mon May 11 08:53:46 UTC 2020
INFO[0060] Taking snapshot of full filesystem...
INFO[0060] Resolving 421 paths
INFO[0060] No files were changed, appending empty layer to config. No layer added to image.
INFO[0060] Pushing layer url.to.registry.org/mynamespace/cache:c62bc874cd44c94c39288675b7c6726b9914ff9765d95d9385f6e745bc617f30 to cache now
INFO[0062] Deleting filesystem...
INFO[0001] Resolved base name url.to.registry.org/mynamespace/cache-debug-build to build
INFO[0001] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0002] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0003] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0004] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0005] Built cross stage deps: map[0:[/tmp/date.txt]]
INFO[0005] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0005] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0006] Executing 0 build triggers
INFO[0008] Saving file tmp/date.txt for later use
INFO[0008] Deleting filesystem...
INFO[0008] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0009] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0010] Executing 0 build triggers
INFO[0010] Checking for cached layer url.to.registry.org/mynamespace/cache:b870ad7d7ca4ac6f65ee602878be784609d8c08d0581b936a63afb6c38da0b23...
INFO[0011] Using caching version of cmd: COPY --from=build /tmp/date.txt /tmp/date.txt
INFO[0011] Checking for cached layer url.to.registry.org/mynamespace/cache:4d89e113f9e0be1474ad548ee7754d0891bed3f857bcaa33bdc61852c899087b...
INFO[0011] No cached layer found for cmd COPY ./date.txt /tmp/date-direct.txt
INFO[0011] Unpacking rootfs as cmd COPY ./date.txt /tmp/date-direct.txt requires it.
INFO[0015] COPY --from=build /tmp/date.txt /tmp/date.txt
INFO[0015] Found cached layer, extracting to filesystem
INFO[0015] COPY ./date.txt /tmp/date-direct.txt
INFO[0015] Resolving 1 paths
INFO[0015] Taking snapshot of files...
INFO[0015] RUN cat /tmp/date-direct.txt
INFO[0015] Taking snapshot of full filesystem...
INFO[0015] Pushing layer url.to.registry.org/mynamespace/cache:4d89e113f9e0be1474ad548ee7754d0891bed3f857bcaa33bdc61852c899087b to cache now
INFO[0015] Resolving 484 paths
INFO[0015] cmd: /bin/sh
INFO[0015] args: [-c cat /tmp/date-direct.txt]
INFO[0015] Running: [/bin/sh -c cat /tmp/date-direct.txt]
Mon May 11 08:53:46 UTC 2020
INFO[0015] Taking snapshot of full filesystem...
INFO[0015] Resolving 484 paths
INFO[0015] No files were changed, appending empty layer to config. No layer added to image.
INFO[0015] RUN cat /tmp/date.txt
INFO[0015] cmd: /bin/sh
INFO[0015] args: [-c cat /tmp/date.txt]
INFO[0015] Pushing layer url.to.registry.org/mynamespace/cache:57f0c28085ae7ff04514f70465ba0efc79c824f5c9d15447b61f1167809bd7be to cache now
INFO[0015] Running: [/bin/sh -c cat /tmp/date.txt]
Mon May 11 08:52:10 UTC 2020
INFO[0015] Taking snapshot of full filesystem...
INFO[0015] Resolving 484 paths
INFO[0015] No files were changed, appending empty layer to config. No layer added to image.
INFO[0015] Pushing layer url.to.registry.org/mynamespace/cache:aefc67efc9c272b140335083e47e906c357ce7454e6c841dd6a2d242d06d6613 to cache now
INFO[0018] Deleting filesystem...
INFO[0001] Retrieving image manifest busybox:latest
INFO[0035] Retrieving image manifest busybox:latest
INFO[0050] Built cross stage deps: map[]
INFO[0050] Retrieving image manifest busybox:latest
INFO[0062] Retrieving image manifest busybox:latest
INFO[0066] Executing 0 build triggers
INFO[0066] Checking for cached layer url.to.registry.org/mynamespace/cache:98d8765d3fd91b4a83d7aff1661c902968b313bc78a65791d3f7dd9a1684fb00...
INFO[0067] No cached layer found for cmd COPY ./date.txt /tmp/date.txt
INFO[0067] Unpacking rootfs as cmd COPY ./date.txt /tmp/date.txt requires it.
INFO[0068] COPY ./date.txt /tmp/date.txt
INFO[0068] Resolving 1 paths
INFO[0068] Taking snapshot of files...
INFO[0068] RUN cat /tmp/date.txt
INFO[0068] Taking snapshot of full filesystem...
INFO[0068] Pushing layer url.to.registry.org/mynamespace/cache:98d8765d3fd91b4a83d7aff1661c902968b313bc78a65791d3f7dd9a1684fb00 to cache now
INFO[0068] Resolving 421 paths
INFO[0069] cmd: /bin/sh
INFO[0069] args: [-c cat /tmp/date.txt]
INFO[0069] Running: [/bin/sh -c cat /tmp/date.txt]
Mon May 11 08:55:12 UTC 2020
INFO[0069] Taking snapshot of full filesystem...
INFO[0069] Resolving 421 paths
INFO[0069] No files were changed, appending empty layer to config. No layer added to image.
INFO[0069] Pushing layer url.to.registry.org/mynamespace/cache:b42ab708063218355042f3efafd2b1e4d3401226cc5ad6b007ea6c87adcf3fcc to cache now
INFO[0071] Deleting filesystem...
INFO[0001] Resolved base name url.to.registry.org/mynamespace/cache-debug-build to build
INFO[0001] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0002] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0003] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0004] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0005] Built cross stage deps: map[0:[/tmp/date.txt]]
INFO[0005] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0006] Retrieving image manifest url.to.registry.org/mynamespace/cache-debug-build
INFO[0006] Executing 0 build triggers
INFO[0008] Saving file tmp/date.txt for later use
INFO[0008] Deleting filesystem...
INFO[0008] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0009] Retrieving image manifest url.to.registry.org/mynamespace/alpine:3.11
INFO[0010] Executing 0 build triggers
INFO[0010] Unpacking rootfs as cmd COPY --from=build /tmp/date.txt /tmp/date.txt requires it.
INFO[0013] COPY --from=build /tmp/date.txt /tmp/date.txt
INFO[0013] Resolving 1 paths
INFO[0013] Taking snapshot of files...
INFO[0013] COPY ./date.txt /tmp/date-direct.txt
INFO[0013] Resolving 1 paths
INFO[0013] Taking snapshot of files...
INFO[0013] RUN cat /tmp/date-direct.txt
INFO[0013] Taking snapshot of full filesystem...
INFO[0013] Resolving 484 paths
INFO[0013] cmd: /bin/sh
INFO[0013] args: [-c cat /tmp/date-direct.txt]
INFO[0013] Running: [/bin/sh -c cat /tmp/date-direct.txt]
Mon May 11 08:55:12 UTC 2020
INFO[0013] Taking snapshot of full filesystem...
INFO[0013] Resolving 484 paths
INFO[0013] No files were changed, appending empty layer to config. No layer added to image.
INFO[0013] RUN cat /tmp/date.txt
INFO[0013] cmd: /bin/sh
INFO[0013] args: [-c cat /tmp/date.txt]
INFO[0013] Running: [/bin/sh -c cat /tmp/date.txt]
Mon May 11 08:55:12 UTC 2020
INFO[0013] Taking snapshot of full filesystem...
INFO[0013] Resolving 484 paths
INFO[0013] No files were changed, appending empty layer to config. No layer added to image.
INFO[0013] Deleting filesystem...

Triage Notes for the Maintainers

Description Yes/No
Please check if this a new feature you are proposing
Please check if the build works in docker but not in kaniko
Please check if this error is seen when you use --cache flag
Please check if your dockerfile is a multistage dockerfile
@swist
Copy link

swist commented May 12, 2020

We've hit that too. Our workaround was to modify the Dockerfiles to ensure the "from IMAGE" is cache bust when needed

@alex1989hu
Copy link
Contributor

We've hit that too. Our workaround was to modify the Dockerfiles to ensure the "from IMAGE" is cache bust when needed

Could you provide an example, please?

@wanderrful
Copy link

wanderrful commented Jul 7, 2020

I wonder if an even dirtier workaround could be to rename your stages in the Dockerfile for each Kaniko build. Obviously that'd be too much trouble and defeat the purpose, though.

@jamil-s
Copy link

jamil-s commented Apr 19, 2021

Hey folks, is there any plans to address this?
In our use case, the --cache-copy-layers flag causes us to reuse an existing layer. Would there be a timeline for this? in most case the --cache-copy-layers provides a huge build improvement than compared to without it, but these small edge cases makes it hard to keep it even though it affects a small percentage of our engineers

@jc-lab
Copy link

jc-lab commented Jun 2, 2021

+1
I think that this is a big bug. It still happens.

@AH012
Copy link

AH012 commented Jul 12, 2021

+1

Still happens to me on your latest release

@Leletir
Copy link

Leletir commented Jul 12, 2021

Hello,
I had this issue this weekend. We are running Kaniko 1.6.0.
Since all our images use the multistage build, we ended up removing the cache, which is not a solution.

I know it's not easy to maintain a project like that, but is it possible to have an ETA on this issue ?

Have a good day.

@KasnocknDave
Copy link

We are running into the very same issue, described here #1348.

@alexey-sveshnikov
Copy link

alexey-sveshnikov commented Aug 12, 2021

Hello,

After long hours of debugging pipelines, I've ended up with this hack to "parametrize" COPY command so it has a new hash every run:

Dockerfile.template:

...
FROM nginx
RUN addgroup --gid $CI_PIPELINE_ID cache_bursting_group
COPY --chown=0:$CI_PIPELINE_ID --from=build /src/dist /usr/share/nginx/html/

Build script:

sed -e "s/\$CI_PIPELINE_ID/$CI_PIPELINE_ID/" .cicd/Dockerfile.template > .cicd/Dockerfile
kaniko/executor <....>   

@zoidyzoidzoid
Copy link

Is anyone working on a fix to this? I assume from isn't considered in the cache keys or something. I can try help contribute a fix.

@zoidyzoidzoid
Copy link

zoidyzoidzoid commented Sep 24, 2021

I think we need to do something like also store the digest of the from image in the cache key.

https://sourcegraph.com/github.com/GoogleContainerTools/kaniko@7e3954ac734534ce5ce68ad6300a2d3143d82f40/-/blob/pkg/executor/build.go?L193

Edit: Actually we seem to maybe already do that, so not sure what the issue is.

@hxtmdev
Copy link

hxtmdev commented Apr 1, 2022

I reproduced the error with v1.6.0-debug@sha256:fcccd2ab9f3892e33fc7f2e950c8e4fc665e7a4c66f6a9d70b300d7a2103592f and was able to fix it by upgrading to v1.8.0-debug.

@zoidyzoidzoid
Copy link

Any idea which patches solved it?

I was having some similar issues so I removed my use of caching, but should test if it's resolved.

@hxtmdev
Copy link

hxtmdev commented Apr 8, 2022

Any idea which patches solved it?

No idea. If it helps with further kaniko development or some users are still encountering a variant of this problem with 1.8.0 I can offer to bisect my particular case. Otherwise let's just use 1.8.0.

@zoidyzoidzoid
Copy link

I tested my reproduction case and it works too! 🎉

I think we can close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests