Multistage Dockerfile with copy and glob uses cached layer when it shouldn't #589

Closed
pselden opened this issue Feb 26, 2019 · 14 comments
Labels
area/caching For all bugs related to cache issues area/multi-stage builds issues related to kaniko multi-stage builds

Comments

@pselden

pselden commented Feb 26, 2019

Actual behavior
In a multi-stage build that uses COPY with a glob, later layers that depend on the globbed files are erroneously taken from the cache, so the final image references the wrong files.

Expected behavior
The later layer should not be served from the cache, since the files in the previous layer have changed.

To Reproduce
Steps to reproduce the behavior:
See https://github.com/pselden/kaniko-cache-bug for full repro steps.

  1. Build the Docker image with caching enabled.
  2. Update version.txt (1 -> 2).
  3. Build the Docker image again with caching enabled.
  4. Run ls in the container and observe that the symlink is still pointing to file-1.txt even though file-2.txt is in the filesystem.

Additional Information

FROM alpine:3.6 as builder
COPY version.txt version.txt
RUN touch file-$(cat ./version.txt).txt

FROM alpine:3.6
COPY --from=builder file-*.txt /
RUN ln -s $(find /file-*.txt) /file.txt
  • Build Context:
    A version.txt file with the following contents:
1
  • Kaniko Image: Using gcr.io/kaniko-project/executor:latest (sha256:d9fe474f80b73808dc12b54f45f5fc90f7856d9fc699d4a5e79d968a1aef1a72)

Note: this bug doesn't happen if I use a build arg for the version instead of reading it from a file.
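The expected behavior can be modelled in a few lines of Python. This is only an illustrative sketch, not kaniko's actual implementation: the helpers `layer_key`, `glob_digests`, and `final_stage_keys` are hypothetical, and the assumption that each layer's cache key is chained from its parent's key plus the digests of the files the command copies is made for the sake of the example.

```python
import hashlib
from fnmatch import fnmatch


def layer_key(parent_key: str, command: str, file_digests=()) -> str:
    """Hypothetical cache key: parent key + command text + digests of copied files."""
    h = hashlib.sha256()
    h.update(parent_key.encode())
    h.update(command.encode())
    for digest in sorted(file_digests):
        h.update(digest.encode())
    return h.hexdigest()


def glob_digests(stage_files: dict, pattern: str) -> list:
    """Digest the name and content of every file in a stage that matches the glob."""
    return [
        hashlib.sha256(name.encode() + b"\0" + content).hexdigest()
        for name, content in stage_files.items()
        if fnmatch(name, pattern)
    ]


def final_stage_keys(builder_files: dict) -> tuple:
    """Keys for the COPY and RUN layers of the second stage in the Dockerfile above."""
    base = layer_key("", "FROM alpine:3.6")
    copy = layer_key(
        base,
        "COPY --from=builder file-*.txt /",
        glob_digests(builder_files, "file-*.txt"),
    )
    # The RUN key is chained from the COPY key, so it changes whenever the COPY key does.
    run = layer_key(copy, "RUN ln -s $(find /file-*.txt) /file.txt")
    return copy, run


first = final_stage_keys({"file-1.txt": b""})   # version.txt contained 1
second = final_stage_keys({"file-2.txt": b""})  # version.txt contained 2
print(first[1] != second[1])
```

Under this model the `RUN ln -s ...` layer gets a different key on the second build, so a cached layer from the first build could not be reused for it.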

@laverite

seeing this issue as well

@RXminuS

RXminuS commented May 27, 2019

We're actually seeing this behaviour as well, without any globbing. Simply copying a file from one stage into another makes it use an old cached layer. I'll attach the interesting part of our logs. Specifically, the file build/api is rebuilt (I can see in the logs that it did in fact rebuild), but then the later COPY --from=build thinks that nothing has changed. I actually saw this issue that seemed to suggest that caching "should" only happen by path and base image? Could this be related?

Also worth noting that I'm running the DEBUG container on Google Container Builder. The debug image is a workaround for not being able to set the USER properly otherwise.

Step #0: github.com/donna-legal/api
Step #0: INFO[0444] Taking snapshot of full filesystem... 
Step #0: INFO[0462] Pushing layer eu.gcr.io/rational-camera-143519/api/cache:f16bfe491095e57197b1aee0e023b26139f86d732f7f6ae2ea220caf74c0da45 to cache now 
Step #0: INFO[0462] RUN wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/v0.2.0/grpc_health_probe-linux-amd64 
Step #0: INFO[0462] cmd: /bin/sh 
Step #0: INFO[0462] args: [-c wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/v0.2.0/grpc_health_probe-linux-amd64] 
Step #0: INFO[0463] Taking snapshot of full filesystem... 
Step #0: 2019/05/27 12:08:50 pushed blob: sha256:08d8188742515dc468633877005263017d493c584ff25dfd4792a248eeb7ad69
Step #0: 2019/05/27 12:08:59 pushed blob: sha256:f27f95abef001f57b8bec3397b11c1ff2ebb0b139ad2c4888fab632ffba0a4ef
Step #0: 2019/05/27 12:09:00 eu.gcr.io/rational-camera-143519/api/cache:f16bfe491095e57197b1aee0e023b26139f86d732f7f6ae2ea220caf74c0da45: digest: sha256:e60afe41434bad178ce74bc94c0d45d3eed868fb0d137254ec6351b8f4ced02c size: 429
Step #0: INFO[0479] Pushing layer eu.gcr.io/rational-camera-143519/api/cache:82b4b89ca4b938dd0e166651be9c962d6d313cc44e9d3f7f6df81854ce86c719 to cache now 
Step #0: 2019/05/27 12:09:07 pushed blob: sha256:192a418bbb1147a53645e298d14f5a1f3342c5240e1b057831050a51d6d49273
Step #0: 2019/05/27 12:09:08 pushed blob: sha256:1abd05e5bb6863bc6239e6bbcad3caa4babb2738c544d4c0d553122d73d75408
Step #0: 2019/05/27 12:09:09 eu.gcr.io/rational-camera-143519/api/cache:82b4b89ca4b938dd0e166651be9c962d6d313cc44e9d3f7f6df81854ce86c719: digest: sha256:1ca5710fa353de5034264c47fa779293d52933170cad60965179876b7c84a96f size: 428
Step #0: INFO[0485] Saving file /bin/grpc_health_probe for later use. 
Step #0: INFO[0485] Saving file /build/api for later use. 
Step #0: INFO[0485] Deleting filesystem... 
Step #0: INFO[0488] Downloading base image alpine:3.9 
Step #0: 2019/05/27 12:09:12 No matching credentials were found, falling back on anonymous
Step #0: INFO[0489] Error while retrieving image from cache: getting file info: stat /cache/sha256:bf1684a6e3676389ec861c602e97f27b03f14178e5bc3f70dce198f9f160cce9: no such file or directory 
Step #0: INFO[0489] Downloading base image alpine:3.9 
Step #0: 2019/05/27 12:09:13 No matching credentials were found, falling back on anonymous
Step #0: INFO[0490] Checking for cached layer eu.gcr.io/rational-camera-143519/api/cache:779e49f221df013bdcec2fd0ff0799b30980b9f769b523fb8bb3142dc91778be... 
Step #0: INFO[0490] Using caching version of cmd: RUN apk update && apk add ca-certificates && rm -rf /var/cache/apk/* 
Step #0: INFO[0490] Checking for cached layer eu.gcr.io/rational-camera-143519/api/cache:10ce00a5a580a030294ea3f325c68097f64fb6456a71ba46142d1cd7209e85a1... 
Step #0: INFO[0491] Using caching version of cmd: RUN chmod +x /bin/grpc_health_probe 
Step #0: INFO[0491] Checking for cached layer eu.gcr.io/rational-camera-143519/api/cache:e560ce51e6f15372bcd9fc30a0ab3b3c4bc41cf248884d35b24296a8b8b8000e... 
Step #0: INFO[0492] Using caching version of cmd: RUN mkdir /usr/app 
Step #0: INFO[0492] Using files from context: [/workspace/config.json] 
Step #0: INFO[0492] Checking for cached layer eu.gcr.io/rational-camera-143519/api/cache:1d6f60d9859d0503ce4de8b6990ba382a99c57571b24d79e991fcb1c8019c08a... 
Step #0: INFO[0493] Using caching version of cmd: RUN addgroup -S app && adduser -S -g app app 
Step #0: INFO[0493] Checking for cached layer eu.gcr.io/rational-camera-143519/api/cache:08da4402cb4fafaf79f7f426e486b2c92d3135c66e6c5a32a69244d0ee7ad07d... 
Step #0: INFO[0493] Using caching version of cmd: RUN chown -R app:app /usr/app 
Step #0: INFO[0493] cmd: USER 
Step #0: INFO[0493] cmd: EXPOSE 
Step #0: INFO[0493] Adding exposed port: 50051/tcp 
Step #0: INFO[0493] Skipping unpacking as no commands require it. 
Step #0: INFO[0493] Taking snapshot of full filesystem... 
Step #0: INFO[0494] RUN apk update && apk add ca-certificates && rm -rf /var/cache/apk/* 
Step #0: INFO[0494] Found cached layer, extracting to filesystem 
Step #0: INFO[0495] Taking snapshot of files... 
Step #0: INFO[0495] COPY --from=build /bin/grpc_health_probe /bin/grpc_health_probe 
Step #0: INFO[0495] Taking snapshot of files... 
Step #0: INFO[0495] RUN chmod +x /bin/grpc_health_probe 
Step #0: INFO[0495] Found cached layer, extracting to filesystem 
Step #0: INFO[0496] Taking snapshot of files... 
Step #0: INFO[0496] RUN mkdir /usr/app 
Step #0: INFO[0496] Found cached layer, extracting to filesystem 
Step #0: INFO[0497] Taking snapshot of files... 
Step #0: INFO[0497] WORKDIR /usr/app 
Step #0: INFO[0497] cmd: workdir 
Step #0: INFO[0497] Changed working directory to /usr/app 
Step #0: INFO[0497] No files changed in this command, skipping snapshotting. 
Step #0: INFO[0497] COPY --from=build /build/api . 
Step #0: INFO[0497] Taking snapshot of files... 
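To see at a glance which commands a build served from the cache, a log like the one above can be filtered for its `Using caching version of cmd:` lines. The following is a small sketch: `cached_commands` is a hypothetical helper, and the regular expression assumes the log format shown in this excerpt.

```python
import re

# Matches lines such as:
#   Step #0: INFO[0490] Using caching version of cmd: RUN mkdir /usr/app
CACHED_RE = re.compile(r"Using caching version of cmd: (.+?)\s*$")


def cached_commands(log_text: str) -> list:
    """Return the commands the log reports as served from the cache, in order."""
    commands = []
    for line in log_text.splitlines():
        match = CACHED_RE.search(line)
        if match:
            commands.append(match.group(1))
    return commands


sample = """\
Step #0: INFO[0490] Using caching version of cmd: RUN apk update && apk add ca-certificates && rm -rf /var/cache/apk/*
Step #0: INFO[0491] Using caching version of cmd: RUN chmod +x /bin/grpc_health_probe
Step #0: INFO[0497] COPY --from=build /build/api .
"""
for cmd in cached_commands(sample):
    print(cmd)
```

Applied to the full excerpt, this lists the five `RUN` commands that were served from the cache, including `RUN chmod +x /bin/grpc_health_probe`, which runs directly after a `COPY --from=build`.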

@dwirz

dwirz commented Jul 9, 2019

We are seeing this issue as well. 🙈 Does anyone have a workaround, or has anyone solved the problem? ⛑

@pselden
Author

pselden commented Jul 9, 2019

My workaround is to not use --cache :(

@priyawadhwa priyawadhwa added the area/multi-stage builds issues related to kaniko multi-stage builds label Jul 25, 2019
@kfox1111

We have seen this too. Any updates?

@duythinht

Us too. Any update on this?

@Lawouach

Seeing this issue too unfortunately.

@cvgw cvgw added the area/caching For all bugs related to cache issues label Nov 23, 2019
@cvgw cvgw self-assigned this Nov 23, 2019
@afirth

afirth commented Dec 2, 2019

This completely breaks caching for multi-stage Dockerfiles. Since we are pushing distroless so hard, this sucks. Please fix <3

[edit] As noted by previous comments, globbing or not has no effect. Any copying of source code in previous stages will be cached and ignored.

@cvgw
Contributor

cvgw commented Dec 2, 2019

@afirth we've got some PRs in review that will hopefully fix this. Fingers crossed for getting this fixed in the next week or two

@cvgw
Contributor

cvgw commented Dec 17, 2019

This should be fixed as of master@a675098

Closing, but please reopen if it is not resolved

@cvgw cvgw closed this as completed Dec 17, 2019
@pselden
Author

pselden commented Dec 17, 2019

@cvgw I tried with kaniko-executor:debug and it works!

@cbartz

cbartz commented Jul 7, 2021

This bug seems to be fixed in kaniko v1.3.0 at least, but it has reappeared since v1.5.0: I can reproduce the original reporter's steps with v1.5.0 and upwards, but not with v1.3.0.

I reproduced the bug slightly differently, with the following Dockerfile:

FROM alpine:3.6 as builder
COPY version.txt version.txt

FROM alpine:3.6
COPY --from=builder version.txt /

RUN touch file-$(cat ./version.txt).txt

Since v1.5.0, the RUN command in the second stage always uses the cache, regardless of the content of version.txt.
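One way to check a built image for this regression is to compare the contents of `version.txt` with the `file-*.txt` name that the `RUN` step produced. The following is a minimal sketch: `is_stale` is a hypothetical helper, and it assumes the contents of `version.txt` and the root directory listing have already been read out of the built image.

```python
def is_stale(version_txt: str, root_listing: list) -> bool:
    """True if no file-<version>.txt exists for the version that was copied in."""
    expected = f"file-{version_txt.strip()}.txt"
    return expected not in root_listing


# Second build, after changing version.txt from 1 to 2:
print(is_stale("2\n", ["version.txt", "file-1.txt"]))  # True: cached RUN layer was reused
print(is_stale("2\n", ["version.txt", "file-2.txt"]))  # False: RUN was re-executed
```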

@koshkarov

@cbartz You might need to open a new bug, not sure if they're looking at the old issues.

@awoodobvio

We just ran into this with v1.6.0 as well and rolling back to v1.3.0 fixes it. In our case, we were building the "backstage.io" project.
