Skip to content
This repository has been archived by the owner on Feb 6, 2024. It is now read-only.

Dependency on volatile-status.txt breaks remote cache #536

Open
mariusgrigoriu opened this issue Mar 6, 2020 · 8 comments
Open

Dependency on volatile-status.txt breaks remote cache #536

mariusgrigoriu opened this issue Mar 6, 2020 · 8 comments

Comments

@mariusgrigoriu
Copy link
Contributor

The Bazel docs state:

Bazel pretends that the volatile file never changes

However, that is untrue when it comes to the remote cache.

Our CI environment runs e2e tests that depend on a k8s_object on every run even when there are no changes due to the interaction between rules_k8s, stamping, and the remote_cache.

I'd like to propose dropping the dependency on the volatile file, while retaining the stable file, or at least making it optional as a workaround since the Bazel maintainers are pushing back on fixing the underlying issue.

@mariusgrigoriu
Copy link
Contributor Author

I found that depending on the .substituted.yaml output file is a workaround to this remote caching issue. The downside is that there may be cases when the test doesn't run due to some changes in the rule related to stamping.

@fejta
Copy link
Contributor

fejta commented Jun 20, 2020

General best practice here is to avoid stamping -- stamping is by definition volatile.

Another practice we use over in kubernetes/test-infra is to separate out the stamp layer into its own final layer.

AKA instead of making the final stamped layer include a large go binary in it, we put this into an intermediary layer.

https://github.com/kubernetes/test-infra/blob/35cfa7b7d66098fabedb8fb1c0e15be29088971e/prow/def.bzl#L26-L49

In other words instead of using the rules_docker container_image directly we wrap this into a prow_image macro which separates out the stamped layer from the large binary layer.

This way 99% of the image layers remain reproducible (and therefore cached) and just the tiny final layer with stamp info changes between builds (assuming the image hasn't changed)

@mmikitka
Copy link
Contributor

@mariusgrigoriu and @fejta I created #592 as another attempt at addressing this issue. It exposes the stamp_srcs attribute as a means to override volatile-status.txt (and stable-status.txt if desired).

@mariusgrigoriu
Copy link
Contributor Author

That looks like a creative solution!

We've settled into adding the .substituted.yaml output and a custom resolver as data dependencies on e2e tests for our project. The test uses those dependencies only to trigger e2e execution. The custom resolver substitutes values like cluster and namespace, which are different per pipeline run so we don't run e2e with every commit.

I think the PR might simplify things a bit so we can just have the k8s_object rule as the data dependency as long as our resolver gets its values from sources other than the stable and volatile status files. If we did want to use those files, we might be stuck either way, but it would require some testing.

@mariusgrigoriu
Copy link
Contributor Author

Found a use case where depending on the .substituted.yaml is problematic and that's with k8s_objects. There isn't a single file to depend upon so the cleanest and easiest way would be to depend on the rule itself, so looks like #592 could be useful with that.

@mariusgrigoriu
Copy link
Contributor Author

Just tested PR #592. I wasn't able to find a way to include only the stable status file as I don't know what the label for the generated stabel-status.txt is. However, I was able to not depend on any stamp files by referencing an empty filegroup. That said, it would be nicer if you could set stamp_srcs=[] or stamp=False to accomplish that goal. All in all, this seems to work if you don't want to rely on any of bazel's stamps, including the stable ones.

@mmikitka
Copy link
Contributor

@mariusgrigoriu I am using the following approach, which concatenates and filters out values from stable-status and volatile-status:

genrule(
    name = "all-status",
    visibility = ["//visibility:public"],
    outs = [
        "all-status.txt",
    ],
    cmd = ("cat bazel-out/stable-status.txt bazel-out/volatile-status.txt" +
           "| grep -Ev '(BRANCH_TAIL|BUILD_TIMESTAMP|BUILD_HOST|BUILD_NUMBER|GIT_COMMIT_BODY)' > $(location :all-status.txt) "),
    stamp = True,
)

k8s_object(
    name = "foo",
    ...
    stamp_srcs = ["//:all-status"],
)

All the best!

@mariusgrigoriu
Copy link
Contributor Author

The above example helps a lot. I think this works good enough. What do you think about including stamp_srcs in the k8s_default? I'm finding that we want every rule to have it, so it would be nice if we could set it as a default. Happy to submit PR / issue on it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants