[ci] Bazel cache not being used #20844
Comments
I also think we need to add
I did a small experiment: I created this PR, which only builds a few SW artifacts (none of which rely on the bitstreams). I modified the pipeline to save and upload the Bazel execution log, and then ran the pipeline twice on Azure to make sure that the two runs used the same code and merge commit:
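For reference, a minimal sketch of how such an execution-log comparison can be set up, assuming a recent Bazel and a placeholder target pattern (`//sw/...` is illustrative, not the exact set of targets the pipeline builds):

```sh
# Capture a JSON execution log for each of the two runs; the flag records
# every executed action together with its input/output digests.
./bazelisk.sh build //sw/... \
  --execution_log_json_file=/tmp/exec_log_run1.json

# Repeat in the second pipeline run (same code and merge commit):
./bazelisk.sh build //sw/... \
  --execution_log_json_file=/tmp/exec_log_run2.json

# Entry order in the log is not guaranteed to be stable between runs;
# Bazel's execlog parser can sort the logs first, but a plain diff is often
# enough to spot the first action whose digests diverge.
diff /tmp/exec_log_run1.json /tmp/exec_log_run2.json | head -n 50
```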
The first difference appears here:
In fact, many
I think this is the underlying issue: bazelbuild/rules_python#1761
According to https://peps.python.org/pep-0552/ (deterministic pycs)
rules_python does contain provisions for this; however, we are building the Python wheels ourselves.
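As a rough illustration of the two knobs usually involved here; the package path is a placeholder and the exact hook-up into the Bazel build is an assumption:

```sh
# 1. Byte-compile with hash-based invalidation (PEP 552) so the generated
#    .pyc files carry a source hash instead of a timestamp and are
#    bit-for-bit reproducible:
python3 -m compileall --invalidation-mode checked-hash path/to/package

# 2. When building the wheels ourselves, pin the archive timestamps; the
#    wheel tooling honours SOURCE_DATE_EPOCH (the value below is 1980-01-01,
#    the earliest timestamp a zip can hold). Requires `pip install build`.
SOURCE_DATE_EPOCH=315532800 python3 -m build --wheel path/to/package
```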
I have run another test after adding the reproducibility fixes that @nbdd0121 identified for Python. The runs are here and there. The Python issue is gone, but there are more differences:
What do those numbers stand for?
I have identified that the problem with stamping is the workspace status command we use. As an experiment, in bazelbuild/bazel#20318, I have tried to override this workspace command in CI to use constant values. There are now no more differences between the stamping files; however, we still have a few offenders to track:
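A sketch of what that experiment could look like; the script name and status keys below are illustrative, not the real ones used in the PR:

```sh
# A workspace status script that emits constant values. Keys prefixed with
# STABLE_ end up in stable-status.txt, everything else in
# volatile-status.txt, so stamped outputs no longer differ between runs.
cat > ci/constant-workspace-status.sh <<'EOF'
#!/usr/bin/env bash
echo "STABLE_GIT_VERSION ci-constant"
echo "BUILD_SCM_STATUS clean"
EOF
chmod +x ci/constant-workspace-status.sh

# Point Bazel at it for CI builds:
./bazelisk.sh build //... \
  --workspace_status_command=ci/constant-workspace-status.sh
```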
I think we need to discuss and agree on how to handle Bazel stamping, since at the moment we are not using it consistently:
Bazel supports stamping through the workspace status command and the `--stamp` flag. Current users of stamping:
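For context, a sketch of the stamping mechanics being discussed (the target label is hypothetical):

```sh
# With --nostamp (the common default), stamp-aware rules see placeholder
# values and their outputs stay cacheable; with --stamp they read the
# status files produced by the workspace status command.
bazel build --stamp //release:artifacts   # hypothetical target

# Where the stamped values come from:
cat bazel-out/stable-status.txt     # STABLE_* keys; a change invalidates
                                    # stamped actions
cat bazel-out/volatile-status.txt   # volatile keys; a change does not by
                                    # itself force a rebuild
```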
Correction: this isn't needed as
Description
We believe the Bazel cache that we store in a GCP bucket and use for CI is not working correctly.
The `sw_build` job is set up to use the cache by running `ci/bazelisk.sh` for its build (which connects Bazel to the GCP bucket as its cache) and by loading the GCP write credentials when run on the master branch so that it can write to the cache. As far as we can tell, the `sw_build` job never loads from the cache and always rebuilds from scratch.
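For reference, this is roughly how a GCS-backed remote cache is wired up in Bazel; the bucket name and credential path below are placeholders rather than the values `ci/bazelisk.sh` actually passes:

```sh
# Read-only use of the cache (what any job or local reproduction can do):
bazel build //sw/... \
  --remote_cache=https://storage.googleapis.com/<cache-bucket> \
  --noremote_upload_local_results

# Jobs on master additionally load GCP write credentials so that build
# results are uploaded back to the bucket:
bazel build //sw/... \
  --remote_cache=https://storage.googleapis.com/<cache-bucket> \
  --google_credentials=/path/to/service-account-key.json
```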
Locally reproducing

I tried to set up a local environment that replicates the `sw_build` job to read from the cache too, but with no success. These are the (convoluted) steps I used (thanks to @nbdd0121 for helping with this):

- Set up `~/.bazelrc` (and `~root/.bazelrc` for good measure) to replicate what `ci/bazelisk.sh` builds.
- Match the environment variables picked up by `rules/nonhermetic.bzl`:
  - `HOME="/root"` - this is what the CI uses, and the `HOME` env var is non-hermetic, so it may have influenced the cache?
  - `PATH=/tools/verilator/v4.210/bin:/tools/verible/bin:/root/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin`, which is what `sw_build` uses - we extracted this by running `env` in this PR's CI run.
  - `$XILINXD_LICENSE_FILE` already matched, and the other vars were unset.
- Run the same build that `sw_build` does, but using `./bazelisk.sh` instead of `ci/bazelisk.sh` to ensure our own `.bazelrc` is used (see the sketch after this list).

We did not observe the cache being used at all. There must be something else contributing to the cache key that we didn't account for.
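The sketch below summarises the attempt, assuming the placeholder target pattern `//sw/...`; the `HOME` and `PATH` values are the ones extracted from the CI run above:

```sh
# Replicate the non-hermetic environment seen by the CI job.
export HOME=/root
export PATH=/tools/verilator/v4.210/bin:/tools/verible/bin:/root/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

# Use the repo's ./bazelisk.sh (not ci/bazelisk.sh) so the ~/.bazelrc we
# prepared, which points at the GCP cache, is the one being read.
./bazelisk.sh build //sw/... 2>&1 | tee build.log

# If the cache were working, the action summary would report hits, e.g.
# "INFO: 1234 processes: 56 remote cache hit, ...".
grep "remote cache hit" build.log || echo "no remote cache hits reported"
```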
Related

Not every job in the CI workflow uses the cache: some aren't using `ci/bazelisk.sh`, so they won't read from it, and some aren't loading the GCP write credentials, so they won't write to it. I've opened #20836 to address these issues, but they won't make much difference if the cache isn't working.