build-bazel-package: switch hash mode to “flat” #87314
pkgs/development/python-modules/tensorflow-probability/default.nix
This looks great; working on building and testing now!
pkgs/development/python-modules/tensorflow-probability/default.nix
Should this be mentioned in the change log?
flat hashes can be substituted through hashed-mirrors, while recursive hashes can’t. This is especially important for Bazel since the bazel fetch dependencies can come from multiple different methods (git, http, ftp, etc.). To do this, we create a tar archive from the output/external directory, which is then extracted at build time. All of the Bazel hashes are updated.
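The archive-then-extract approach described above can be sketched in plain shell. This is a minimal illustration, not the PR's actual builder code: the directory names and the `pack` helper are hypothetical, and the normalization flags (GNU tar's `--sort`, `--mtime`, `--owner`, `--group`, and `gzip -n`) are an assumption about what is needed for the single-file hash to stay stable across runs.

```shell
set -eu

# Hypothetical stand-in for Bazel's output/external directory.
work=$(mktemp -d)
mkdir -p "$work/out/external/dep_a"
echo 'cc_library(name = "a")' > "$work/out/external/dep_a/BUILD"

# pack is an illustrative helper: it writes a normalized tar.gz so that
# file order, timestamps, and ownership do not leak into the archive bytes.
pack() {
  tar --sort=name --mtime='@1' --owner=0 --group=0 --numeric-owner \
      -C "$work/out" -cf - external | gzip -n > "$1"
}

pack "$work/deps1.tar.gz"
# Change only the file's mtime, then pack again.
touch -d '2001-01-01 00:00:00' "$work/out/external/dep_a/BUILD"
pack "$work/deps2.tar.gz"

# A flat hash is simply the digest of the archive file itself.
hash1=$(sha256sum "$work/deps1.tar.gz" | awk '{print $1}')
hash2=$(sha256sum "$work/deps2.tar.gz" | awk '{print $1}')

# At build time the archive is unpacked again before use.
mkdir "$work/bazelOut"
(cd "$work/bazelOut" && tar xzf "$work/deps1.tar.gz")
```

Because the archive is one regular file, its digest can double as its address on a mirror, which is not possible for a directory tree.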
Can you elaborate more on this? I'm not sure that I know the difference aside from file vs directory.
Yes, happy to elaborate, as this is something I've spent a lot of time looking at! I've been meaning to write up some more documentation about hashed mirroring, its advantages, and its limitations, but alas somehow one never finds the time to write docs ... anyways.
Currently, we of course have cache.nixos.org, which is the full Hydra binary cache. It has both source packages and binary packages, since it has everything Hydra needs. We also have tarballs.nixos.org, which just has source packages mirrored. One constraint is that it just holds simple content-addressed files, like those grabbed by
where
Concretely:
$ wget tarballs.nixos.org/sha256/0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i
$ nix-hash --type sha256 --flat --base32 0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i
0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i
In our nix expressions, we actually reference
This makes it trivially simple to substitute in an alternate fetcher that's just a basic S3 bucket or webserver with a directory of tarballs addressed by their hashes in a straightforward manner, and the fetchers will look there first. You can see that what we have there does in fact have this hash:
$ nix-build -A hello.src
/nix/store/3x7dwzq014bblazs7kq20p9hyzz0qh8g-hello-2.10.tar.gz
$ nix-hash --type sha256 --to-base32 $(sha256sum /nix/store/3x7dwzq014bblazs7kq20p9hyzz0qh8g-hello-2.10.tar.gz | awk '{print $1}')
0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i
This is great for mirroring, because it's simple, secure, efficient, and easy. Moreover, even users who are not using the Hydra binary cache or who are building nix stores at different store prefixes can still use tarballs.nixos.org or mirror it directly onto their own enterprise FTP servers/Artifactory/Nexus/etc., without a hash change.
While it is certainly possible to set up a spoofing mechanism by which tarballs are addressed both by their direct hashes and by the hash that they would have if you were to unpack them and recursively hash the directory, Nixpkgs currently has no real mechanism for doing this, and it's a little dubious, since you pollute your hashed mirror with things that are addressed not by their direct content but by a proxy of content. Software Heritage has a notion of "cooking" a tarball where you ask for something and they see if they have something that could satisfy that with some mutation, but it's a more complicated process and a little harder for users to understand/verify themselves.
That said, there are many other advantages to using a flat tar.gz file for the source package, enough so that even if the above were solved [1] we'd still want to go this route regardless. I've recently updated all of the cargo fetchers this way in #79975. All six points at the top of that PR description are just as relevant for Bazel (if not more so). For instance, I have some remote builders on AWS, and right now they completely choke while they replicate the
[1] I'd argue solving the above is a case where the cure is worse than the disease, because you end up with more complicated infra that's harder to reason about, but reasonable minds may disagree.
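To make the mirroring argument concrete, here is a rough local sketch of a hashed mirror lookup. It is only an illustration under stated assumptions: a temporary directory stands in for the mirror, `fetch_by_hash` is a hypothetical helper rather than anything in Nixpkgs, and the digest is written in hex, whereas the tarballs.nixos.org URL shown above uses Nix's base32 form.

```shell
set -eu

# A plain directory laid out as <root>/sha256/<digest> plays the mirror.
mirror=$(mktemp -d)
mkdir -p "$mirror/sha256"

# Any single file can be published under its own flat hash.
src="$mirror/upstream.tar.gz"
printf 'pretend these are tarball bytes\n' > "$src"
hash=$(sha256sum "$src" | awk '{print $1}')
cp "$src" "$mirror/sha256/$hash"

# fetch_by_hash (hypothetical): look the digest up on the mirror, verify
# the bytes really hash to it, and copy the file to the destination.
fetch_by_hash() {
  want=$1
  dest=$2
  candidate="$mirror/sha256/$want"
  [ -f "$candidate" ] || return 1
  got=$(sha256sum "$candidate" | awk '{print $1}')
  [ "$got" = "$want" ] || return 1
  cp "$candidate" "$dest"
}

fetched="$mirror/fetched.tar.gz"
fetch_by_hash "$hash" "$fetched"
fetched_hash=$(sha256sum "$fetched" | awk '{print $1}')
```

The lookup needs nothing beyond the digest and a static file server, and the client can verify the result itself. A recursive hash describes an unpacked directory tree, so there is no single file on the mirror whose bytes hash to it.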
Assuming the hashes are correct and builds complete, this looks like an improvement to me 👍
<literal>flat</literal> hash mode is now used for dependencies
instead of <literal>recursive</literal>. This is to better allow
using hashed mirrors where needed. As a result, these hashes
will have changedv.
Suggested change:
- will have changedv.
+ will have changed.
@@ -146,14 +147,15 @@ in stdenv.mkDerivation (fBuildAttrs // {
preConfigure = ''
mkdir -p "$bazelOut"

test "${bazel.name}" = "$(<$deps/.nix-bazel-version)" || {
(cd $bazelOut && tar xfz $deps)
Note you'll have to manually trigger the builds for the derivations that use buildBazelPackage to make sure they all build correctly.
Unlike normal FOD staleness scenarios, this one happens to be safe, because if you forget to change the FOD and end up with a stale hash, this line will fail in the builder, since your input will be a directory that can't be extracted with tar.
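The failure mode described here is easy to reproduce outside Nix. In this small sketch, `stale_deps` is a hypothetical stand-in for an old recursive-mode output, which is a directory rather than an archive, and the extraction command mirrors the `tar xfz $deps` line from the diff.

```shell
set -eu

# Stand-in for a stale recursive-mode output: a directory, not a tar.gz.
stale_deps=$(mktemp -d)
mkdir -p "$stale_deps/external"

# Stand-in for $bazelOut.
bazel_out=$(mktemp -d)

# tar cannot read an archive from a directory, so extraction exits non-zero.
if (cd "$bazel_out" && tar xfz "$stale_deps" 2>/dev/null); then
  result=extracted
else
  result=failed
fi
echo "extraction $result"
```

So a forgotten hash update surfaces as a loud build failure rather than a silent reuse of old dependencies.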
Yes - this is an annoying part of Nix's caching mechanism. I think we can do something to make this better by validating the hash afterwards. Something similar is done in buildRustPackage: nixpkgs/pkgs/build-support/rust/default.nix Lines 105 to 142 in 4d66a37