A function rebuildDependencyClosure that produces an efficient closure for offline rebuilds #180529

Open
roberth opened this issue Jul 7, 2022 · 10 comments
Assignees
Labels
  • 0.kind: enhancement (Add something new)
  • 6.topic: closure size (The final size of a derivation, including its dependencies)
  • 6.topic: hygiene
  • 6.topic: nixos (Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS)
  • 6.topic: testing (Tooling for automated testing of packages and modules)

Comments

@roberth
Member

roberth commented Jul 7, 2022

Project description

Some tests need to build configuration files and such in the sandbox. Currently, we manually list packages as extra dependencies in such tests, e.g.

system.extraDependencies = with pkgs; [
  brotli
  brotli.dev
  brotli.lib
  desktop-file-utils
  docbook5
  docbook_xsl_ns
  kmod.dev
  libarchive.dev
  libxml2.bin
  libxslt.bin
  nixos-artwork.wallpapers.simple-dark-gray-bottom
  ntp
  perlPackages.ListCompare
  perlPackages.XMLLibXML
  python3Minimal
  shared-mime-info
  sudo
  texinfo
  unionfs-fuse
  xorg.lndir
  # add curl so that rather than seeing the test attempt to download
  # curl's tarball, we see what it's trying to download
  curl
]
++ optional (bootLoader == "grub" && grubVersion == 1) pkgs.grub
++ optionals (bootLoader == "grub" && grubVersion == 2) (let
  zfsSupport = lib.any (x: x == "zfs")
    (extraInstallerConfig.boot.supportedFilesystems or []);
in [
  (pkgs.grub2.override { inherit zfsSupport; })
  (pkgs.grub2_efi.override { inherit zfsSupport; })
]);

This tends to bitrot.

A current alternative is to provide all of the build closure's outputs and sources. This doesn't bitrot, but it includes many paths that aren't used, such as ~10G of unpacked llvm sources.

Instead, I propose to write a function that produces a closure that is specifically suited for a particular rebuild.

rebuildDependencyClosure =
  { base # derivation representing a current or old version of a package, configuration `toplevel`, etc
  , next # derivation representing a new version of a package, an altered configuration, etc
  }:
    # ...
    # 1. compute which derivations are in next but not in base (by .drv store path)
    # 2. write all the referenced outputs of those derivations to a file in `$out`
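
To make step 1 concrete, here is a minimal sketch on toy data. It is not the proposed implementation: the `graph` attribute set and its keys are hypothetical stand-ins for real `.drv` store paths and their `inputDrvs`, which is exactly the information that is hard to get at in the expression language today. Only `builtins.genericClosure`, `builtins.filter` and `builtins.elem` are real builtins.

```nix
let
  # Toy dependency graph (hypothetical data): each key stands in for a
  # .drv store path, each value lists the .drv paths it depends on.
  graph = {
    "toplevel-v1.drv" = [ "foo-conf-v1.drv" "foo.drv" ];
    "toplevel-v2.drv" = [ "foo-conf-v2.drv" "foo.drv" ];
    "foo-conf-v1.drv" = [ "writeText-deps.drv" ];
    "foo-conf-v2.drv" = [ "writeText-deps.drv" ];
    "foo.drv" = [ "stdenv.drv" ];
    "writeText-deps.drv" = [ "stdenv.drv" ];
    "stdenv.drv" = [ ];
  };

  # Transitive closure of one root, returned as a list of keys.
  closureOf = root:
    map (node: node.key) (builtins.genericClosure {
      startSet = [ { key = root; } ];
      operator = node: map (dep: { key = dep; }) graph.${node.key};
    });

  base = closureOf "toplevel-v1.drv";
  next = closureOf "toplevel-v2.drv";
in
  # Step 1: derivations that are in `next` but not in `base`.
  builtins.filter (drv: !(builtins.elem drv base)) next
```

Evaluated with `nix-instantiate --eval --strict`, this yields a list containing `toplevel-v2.drv` and `foo-conf-v2.drv`. Those are the only derivations an offline rebuild would have to build, so step 2 only needs the outputs that these two reference.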

While this building block may have a use case in the package set, the main use case will be in NixOS tests.
This can be facilitated by a new NixOS module that adds the required dependencies to system.extraDependencies.

Here's a draft of the interface:

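{ extendModules, lib, ... }:
let
  # bring the library functions used below into scope
  inherit (lib) mkOption mdDoc;
  inherit (lib.types) lazyAttrsOf submodule;
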
  fixupRecursion = {
    # avoid infinite recursion in the variations below
    system.extraDependencies = lib.mkForce [];
  };

  variation = { ... }: {
    options = {
      base = mkOption {
        type = (extendModules { modules = [ fixupRecursion ]; }).type;
        # no default; use regular system if unset
        visible = "shallow";
        description = mdDoc ''
          Optional configuration for the purpose of minimizing the build closure that this system retains.

          Often the regular system configuration is a suitable base configuration and you only need to set `next`.

          For example, if your configuration (the parent of this option) does not include some package, you may want to avoid adding the build dependencies of that package. By adding the package to `base` as well as to `next`, the algorithm knows that it doesn't have to rebuild the package. If you need the system to _reconfigure_ itself offline, also add some configuration to `next`.

          For testing a service, this may look as follows:

          ```nix
            {
              # regular config does not enable the service
              system.extraDependencyVariations.default.base = {
                services.foo.enable = true;
              };
              system.extraDependencyVariations.default.next = {
                services.foo.enable = true;
                services.foo.settings.xyz = "dummy value";
              };
            }
          ```
        '';
      };
      next = mkOption {
        type = (extendModules { modules = [ fixupRecursion ]; }).type; # or extend `base`?
        # no default, because unchanged config is not useful. Some config must be set by configuration author
        description = ''
          Additions to the configuration for which the build dependencies are computed so that similar configuration changes can be built offline.

          For testing a service, this may look as follows:
          ```nix
            {
              # regular config
              services.foo.enable = true;
              system.extraDependencyVariations.default.next = {
                services.foo.settings.xyz = "dummy value";
              };
            }
          ```
        '';
        visible = "shallow";
      };
    };
  };
in
{
  options.system.extraDependencyVariations = mkOption {
    type = lazyAttrsOf (submodule variation);
    description = ''
      A set of updated configurations to ensure that the system can regenerate itself offline in certain ways.

      Each variation comes with evaluation overhead, so we recommend making as many changes as possible in `system.extraDependencyVariations.default.next`.
    '';
  };
}

@roberth added the 0.kind: enhancement, 6.topic: nixos, 6.topic: closure size, 6.topic: hygiene and 6.topic: testing labels on Jul 7, 2022
@roberth self-assigned this on Jul 7, 2022
bors bot added a commit to NixOS/nixops that referenced this issue Jul 9, 2022
1526: Integration test r=roberth a=roberth

TODO
 
 - [x] #1484
 - [x] provide a Hercules CI agent for running this VM test (requires KVM)

Scoped out:
 - reduce build dependency outputs closure? (currently adds all deps all the way up to bootstrap deps) # NixOS/nixpkgs#180529


Co-authored-by: Robert Hensing <[email protected]>
@roberth
Member Author

roberth commented Sep 29, 2022

rebuildDependencyClosure =
  { base # derivation representing a current or old version of a package, configuration `toplevel`, etc
  , next # derivation representing a new version of a package, an altered configuration, etc
  }:
    # ...
    # 1. compute which derivations are in next but not in base (by .drv store path)
    # 2. write all the referenced outputs of those derivations to a file in `$out`

I have some doubts about whether this is feasible without pulling in all of the build closure's outputs as dependencies of the produced derivation (which in turn produces the optimized list).

We'll need to

  • use builtins.unsafeDiscardOutputDependency in order to avoid downloading every output all the way up to the bootstrap blobs;
  • then compute the paths;
  • then treat them as proper dependencies once more.
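
As a rough illustration of the first bullet, the following sketch inspects the string context of a `.drvPath` before and after `builtins.unsafeDiscardOutputDependency`. The `demo` derivation is a made-up placeholder that is only instantiated, never built; `builtins.getContext` is used purely to show what changes.

```nix
let
  # A throwaway derivation (hypothetical); it is instantiated but never built.
  drv = derivation {
    name = "demo";
    system = "x86_64-linux";
    builder = "/bin/sh";
  };
in {
  # drvPath carries a "deep" context: the .drv file together with the
  # outputs of its whole closure.
  deep = builtins.getContext drv.drvPath;

  # After discarding the output dependency, the context refers to the
  # .drv file as a plain store path only.
  shallow = builtins.getContext
    (builtins.unsafeDiscardOutputDependency drv.drvPath);
}
```

With `nix-instantiate --eval --strict`, `deep` should report the `.drv` path with `allOutputs = true`, while `shallow` should report the same path with `path = true`. The last bullet, turning the computed paths back into proper dependencies, is the part this sketch does not cover.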

It seems that we'll need RFC92 outputOf to produce a dynamic derivation that recovers the selected output dependencies.
@Ericson2314 does that make sense? How is the RFC 92 implementation progressing?

@Ericson2314
Member

@roberth and I talked about this, but for the record: RFC 92 just needs review and a rebase, and some things can be cleaned up based on our exploration of what `=` does in string contexts.

@roberth
Member Author

roberth commented Jun 10, 2023

  • builtins.unsafeDiscardOutputDependency in order to avoid downloading every output all the way up to bootstrap blobs;

I think this can be achieved by querying the closure in the same derivation that computes the closure difference stuff.
That way the deluge of transitive dependencies is not scanned as output, and not everything needs to be downloaded.

  • then compute the paths;
  • then treat them as proper dependencies once more.

I've previously found that Nix is eager to check for paths from the whole derivation closure, even if those should in theory not be known to the current derivation. Therefore this item may not be an obstacle?

@baloo
Member

baloo commented Oct 20, 2023

write all the referenced outputs of those derivations to a file in $out

Let's assume I got step 1 working and I have a list of missing derivations: how do you get the outputs referenced by a derivation? I'm missing something.

@roberth
Member Author

roberth commented Oct 20, 2023

@baloo I'm not sure either.

Maybe exportReferencesGraph on the .drvPath, though NixOS/nix#9146 could be a problem. Not sure if the outputs are included in the first place, and recovering string contexts for them isn't allowed in pure mode at this time (if we need to?).

builtins.outputOf is still experimental, though that doesn't list the outputs.

Maybe import the .drv file? Not something you would normally do. I think it's really old functionality from the time the expression language was new, but it might do the job.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/more-airgap-questions/38748/12

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/more-airgap-questions/38748/11

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/pre-rfc-implement-dependency-retrieval-primitive/43418/6

@roberth
Member Author

roberth commented Nov 2, 2024

I'm leaning towards implementing this as a function in the language, because this is an operation on "derivable paths", and not an operation on realised paths. By performing this little graph computation in the sandbox, we cause too many inputs to be added, and for them to be realised, which is all unnecessary and slow. Realisation is only needed when this set of "derivable paths", i.e. strings with context, is used in the final derivation (e.g. toplevel), where they're no different from other derivation inputs.
It could either be a single primop that implements the exact logic described in this issue, or a generic function that exposes derivation inputDrvs, outputs, etc (import "/nix/store/...-foo.drv" doesn't return references).
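
To sketch what step 2 could look like on top of such a generic function, the example below assumes a hypothetical record shape: each new derivation has an `inputDrvs` attribute mapping a `.drv` path to the output names it uses, loosely modelled on what `nix derivation show` prints. No such builtin exists today, and all names and data here are made up for illustration.

```nix
let
  # Hypothetical description of the derivations that are new in `next`.
  # The shape (inputDrvs: .drv path -> list of output names) is an assumption.
  newDrvs = {
    "toplevel-v2.drv".inputDrvs = {
      "foo-conf-v2.drv" = [ "out" ];
      "foo.drv" = [ "out" ];
    };
    "foo-conf-v2.drv".inputDrvs = {
      "stdenv.drv" = [ "out" ];
      "libxml2.drv" = [ "bin" "dev" ];
    };
  };

  # All (drv, output) pairs referenced by one new derivation.
  referencedBy = drvInfo:
    builtins.concatMap
      (dep: map (output: { drv = dep; inherit output; }) drvInfo.inputDrvs.${dep})
      (builtins.attrNames drvInfo.inputDrvs);

  allReferenced = builtins.concatMap referencedBy (builtins.attrValues newDrvs);
in
  # Derivations that are themselves new get rebuilt offline, so only outputs
  # of derivations outside `newDrvs` have to be present in the closure.
  builtins.filter (ref: !(newDrvs ? ${ref.drv})) allReferenced
```

On this toy input the result lists `foo.drv` (`out`), `stdenv.drv` (`out`) and `libxml2.drv` (`bin`, `dev`), and leaves out `foo-conf-v2.drv` because it is rebuilt rather than fetched. The sketch does not deduplicate pairs; a real implementation would also have to turn each pair back into a string with the right context.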

As a practical benefit of this approach, it works without fixing Nix's behavior of assuming "DrvDeep" behavior in a few places, which arguably should be done, but will be slow to trickle through. The expression language simply runs on the client, without any need to upgrade the daemon, remote builders, etc. That way it's simple: if you have the builtin, it works; otherwise update your client, which you may even do temporarily without upgrading your system.
A small possible disadvantage is that this primop must eagerly compute all hashes, as the structure of the output (e.g. length of list, or attrNames) depends on equality of derivation hash values. However, we don't do async hashing yet, and when we do, the operation in this issue is rare enough, and is likely to be used in a high-level context where blocking on this computation would actually make good use of resources anyway.

@roberth
Member Author

roberth commented Nov 2, 2024

In case of RFC 92 dynamic derivations:

  • Getting the inputDrvs from a normal derivation that has dependencies using outputOf: those are "derivable paths" and can be returned just fine.
  • Getting the inputDrvs from a derivation that is itself an outputOf another derivation: this would constitute IFD, as it reads from an output. Users of this new function will want to avoid that. In principle the base derivation should be good enough, but if the base derivation produces FODs, those won't be visible to the rebuildDependencyClosure implementation. That could be a problem if the set of FODs in new is not a subset of those in old, and the purpose of the closure is offline rebuilds.

I don't know how much of a problem that would be, but if we come up with a stable protocol for ingesting derivations, we actually have the tech to do "recursive nix"-style FODs dynamically to cover this for the quasi-offline use case: hermetic building and testing in the Nix sandbox.

Perhaps then, to avoid IFD and recursive-nix-like weirdness, it should be something in the builder after all?
