Runtime-only dependencies #1080

Open
Ericson2314 opened this issue Oct 5, 2016 · 20 comments
Labels: feature (Feature request or proposal)

Comments

@Ericson2314
Member

Ericson2314 commented Oct 5, 2016

I've been thinking it would be nice to have runtime-only dependencies in order to avoid spurious rebuilds. The basic mechanism would be building with a dummy path, and then replacing it post-build.
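A minimal sketch of the "build with a dummy path, replace it post-build" mechanism, assuming the dummy and the real path have the same length so that no offsets in the output shift. The paths and the `replace_dummy` helper are hypothetical and only serve as illustration.

```python
# Hypothetical paths: neither comes from the issue itself.
DUMMY = b"/nix/store/" + b"0" * 32 + b"-openssl-1.0.2"
REAL = b"/nix/store/" + b"a" * 32 + b"-openssl-1.0.2"


def replace_dummy(blob, dummy, real):
    """Swap every occurrence of the build-time dummy path for the real one."""
    if len(dummy) != len(real):
        raise ValueError("paths must have equal length to keep offsets stable")
    return blob.replace(dummy, real)


# Stand-in for a build output that hard-codes the dummy path.
built = b"\x7fELF...RPATH=" + DUMMY + b"/lib\x00..."
installed = replace_dummy(built, DUMMY, REAL)
```

Because the build only ever sees `DUMMY`, the same `built` bytes could be reused for any same-length `REAL` path without rebuilding.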

If this sounds like self-references with the intentional store, that is no coincidence: self-references are indeed but a simple kind of runtime-only dep. Obviously a package can't actually use itself while building, it can only hard-code where it will be installed.

Finally, I think we should consider allowing larger cycles than just self-refs. This would solve the libc-sh problem where each has a runtime dep on the other.

@Ericson2314
Member Author

Ericson2314 commented Nov 2, 2016

@nbp you might be interested in this---as I understand it, this would be moving a good chunk of Shipping Security Updates into Nix itself.

@nbp
Member

nbp commented Nov 3, 2016

@Ericson2314 I don't think you can add runtime-only deps to Nix in the hope that the result does not change. For example, if we upgrade gcc, we should not expect gcc to show up in the final binary, yet it still strongly influences the generated code. So I do not think it would make sense to prevent rebuilds by ignoring non-runtime dependencies.

Also, the biggest problem with Shipping Security Updates is being able to map dependencies from the base-nixpkgs to the secure-nixpkgs. Currently I am basing this work on the list of arguments provided by callPackage, because we write conditions after the list of arguments and not before. Making use of something that is listed after the conditions might lead to variations while we are trying to zip the 2 sets of packages.

@Ericson2314
Member Author

Ericson2314 commented Nov 3, 2016

Hmm? GCC is used only at build-time, ideally. I assume if a build-only dep has a security flaw, either we manually ignore it or rebuild its world--there is no (automatic) middle ground because there's no path to patch at runtime. I don't expect this feature to change that.

What this feature would do is basically let Nix do the patching for you: Nix builds derivations with runtime dep paths normalized, so assuming you still have that normalized binary cached, only the Nix-driven renaming step remains. Manually patching binaries would only be needed for deps used at both build time and run time (which generally indicates poor packaging anyway).

In the case of the intensional store, this normalization is needed for transmission and/or storage. But even without that, it's still good to avoid rebuilds---your use case.

@Ericson2314
Member Author

In NixOS/nixpkgs#21268 I create a buildPackages package set just used for build-time/native deps. It's that change + this that I envision would give us security updates for free.

When cross-compiling, deps must be strictly run-time-only or build-time-only---machine code cannot be used in both phases without emulation in one of them. Similarly, security updates are both cheap and sound when deps are also strictly run-time-only or build-time-only---build-time deps never need to be replaced (unless we have a tool emitting insecurities, in which case a mass-rebuild is unavoidable), and run-time deps are never inspected until run-time, so we know it's sound to sed paths (barring encrypted / compressed paths we'll miss or similar nonsensical design).

With this feature, and that PR, and the rigorous separation of run-time and build-time deps, security updates would be as simple as setting up another "stdenv extension" like https://github.com/NixOS/nixpkgs/pull/21268/files#diff-1640a202e3ce4fd5a204d6c39e252787, extending the old, insecure stages with an additional selfBuild = false; stage containing the new secure packages.

@eternaleye

@Ericson2314: Personally, I'm deeply unconvinced by sed. The issue isn't that it may miss compressed paths; it's that it may corrupt files that store strings as (say) length, bytes (such as Rust binaries).
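To illustrate the concern, here is a small sketch of a length-prefixed string format and what a naive sed-style replacement does to it. The 4-byte little-endian length prefix and both paths are assumptions chosen for the example, not a description of any real binary format.

```python
import struct


def pack_str(s):
    """Encode a string as a 4-byte little-endian length followed by its bytes."""
    return struct.pack("<I", len(s)) + s


def unpack_strs(blob):
    """Decode a sequence of length-prefixed strings."""
    out, pos = [], 0
    while pos < len(blob):
        (n,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        out.append(blob[pos:pos + n])
        pos += n
    return out


old = b"/nix/store/dummy-dep"        # hypothetical normalized path
new = b"/nix/store/longer-real-dep"  # hypothetical replacement, 6 bytes longer
blob = pack_str(old + b"/bin/tool") + pack_str(b"next-field")

# A sed-style rewrite changes the bytes but not the stored length prefix.
patched = blob.replace(old, new)
fields = unpack_strs(patched)

# With an equal-length replacement the prefix stays valid.
same_len_fields = unpack_strs(blob.replace(old, b"/nix/store/other-dep"))
```

After the rewrite, the first field is truncated and the second field is no longer found, because the reader still trusts the old length. Only the equal-length replacement decodes cleanly.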

An alternate approach, which avoids such issues, is to build against the normalized path and then (instead of sedding to replace the normalized path) place a symlink to the store path of the runtime dependency at the normalized path.

This requires slightly more care in defining the general form of a normalized path (since they must be globally unique), but entirely avoids issues around compressed files, formats that can't handle the length of such a path changing, etc.
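A rough sketch of the symlink variant, using a temporary directory as a stand-in for the store. The directory layout and the names `normalized-openssl` and `abc123-openssl` are hypothetical.

```python
import os
import tempfile

root = tempfile.mkdtemp()                              # stand-in for the store
real_dep = os.path.join(root, "abc123-openssl")        # hypothetical real dependency
normalized = os.path.join(root, "normalized-openssl")  # hypothetical normalized path

os.makedirs(os.path.join(real_dep, "lib"))
with open(os.path.join(real_dep, "lib", "libssl.so"), "w") as f:
    f.write("real library contents")

# The build output keeps referring to `normalized`; none of its bytes are rewritten.
os.symlink(real_dep, normalized)

# Anything opened through the normalized path resolves to the real dependency.
with open(os.path.join(normalized, "lib", "libssl.so")) as f:
    contents = f.read()
```

Swapping the dependency then means repointing the symlink, which is why each normalized path has to be unique to the output that uses it.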

@wmertens
Contributor

wmertens commented May 2, 2019

Another approach would be to make a virtualenv per set of runtime dependencies, and pass that as "/usr" to builds, instead of passing each dep by full path. So the builds would be more like regular distros do it.

<handwaving> Ideally, that path uses the hash of the dependency names during build, and at runtime that path is resolved to the $CAS-ified actual dependencies with versions. </handwaving>

@Ericson2314
Member Author

I'd be wary of a /usr for the simple reason that any /usr path that persists into the finished build requires more mounting of things. It's important that dependencies are kept under unique prefixes so any combination of nix builds is a valid dependency set (no conflicts).

@wmertens
Contributor

wmertens commented May 2, 2019

I meant quote-unquote usr, so you have e.g. /nix/build/$hash_of_runtime_dep_names and then through some appropriate mechanism that gets resolved at runtime to /nix/store/$cas_of_runtime_deps_virtualenv. The by-name path won't exist at runtime.

(hoping you never build with the same deps but different versions on the same machine unless you have sandboxing)
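As a sketch of the two kinds of path, assuming SHA-256 and a made-up encoding of the dependency set: the build-time path is keyed on dependency names only, while the run-time path is keyed on names and contents. The functions `name_hash` and `cas_hash` are hypothetical.

```python
import hashlib


def name_hash(dep_names):
    """Build-time key: depends only on which dependencies are named."""
    joined = "\n".join(sorted(dep_names)).encode()
    return hashlib.sha256(joined).hexdigest()[:32]


def cas_hash(deps):
    """Run-time key: depends on the names and on the exact contents."""
    h = hashlib.sha256()
    for name in sorted(deps):
        h.update(name.encode() + b"\0" + deps[name] + b"\0")
    return h.hexdigest()[:32]


# Same dependency names, different openssl contents (hypothetical).
old = {"openssl": b"openssl 1.0.2j contents", "zlib": b"zlib contents"}
new = {"openssl": b"openssl 1.0.2k contents", "zlib": b"zlib contents"}

build_path_old = "/nix/build/" + name_hash(old)
build_path_new = "/nix/build/" + name_hash(new)
run_path_old = "/nix/store/" + cas_hash(old)
run_path_new = "/nix/store/" + cas_hash(new)
```

Both versions share one build-time path, so the build itself does not change, while the run-time paths differ.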

@Ericson2314
Member Author

OK that works, but if your dependencies point to their own runtime virtualenvs, you are left with loads of cruft in the closure.

@spwhitt

spwhitt commented May 10, 2019

It seems to me that, in some sense, support for runtime-only dependencies already exists in nixpkgs. After all, if a dependency is truly runtime-only, there is no reason to add it to buildInputs. Presumably the program will look for it on the PATH when it is needed. You merely need to set PATH correctly at runtime, which can be done with a wrapper derivation in the style of the firefox package. When one of firefox's runtime dependencies is updated, only the wrapper derivation needs to rebuild; the firefox binary derivation is hopefully unaffected. Or at least that's the case if I've grokked the firefox package correctly, which is far from certain.

The common wrapProgram pattern in nixpkgs effectively turns runtime only dependencies into buildInputs. Some benefit in reducing rebuilds might be obtained simply by using wrapper derivations in the style of firefox instead of wrapProgram.

Even if a package has hardcoded paths to runtime dependencies it could be patched to point at a script which looks up the appropriate runtime dependency on PATH. Build systems which search for runtime executables at build time could effectively be tricked by the same method: make a hidden directory of wrapper scripts which find and execute the appropriate executable on PATH, then let the build system find these scripts instead.
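A small sketch of the lookup such a wrapper script would perform at run time, using `shutil.which` with an explicit search path. The tool name `helper` and the two directories are hypothetical stand-ins for two versions of a runtime dependency.

```python
import os
import shutil
import stat
import tempfile


def make_tool(directory, name, body):
    """Create a tiny executable shell script to stand in for a dependency."""
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\n" + body + "\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve(name, search_path):
    """What the wrapper does at run time: find `name` on the given PATH."""
    found = shutil.which(name, path=search_path)
    if found is None:
        raise FileNotFoundError(name)
    return found


old_dir, new_dir = tempfile.mkdtemp(), tempfile.mkdtemp()
old_tool = make_tool(old_dir, "helper", "echo old")
new_tool = make_tool(new_dir, "helper", "echo new")

# The program is unchanged; only the PATH handed to it differs.
before = resolve("helper", old_dir)
after = resolve("helper", new_dir)
```

Updating the dependency changes only the search path, not the program that performs the lookup.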

For dlopened dynamic libraries I presume you could set LD_LIBRARY_PATH in the wrapper derivation, though I suspect this would usually provide little benefit as you may need to add the package to buildInputs in order to provide the headers anyway. Though once we have the intensional store this would allow us to maintain a consistent $cas, as long as our dependencies' headers don't change.

I'm struggling to see if this proposal provides benefit beyond simply reducing the amount of wrapping required. What do you think?

@wmertens
Contributor

@spwhitt much of the benefit from intensional store would come from runtime-only dependencies, as that would allow shortcutting builds and better deduplication. See NixOS/rfcs#17 (comment) for an analysis

@stale

stale bot commented Feb 15, 2021

I marked this as stale due to inactivity.

@stale stale bot added the stale label Feb 15, 2021
@stale

stale bot commented May 2, 2022

I closed this issue due to inactivity.

@stale stale bot closed this as completed May 2, 2022
@Ericson2314 Ericson2314 reopened this May 3, 2022
@stale stale bot removed the stale label May 3, 2022
@Ericson2314
Member Author

Still care.

@fricklerhandwerk fricklerhandwerk added the feature Feature request or proposal label Sep 12, 2022
@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixpkgs-architecture-team-meeting-13-agenda/22379/2

@mschwaig
Member

This is my take on what an implementation of this idea could look like and how it would impact how we obtain derivation outputs.
I am not sure if it is what @Ericson2314 had in mind as well.

For each derivation alongside the regular input hash we generate and store an additional relaxed input hash generated with all of the runtime-only dependencies replaced with entries from a list of dummy paths (which could potentially take length into account).

To obtain a derivation's output, we go through the following steps in order until one succeeds.

  • First we check if we already have the path in our local store, using the regular input hash.
  • We check if we have a similar output in our store, using the relaxed input hash. If we find a similar store path this way we can rewrite it to the version we need by replacing the specific runtime-only paths that it uses with ours.
  • We check if we can substitute the output by querying our substituters by regular input hash.
  • We check if we can substitute a similar output by querying our substituters by relaxed input hash and then rewriting what we got as described above.
  • We build the derivation ourselves.
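The steps above can be sketched as follows. This is a toy model under several assumptions: stores are plain dictionaries, inputs are keyed by a role name, outputs are strings, and all function names are hypothetical rather than anything Nix provides.

```python
import hashlib


def input_hash(name, inputs):
    """Regular input hash over the name and every input path."""
    data = name + "|" + "|".join(f"{k}={v}" for k, v in sorted(inputs.items()))
    return hashlib.sha256(data.encode()).hexdigest()[:16]


def relaxed_hash(name, inputs, runtime_only):
    """Input hash with each runtime-only path replaced by a dummy entry."""
    masked = {k: (f"dummy-{k}" if k in runtime_only else v) for k, v in inputs.items()}
    return input_hash(name, masked)


def build(name, inputs, runtime_only):
    """Stand-in for a real build: the output embeds its runtime-only paths."""
    return name + " -> " + ",".join(inputs[k] for k in sorted(runtime_only))


def rewrite(output, old_inputs, new_inputs, runtime_only):
    """Replace the runtime-only paths a similar output used with ours."""
    for k in runtime_only:
        output = output.replace(old_inputs[k], new_inputs[k])
    return output


def add(store, name, inputs, runtime_only, output):
    """Register an output under both its regular and its relaxed hash."""
    store["exact"][input_hash(name, inputs)] = output
    store["relaxed"][relaxed_hash(name, inputs, runtime_only)] = (output, inputs)


def obtain(name, inputs, runtime_only, local, remote):
    """Try the five steps in order and report which one succeeded."""
    exact = input_hash(name, inputs)
    relaxed = relaxed_hash(name, inputs, runtime_only)
    for where, store in (("local", local), ("substituter", remote)):
        if exact in store["exact"]:
            return where, store["exact"][exact]
        if relaxed in store["relaxed"]:
            similar, old_inputs = store["relaxed"][relaxed]
            return where + "+rewrite", rewrite(similar, old_inputs, inputs, runtime_only)
    return "build", build(name, inputs, runtime_only)


local = {"exact": {}, "relaxed": {}}
remote = {"exact": {}, "relaxed": {}}
runtime_only = {"ssl"}
old = {"cc": "/nix/store/aaa-gcc", "ssl": "/nix/store/bbb-openssl"}
new = {"cc": "/nix/store/aaa-gcc", "ssl": "/nix/store/ccc-openssl"}

add(local, "curl", old, runtime_only, build("curl", old, runtime_only))
how, output = obtain("curl", new, runtime_only, local, remote)
```

In this model, changing only the runtime-only `ssl` input is served by rewriting the local output, whereas changing `cc` would fall through to a full build.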

By always using dummy hashes during the build and then immediately replacing them, the rewriting mechanism seems quite safe, since if a relaxed input hash exists at all we know we already did the same thing once when producing the original output. We can avoid the length issue by factoring length into the calculation of our dummy paths or hope for fixed-length store paths.
Hashes in packed or compressed binaries being an issue is something that Nix already accepts elsewhere, and even if this happens for a specific derivation it can be easily avoided by not marking the offending dependency as runtime-only.

This is conceptually similar to, but different in proposed execution from what is proposed in NixOS/rfcs#17 (comment). That post is wonderful and also has some data to quantify the possible benefits.

@stale stale bot added the stale label May 21, 2023
@mschwaig
Member

mschwaig commented Sep 21, 2023

I still care about this idea and I think for correctness reasons rewriting runtime dependencies would be a good alternative to rewriting one possible output path that could be generated from a specific derivation into another (which is part of the design for CA derivations).

Let me try to offer an explanation why.

The design for the intensional store (CA derivations) from Eelco's thesis puts all of the possible output paths that could be generated from a specific derivation into an output equivalence class and treats all of the members of said class as functionally equivalent, since they were generated from the same input address.

It then introduces equivalence class collisions, which occur when, for example, firefox depends on two different libraries which in turn depend on two different store paths containing realisations of the same glibc derivation (assuming the build for glibc is not reproducible). It also introduces an algorithm for rewriting one realisation of this derivation into the other so that we end up with a sane firefox which depends on only one specific glibc - see also https://www.tweag.io/blog/2020-11-18-nix-cas-self-references/.

I am not sure whether content-addressed derivations today already implement this rewriting algorithm or still get out of that situation by rebuilding more than required if they encounter it (a behavior which I think might be preferable).

Doing this kind of rewriting has an observable effect if a derivation does something like include a hash of a non-reproducible dependency in its output, since as the result of a rewrite it can lead to a mismatch between the hash and the hashed output.
Such a thing can never happen when rewriting runtime-only dependencies. I think this is an instance of what CloudBuild: Microsoft’s Distributed and Caching Build Service [1] and Build Systems à la Carte [2] call Frankenbuilds. Nix allows this on the basis that it considers all members of an output equivalence class the same. If that's the case I think adding a hash of your non-reproducible dependency to your output should be considered a bug in your Nix code.
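A toy illustration of that observable effect, assuming a build that records both the path of its glibc and a SHA-256 of the glibc contents it saw. The store contents, paths and helper names are hypothetical.

```python
import hashlib


def digest(data):
    return hashlib.sha256(data).hexdigest()


# Two realisations of the same non-reproducible derivation (hypothetical contents).
glibc_a = {"path": "/nix/store/aaa-glibc", "content": b"glibc built at 10:00"}
glibc_b = {"path": "/nix/store/bbb-glibc", "content": b"glibc built at 10:05"}
store = {d["path"]: d["content"] for d in (glibc_a, glibc_b)}


def build_lib(dep):
    """A build that records the dependency's path and a hash of its contents."""
    return {"dep_path": dep["path"], "dep_hash": digest(dep["content"])}


def rewrite(output, old, new):
    """Path-only rewrite from one realisation to the other."""
    out = dict(output)
    if out["dep_path"] == old["path"]:
        out["dep_path"] = new["path"]
    return out


def consistent(output):
    """Does the recorded hash still match what the recorded path points to?"""
    return digest(store[output["dep_path"]]) == output["dep_hash"]


lib = build_lib(glibc_a)
rewritten = rewrite(lib, glibc_a, glibc_b)
```

Here `consistent(lib)` holds while `consistent(rewritten)` does not: the rewrite updated the path but cannot update a hash that the build computed from the other realisation's contents.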

I think both approaches can be used to tackle this problem of equivalence class collisions. Eelco's approach is a more complete and precise solution to the problem, at the cost of its effects being more observable. Introducing and rewriting runtime-only dependencies is a more limited solution, since it can only help with runtime-only dependencies, but it is more correct and more broadly applicable, since it can rewrite all derivations with the same relaxed input hash, not only all possible output paths of a specific derivation.


[1] https://www.microsoft.com/en-us/research/publication/cloudbuild-microsofts-distributed-and-caching-build-service/
[2] https://www.microsoft.com/en-us/research/publication/build-systems-la-carte/

@stale stale bot removed the stale label Sep 21, 2023
@mschwaig
Member

mschwaig commented Oct 4, 2023

Maybe instead of limiting rewriting to runtime-only dependencies, rewriting could even be allowed after the build for all runtime dependencies.

If a mechanism enabling this existed, it could be used

  • in code, to manually specify security fixes by including the inputs that should be rewritten as an attribute in the derivation:
    rewrite = [ {from = pkgs.openssl; to = otherpkgs.openssl; } ]
    
  • by Nix itself as part of resolving the derivation to document and implement the resolution of SOME equivalence class collisions (it would be nice if that resolution was recorded in the resolved derivation) and
  • by Nix itself to generate one output from another that has the same relaxed input hash.

@Ericson2314
Member Author

Just so you know @mschwaig, I haven't really thought about this issue in a while because the first problem to be solved is that only a few brave experimenters are using content-addressed derivations at all!

I fear that worrying about additional features on top (as I did when first writing this issue) is a bit premature until Hydra supports CA derivations.

@mschwaig
Member

@Ericson2314 thanks for letting me know, that's very kind.
