Runtime-only dependencies #1080
@nbp you might be interested in this---as I understand it, this would be moving a good chunk of Shipping Security Updates into Nix itself. |
@Ericson2314 I don't think you can add runtime-only deps to Nix with the hope that this does not change. For example, if we upgrade gcc, we should not expect gcc to show up in the final binary, but it still influences the generated code a lot. So I do not think it would make sense to prevent rebuilds by ignoring non-runtime dependencies. Also, the biggest problem of Shipping Security Updates is being able to map dependencies from the base-nixpkgs to the secure-nixpkgs. Currently I am basing this work on the list of arguments provided by |
Hmm? GCC is used only at build-time, ideally. I assume if a build-only dep has a security flaw, either we manually ignore it or rebuild its world---there is no (automatic) middle ground because there's no path to patch at runtime. I don't expect this feature to change that. What this feature would do is basically let Nix do the patching for you: Nix builds derivations with runtime dep paths normalized, so assuming you still have that normalized binary cached, only the Nix-driven renaming step remains. Manually patching binaries would only be needed for deps used at both build- and run-time (which generally indicates poor packaging anyway). In the case of the intensional store, this normalization is needed for transmission and/or storage. But even without that, it's still good to avoid rebuilds---your use case. |
In NixOS/nixpkgs#21268 I create a When cross-compiling, deps must be strictly run-time-only or build-time-only---machine code cannot be used in both phases without emulation in one of them. Similarly, security updates are both cheap and sound when deps are also strictly run-time-only or build-time-only---build-time deps never need to be replaced (unless we have a tool emitting insecurities, in which case a mass-rebuild is unavoidable), and run-time deps are never inspected until run-time, so we know it's sound to sed paths (barring encrypted / compressed paths we'll miss or similar nonsensical design). With this feature, and that PR, and the rigorous separation of run-time and build-time deps, security updates would be as simple as setting up another "stdenv extension" like https://github.com/NixOS/nixpkgs/pull/21268/files#diff-1640a202e3ce4fd5a204d6c39e252787 by extending old, insecure, stages with an additional |
@Ericson2314: Personally, I'm deeply unconvinced by sed. The issue isn't that it may miss compressed paths; it's that it may corrupt files that store strings as (say) An alternate approach, which avoids such issues, is to build against the normalized path and then (instead of sedding to replace the normalized path) place a symlink to the store path of the runtime dependency at the normalized path. This requires slightly more care in defining the general form of a normalized path (since they must be globally unique), but entirely avoids issues around compressed files, formats that can't handle the length of such a path changing, etc. |
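A minimal sketch of the symlink alternative described in the comment above, assuming a Unix filesystem. The directory names and the `link_normalized` helper are made up for illustration; nothing here is an existing Nix mechanism. The "build output" only ever mentions the normalized path, and swapping the dependency means re-pointing one symlink rather than rewriting any bytes.

```python
import os
import tempfile


def link_normalized(normalized: str, real: str) -> None:
    """Point the normalized path at the real runtime dependency.

    The link is created under a temporary name and renamed into place,
    so an existing link is replaced atomically.
    """
    tmp = normalized + ".tmp"
    os.symlink(real, tmp)
    os.replace(tmp, normalized)  # rename acts on the link itself


def demo():
    """Resolve the same normalized path before and after an update."""
    with tempfile.TemporaryDirectory() as store:
        versions = {}
        for hash_part, version in (("aaaa", "1.0"), ("bbbb", "1.1")):
            # Hypothetical store directories for two versions of libfoo.
            path = os.path.join(store, f"{hash_part}-libfoo-{version}")
            os.makedirs(path)
            with open(os.path.join(path, "version"), "w") as f:
                f.write(version)
            versions[version] = path

        # The path the dependent package was built against.
        normalized = os.path.join(store, "normalized-libfoo")

        link_normalized(normalized, versions["1.0"])
        with open(os.path.join(normalized, "version")) as f:
            before = f.read()

        # "Security update": only the symlink changes.
        link_normalized(normalized, versions["1.1"])
        with open(os.path.join(normalized, "version")) as f:
            after = f.read()

        return before, after
```

As the comment notes, the hard part this sketch does not address is choosing normalized paths that are globally unique.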
Another approach would be to make a virtualenv per set of runtime dependencies, and pass that as "/usr" to builds, instead of passing each dep by full path. So the builds would be more like regular distros do it.
|
I'd be wary about a |
I meant (hoping you never build with the same deps but different versions on the same machine unless you have sandboxing) |
OK, that works, but if your dependencies refer to their runtime virtualenvs, you are left with loads of cruft in the closure. |
It seems to me that, in some sense, support for runtime-only dependencies already exists in nixpkgs. After all, if a dependency is truly runtime-only, there is no reason to add it to buildInputs. Presumably the program will look for it on the PATH when it is needed. You merely need to set PATH correctly at runtime, which can be done with a wrapper derivation in the style of the firefox package. When one of firefox's runtime dependencies is updated, only the wrapper derivation needs to rebuild; the firefox binary derivation is hopefully unaffected. Or at least that's the case if I've grokked the firefox package correctly, which is far from certain.
The common wrapProgram pattern in nixpkgs effectively turns runtime-only dependencies into buildInputs. Some benefit in reducing rebuilds might be obtained simply by using wrapper derivations in the style of firefox instead of wrapProgram.
Even if a package has hardcoded paths to runtime dependencies, it could be patched to point at a script which looks up the appropriate runtime dependency on PATH. Build systems which search for runtime executables at build time could effectively be tricked by the same method: make a hidden directory of wrapper scripts which find and execute the appropriate executable on PATH, then let the build system find these scripts instead.
For dlopened dynamic libraries I presume you could set LD_LIBRARY_PATH in the wrapper derivation, though I suspect this would usually provide little benefit, as you may need to add the package to buildInputs in order to provide the headers anyway. Though once we have the intensional store this would allow us to maintain a consistent $cas, as long as our dependencies' headers don't change.
I'm struggling to see if this proposal provides benefit beyond simply reducing the amount of wrapping required. What do you think? |
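To make the wrapper idea above concrete, here is a rough sketch that generates the text of a PATH-setting wrapper script. The `make_wrapper` helper and the store paths are hypothetical; this is not how nixpkgs' firefox wrapper or wrapProgram is implemented, only an illustration of why updating a runtime dependency changes the wrapper text and leaves the wrapped program untouched.

```python
import shlex
from typing import Sequence


def make_wrapper(program: str, runtime_bin_dirs: Sequence[str]) -> str:
    """Return a POSIX sh script that prepends runtime deps to PATH.

    Only this script mentions the runtime dependencies, so only it has
    to be regenerated when one of them changes.
    """
    path_prefix = ":".join(runtime_bin_dirs)
    return (
        "#!/bin/sh\n"
        # ${PATH:+:$PATH} appends the caller's PATH only if it is set.
        f"export PATH={shlex.quote(path_prefix)}${{PATH:+:$PATH}}\n"
        f'exec {shlex.quote(program)} "$@"\n'
    )


# Hypothetical paths, for illustration only.
UNWRAPPED = "/nix/store/xxxx-firefox-unwrapped/bin/firefox"
old = make_wrapper(UNWRAPPED, ["/nix/store/aaaa-ffmpeg/bin"])
new = make_wrapper(UNWRAPPED, ["/nix/store/bbbb-ffmpeg/bin"])
```

Here `old` and `new` differ only in the PATH line, while the `exec` target is the same in both.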
@spwhitt much of the benefit from intensional store would come from runtime-only dependencies, as that would allow shortcutting builds and better deduplication. See NixOS/rfcs#17 (comment) for an analysis |
I marked this as stale due to inactivity. → More info |
I closed this issue due to inactivity. → More info |
Still care. |
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/nixpkgs-architecture-team-meeting-13-agenda/22379/2 |
This is my take on what an implementation of this idea could look like and how it would impact how we obtain derivation outputs. For each derivation alongside the regular input hash we generate and store an additional relaxed input hash generated with all of the runtime-only dependencies replaced with entries from a list of dummy paths (which could potentially take length into account). Now we go through the following list of bullet points to obtain our derivation until we succeed.
By always using dummy hashes during the build and then immediately replacing them, the rewriting mechanism seems quite safe, since if a relaxed input hash exists at all we know we already did the same thing once when producing the original output. We can avoid the length issue by factoring length into the calculation of our dummy paths or hope for fixed-length store paths. This is conceptually similar to, but different in proposed execution from what is proposed in NixOS/rfcs#17 (comment). That post is wonderful and also has some data to quantify the possible benefits. |
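A rough sketch of the relaxed input hash described in this comment, under stated assumptions: SHA-256 over a simple concatenation stands in for however Nix actually hashes derivation inputs, and `dummy_path`, `input_hash`, and `relaxed_input_hash` are names invented here. The dummy for each runtime-only dependency depends only on its position in the list and on the length of the real path, which is one way of "taking length into account".

```python
import hashlib
from typing import Sequence

STORE = "/nix/store/"


def dummy_path(slot: int, length: int) -> str:
    """Dummy path for the slot-th runtime-only dep, padded to `length`."""
    tag = f"dummy{slot}"
    body_len = length - len(STORE)
    if body_len < len(tag):
        raise ValueError("real path too short to mirror with a dummy")
    return STORE + tag.ljust(body_len, "-")


def input_hash(builder: str, inputs: Sequence[str]) -> str:
    """Stand-in for the regular input hash: digest of builder and inputs."""
    h = hashlib.sha256()
    for item in [builder, *inputs]:
        h.update(item.encode() + b"\0")
    return h.hexdigest()


def relaxed_input_hash(
    builder: str, build_inputs: Sequence[str], runtime_only: Sequence[str]
) -> str:
    """Input hash with every runtime-only dep replaced by its dummy."""
    dummies = [dummy_path(i, len(p)) for i, p in enumerate(runtime_only)]
    return input_hash(builder, [*build_inputs, *dummies])


# Hypothetical store paths, for illustration only.
gcc = STORE + "g" * 32 + "-gcc"
ssl_old = STORE + "a" * 32 + "-openssl"
ssl_new = STORE + "b" * 32 + "-openssl"
```

With these definitions, the two regular hashes over `[gcc, ssl_old]` and `[gcc, ssl_new]` differ, while `relaxed_input_hash("builder.sh", [gcc], [ssl_old])` equals `relaxed_input_hash("builder.sh", [gcc], [ssl_new])`, so an output built for one could be reused for the other after rewriting.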
I still care about this idea and I think for correctness reasons rewriting runtime dependencies would be a good alternative to rewriting one possible output path that could be generated from a specific derivation into another (which is part of the design for CA derivations). Let me try to offer an explanation why.
The design for the intensional store (CA derivations) from Eelco's thesis puts all of the possible output paths that could be generated from a specific derivation into an output equivalence class and treats all of the members of said class as functionally equivalent, since they were generated from the same input address. It then introduces equivalence class collisions, which occur when, for example, firefox depends on two different libraries which in turn depend on two different store paths containing realisations of the same glibc derivation (assuming the build for glibc is not reproducible), and introduces an algorithm for rewriting one realisation of this derivation into the other so that we end up with a sane firefox which depends on only one specific glibc - see also https://www.tweag.io/blog/2020-11-18-nix-cas-self-references/.
I am not sure whether content-addressed derivations today already implement this rewriting algorithm or still get out of that situation by rebuilding more than required if they encounter it (a behavior which I think might be preferable). Doing this kind of rewriting has an observable effect if a derivation does something like include a hash of a non-reproducible dependency in its output, since as the result of a rewrite it can lead to a mismatch between the hash and the hashed output.
I think both approaches can be used to tackle this problem of equivalence class collisions. Eelco's approach is a more complete and precise solution to the problem, at the cost of its effects being more observable. Introducing and rewriting runtime-only dependencies is a more limited solution, as it can only help with runtime-only dependencies, but it is more correct and more broadly applicable, since it can rewrite all derivations with the same relaxed input hash, not only all possible output paths of a specific derivation. [1] https://www.microsoft.com/en-us/research/publication/cloudbuild-microsofts-distributed-and-caching-build-service/ |
Maybe instead of limiting rewriting to runtime-only dependencies re-writing could even be allowed after the build for all runtime dependencies. If a mechanism existed that enables this it could be used
|
Just so you know @mschwaig, I haven't really thought about this issue in a while because the first problem to be solved is that only a few brave experimenters are using content-addressed derivations at all! I fear that worrying about additional features on top (as I did when first writing this issue) is a bit premature until Hydra supports CA derivations. |
@Ericson2314 thanks for letting me know, that's very kind. |
I've been thinking it would be nice to have runtime-only dependencies in order to avoid spurious rebuilds. The basic mechanism would be building with a dummy path, and then replacing it post-build.
If this sounds like self-references with the intensional store, that is no coincidence: self-references are indeed but a simple kind of runtime-only dep. Obviously a package can't actually use itself while building; it can only hard-code where it will be installed.
Finally, I think we should consider allowing larger cycles than just self-refs. This would solve the libc-sh problem where each has a runtime dep on the other.
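As a minimal sketch of the basic mechanism (build with a dummy path, then replace it post-build), assuming the dummy and real paths have the same byte length so that offsets inside the file stay valid. The `rewrite_placeholder` helper, the paths, and the fake build output are made up for illustration. As noted elsewhere in the thread, a plain byte replacement like this would miss paths that are compressed or otherwise encoded.

```python
def rewrite_placeholder(data: bytes, placeholder: bytes, real_path: bytes) -> bytes:
    """Replace every occurrence of a dummy store path with the real one.

    Both paths must have the same byte length so the output keeps its
    size and nothing after the path shifts position.
    """
    if len(placeholder) != len(real_path):
        raise ValueError("placeholder and real path must have equal length")
    return data.replace(placeholder, real_path)


# Hypothetical paths: a 32-character hash part, as in today's store paths.
DUMMY = b"/nix/store/" + b"0" * 32 + b"-openssl"
REAL = b"/nix/store/" + b"a" * 32 + b"-openssl"

# Stand-in for a build output that embeds the dummy path.
built = b"\x7fELF...RPATH=" + DUMMY + b"/lib\x00"
patched = rewrite_placeholder(built, DUMMY, REAL)
```

A self-reference fits the same shape: the package is built against a dummy for its own output path, which is then rewritten once the final path is known.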