ideas

Just general ideas about things.

Nix

Nix should:

  • query substituters in parallel
  • support a bulk query endpoint to avoid requests for each individual dependency
  • use build times from Hydra to decide how to allocate jobs
  • improve its remote build protocol to be aware of system load
  • support GPUs as platforms -- enable Nix to understand GPUs as system platforms, with all the nuance that brings (code built for one platform can't necessarily run on another; forward compatibility like with linux-v1, -v2, etc., but with NVIDIA PTX)
  • have something akin to brokenConditions and badPlatformsConditions -- instead of setting broken or badPlatforms directly, have attribute sets in meta which map strings to booleans. The keys are explanations of why something is broken or unsupported, and the boolean value indicates whether broken or badPlatforms should be set
  • provide a way to guard against evaluation under OfBorg or nixpkgs-review (both of which allow broken packages) that does not involve setting badPlatforms

Issues and PRs to follow

Issues

PRs

Query substituters in parallel

Nix currently queries each substituter in sequence. This is inefficient: the round trips to each substituter for each path in the dependency closure happen one after another instead of overlapping. Instead, Nix should query all substituters in parallel and wait for all responses before continuing.
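
As a minimal, self-contained sketch of the shape this could take: the Substituter type, queryPathInfo stub, and queryAll helper below are hypothetical stand-ins (the real Nix store API differs); the point is simply firing one asynchronous query per substituter and then waiting on all of the futures.

```cpp
#include <cstdint>
#include <cstdio>
#include <future>
#include <optional>
#include <string>
#include <vector>

struct PathInfo
{
    std::string narHash;
    uint64_t narSize;
};

struct Substituter
{
    std::string url;

    // Stand-in for the per-path lookup a binary cache performs (an HTTP
    // HEAD/GET of the .narinfo file in the real protocol).
    std::optional<PathInfo> queryPathInfo(const std::string & storePath) const
    {
        (void) storePath;
        return std::nullopt; // pretend the path is not present
    }
};

// Ask every substituter about one store path at the same time, then wait for
// all responses before continuing; return the first hit, if any.
std::optional<PathInfo>
queryAll(const std::vector<Substituter> & subs, const std::string & storePath)
{
    std::vector<std::future<std::optional<PathInfo>>> futures;
    for (const auto & sub : subs)
        futures.push_back(std::async(std::launch::async,
            [s = &sub, storePath] { return s->queryPathInfo(storePath); }));

    std::optional<PathInfo> result;
    for (auto & f : futures) { // block until every substituter has answered
        auto info = f.get();
        if (info && !result)
            result = info;
    }
    return result;
}

int main()
{
    std::vector<Substituter> subs{
        {"https://cache.nixos.org"},
        {"https://example.org/another-cache"},
    };
    auto info = queryAll(subs, "/nix/store/<hash>-hello-2.12");
    std::printf("%s\n", info ? "found a substituter" : "no substituter has the path");
}
```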

Bulk query endpoint

A large amount of traffic is generated by the way Nix queries substituters for binaries: currently, we iterate through the closure of dependencies and then through each configured substituter. Ideally, having computed the transitive closure of dependencies, we would fire off one request to the bulk-query endpoint of each substituter, in parallel. This would avoid a large number of HTTP HEAD requests to HTTP binary caches and, for caches backed directly by S3, could lessen the cost of running them by reducing the number of API calls.

Prior art includes Attic (https://github.com/zhaofengli/attic), which has an endpoint to find out which paths are missing (https://github.com/zhaofengli/attic/blob/717cc95983cdc357bc347d70be20ced21f935843/server/src/api/v1/get_missing_paths.rs).
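
A minimal client-side sketch of the idea: postBulkQuery, the endpoint path, and the JSON body in the comment are all hypothetical (the real wire format would need to be pinned down in the RFC, and Attic's existing endpoint may differ in detail).

```cpp
#include <cstdio>
#include <string>
#include <unordered_set>
#include <vector>

using StorePathHashes = std::vector<std::string>;

// Stub standing in for the actual HTTP call and wire format. Conceptually it
// would POST something like
//   { "store_path_hashes": ["abc...", "def...", ...] }
// to a bulk-query endpoint on the substituter and parse the returned list of
// hashes the substituter does not have.
std::unordered_set<std::string>
postBulkQuery(const std::string & substituterUrl, const StorePathHashes & closure)
{
    (void) substituterUrl;
    std::unordered_set<std::string> missing;
    if (!closure.empty())
        missing.insert(closure.back()); // pretend only the last path is missing
    return missing;
}

int main()
{
    // One request carries the entire (already computed) closure, replacing one
    // HEAD request per path per substituter.
    StorePathHashes closure{"hash-of-dep-a", "hash-of-dep-b", "hash-of-the-package"};
    auto missing = postBulkQuery("https://cache.example.org", closure);
    for (const auto & h : missing)
        std::printf("must build or substitute elsewhere: %s\n", h.c_str());
}
```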

Sample tasklist:

  • Gain an understanding of how the HTTP binary store protocol currently works
  • Investigate prior art (Attic; potentially others)
  • Identify stakeholders (e.g., Cachix, Garnix, Flox, Determinate Systems, NixOS Archivists and those with visibility into bandwidth usage for the main NixOS cache)
  • Create an RFC for Nix, collaborating with stakeholders
  • Shepherd the RFC through to approval
  • Implement bulk API endpoint

Single-threaded evaluation speed

Idea one: allocate memory in bulk. Since lists are strict in their length and attribute sets are strict in their keys, the number of values needed is known up front. For lists, for example, the list builder would allocate one contiguous block of memory for the pointers to values and another contiguous block for the values themselves. To implement that, I'd introduce a new allocValues function in eval-inline.hh which allocates multiple values and then increments the global counter for the number of values once (allocValue increments it on each call).
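
A self-contained sketch of the idea; the Value struct, nrValues counter, allocValue, and allocValues below are stand-ins for the evaluator's real definitions, which differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Stand-ins for the evaluator's Value type and its global value counter.
struct Value
{
    uint64_t type; // stand-in for the internal type tag
    union { int64_t integer; double fpoint; void * ptr; } payload;
};

static size_t nrValues = 0;

// Current shape: one allocation and one counter bump per value.
Value * allocValue()
{
    ++nrValues;
    return new Value{};
}

// Proposed shape: one contiguous allocation and one counter bump for `count`
// values; a list builder hands out block[0] .. block[count - 1].
Value * allocValues(size_t count)
{
    nrValues += count;
    return new Value[count]{};
}

int main()
{
    // Build a 1000-element "list": one bulk allocation of values plus one
    // contiguous array of element pointers, as described above.
    constexpr size_t n = 1000;
    Value * elems = allocValues(n);
    std::vector<Value *> listElems(n);
    for (size_t i = 0; i < n; ++i) {
        elems[i].type = 1; // pretend this is the integer tag
        elems[i].payload.integer = static_cast<int64_t>(i);
        listElems[i] = &elems[i];
    }
    std::printf("allocated %zu values in one block\n", nrValues);
    delete[] elems; // the real evaluator leaves reclamation to the GC
}
```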

Idea two: looking at the builtins that handle lists or attribute sets, there is a fair amount of pointer arithmetic, referencing, and dereferencing going on inside for loops. With something like allocValues allocating memory ahead of time (and handling the increment of the global value counter), applying OpenMP's SIMD pragma to some of the pointer-heavy for loops in the builtins might improve performance, provided the length of the list or attribute set is larger than some threshold.
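
A sketch of that on top of the bulk allocation above; the simplified Value stand-in, the copyElements kernel, and the kSimdThreshold cut-off are illustrative rather than taken from the Nix builtins.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Simplified stand-in for the evaluator's Value.
struct Value
{
    uint64_t type;
    int64_t integer;
};

// Hypothetical cut-off below which the plain loop is used.
constexpr size_t kSimdThreshold = 64;

// A builtin-style kernel: copy element payloads out of a list (an array of
// pointers to values) into a freshly bulk-allocated destination block. The
// pragma is a no-op unless compiled with -fopenmp or -fopenmp-simd.
void copyElements(Value * dst, Value * const * src, size_t n)
{
    if (n >= kSimdThreshold) {
        #pragma omp simd
        for (size_t i = 0; i < n; ++i) {
            dst[i].type = src[i]->type;
            dst[i].integer = src[i]->integer;
        }
    } else {
        for (size_t i = 0; i < n; ++i) {
            dst[i].type = src[i]->type;
            dst[i].integer = src[i]->integer;
        }
    }
}

int main()
{
    constexpr size_t n = 256;
    std::vector<Value> elems(n);
    std::vector<Value *> list(n);
    for (size_t i = 0; i < n; ++i) {
        elems[i] = Value{1, static_cast<int64_t>(i)};
        list[i] = &elems[i];
    }
    std::vector<Value> dst(n);
    copyElements(dst.data(), list.data(), n);
    std::printf("last element: %lld\n", static_cast<long long>(dst[n - 1].integer));
}
```

Whether the compiler can profitably vectorize the gather through the element pointers in the real builtins is the open question; the threshold keeps short lists on the plain loop.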

Idea three: regardless of bulk memory/value allocation, I understand cache lines in modern processor architectures are very important. The current Value structure weighs in at 24 bytes on my x64 machine: 8 bytes for the enum (with padding) and 16 bytes for the actual payload. I'm curious whether there would be benefits to getting it down to 16 bytes. One option is something like tagged pointers, though that introduces the need for bit twiddling and much stronger encapsulation than Value currently has. Another is to keep the enum but shrink the payload to 8 bytes: everything that fits in 8 bytes is inlined (Null, Boolean, Integer, Float, string without context (which is just a char pointer), empty/singleton list, etc.), while everything else is represented as a reference to a larger out-of-line structure (path, string with context, lists of two or more elements, attribute sets, etc.). This avoids the bit twiddling but introduces additional dereferences, and I'm not sure how expensive those are on modern processors.
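
A sketch of the second alternative (keep the tag, shrink the payload to 8 bytes); Tag, BigPayload, and CompactValue are illustrative names, not the evaluator's.

```cpp
#include <cstdint>
#include <cstdio>

// Small values are stored inline; anything larger lives behind one pointer to
// an out-of-line structure, at the cost of an extra dereference.
enum class Tag : uint64_t { Null, Boolean, Integer, Float, StringNoContext, SingletonList, Big };

struct BigPayload; // paths, strings with context, lists of >= 2 elements,
                   // attribute sets, ... would be defined out-of-line

struct CompactValue
{
    Tag tag; // 8 bytes (uint64_t-backed enum)
    union {
        bool boolean;
        int64_t integer;
        double fpoint;
        const char * string;      // string without context: just a char pointer
        CompactValue * singleton; // singleton list: pointer to its one element
        BigPayload * big;         // everything else: one extra dereference
    } payload; // 8 bytes
};

static_assert(sizeof(CompactValue) == 16, "tag + payload should fit in 16 bytes");

int main()
{
    CompactValue v;
    v.tag = Tag::Integer;
    v.payload.integer = 42;
    std::printf("sizeof(CompactValue) = %zu\n", sizeof(CompactValue));
}
```

The static_assert documents the size target; whether the extra dereference for the out-of-line cases costs more than the cache-line savings is exactly the open question above.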

nixpkgs-review

nixpkgs-review should:

  • be able to skip packages which use requireFile as their src instead of reporting them as broken every time

Attic

Attic should:

  • be refactored into a client, frontend (HTTP binary cache protocol), and backend (chunk store and NAR assembly)
  • be able to run on Cloudflare via Workers and D1
  • account for the fact that naive object storage (except maybe MinIO) will not be performant, due to the number of small files being fetched simultaneously (https://blog.min.io/challenge-big-data-small-files/)
