Evaluating nixpkgs takes a huge RAM #7308

Closed
tobiasBora opened this issue Nov 15, 2022 · 4 comments

@tobiasBora

Describe the bug

All of the following commands take a lot of RAM (around 1G judging from htop, though time reports about 500M) and a lot of time (10 min, and it gets even worse once the RAM is full and the system starts to swap…) to evaluate on a Raspberry Pi 3B, even when they are a no-op (i.e. the current system is already running this version):

$ nixos-rebuild switch --no-flake
$ nixos-rebuild switch --flake .
$ nix flake lock --update-input …

As far as I see this is not visible on my main laptop… no idea why.

I'm also confused: sometimes nix run nixpkgs#… takes a really long time, and sometimes it takes no time at all.

$ nix run nixpkgs#time # takes a lot of time
$ nix run nixpkgs#hello # takes no time (maybe nixpkgs#time has done some caching?)

Steps To Reproduce

  1. Start NixOS on the Raspberry Pi 3 (you may need to add a swap file, or it will freeze when no more RAM is available)
  2. Run something like this twice (tested on )
# env time -v sudo nixos-rebuild switch --flake github:cwi-foosball/foosball#foosballrasp

I also tried with the non-flake version and I got similar issues:

# /nix/store/jw4jjw6ml5vymjw0yhqg1i9dln12g9k4-time-1.9/bin/time -v sudo nixos-rebuild switch -I nixos-config=configuration.nix -I "nixpkgs=/nix/store/7mffl2hq695yjvgh18vgrpqqn9cr2i1f-source" --no-flake
building Nix...
building the system configuration...
activating the configuration...
setting up /etc...
reloading user units for pi...
setting up tmpfiles
	Command being timed: "sudo nixos-rebuild switch -I nixos-config=configuration.nix -I nixpkgs=/nix/store/7mffl2hq695yjvgh18vgrpqqn9cr2i1f-source --no-flake"
	User time (seconds): 96.84
	System time (seconds): 54.42
	Percent of CPU this job got: 30%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 8:22.45
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 504280
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 97086
	Minor (reclaiming a frame) page faults: 873927
	Voluntary context switches: 173974
	Involuntary context switches: 278897
	Swaps: 0
	File system inputs: 2584120
	File system outputs: 428560
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
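
As a rough cross-check of the report above, the sketch below parses a few of its lines and derives the peak resident set size in MiB and the share of wall-clock time spent on the CPU. It is a minimal Python sketch, not part of the original report; the helper names (parse_report, elapsed_seconds, summarize) are made up for illustration, and the input is a verbatim excerpt of the output above.

```python
# Excerpt of the `time -v` report quoted above (values copied verbatim).
REPORT = """\
User time (seconds): 96.84
System time (seconds): 54.42
Elapsed (wall clock) time (h:mm:ss or m:ss): 8:22.45
Maximum resident set size (kbytes): 504280
Major (requiring I/O) page faults: 97086
"""


def parse_report(text):
    """Map each 'label: value' line of the report to its raw string value."""
    fields = {}
    for line in text.splitlines():
        # Split on the last ': ' because the elapsed-time label contains colons.
        label, sep, value = line.strip().rpartition(": ")
        if sep:
            fields[label] = value
    return fields


def elapsed_seconds(clock):
    """Convert 'h:mm:ss' or 'm:ss.xx' into seconds."""
    seconds = 0.0
    for part in clock.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def summarize(text):
    """Derive a few headline numbers from the parsed report."""
    f = parse_report(text)
    cpu = float(f["User time (seconds)"]) + float(f["System time (seconds)"])
    wall = elapsed_seconds(f["Elapsed (wall clock) time (h:mm:ss or m:ss)"])
    return {
        "peak_rss_mib": int(f["Maximum resident set size (kbytes)"]) / 1024,
        "cpu_seconds": cpu,
        "wall_seconds": wall,
        "cpu_share": cpu / wall,
        "major_faults": int(f["Major (requiring I/O) page faults"]),
    }


if __name__ == "__main__":
    for key, value in summarize(REPORT).items():
        print(f"{key}: {value}")
```

With these numbers, peak RSS comes out at roughly 492 MiB, and user plus system time (151.26 s) is about 30% of the 502.45 s wall-clock time, which matches the "Percent of CPU" line. Together with the 97086 major page faults, that suggests much of the run was spent waiting on I/O rather than evaluating, though the report alone does not prove it.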

Expected behavior

I expect nix to take close to no RAM/time, especially for a no-op operation.

nix-env --version output

2.11.0

@roberth
Member

roberth commented Dec 11, 2022

I'm also confused: sometimes nix run nixpkgs#… takes a really long time, and sometimes it takes no time at all.

$ nix run nixpkgs#time # takes a lot of time
$ nix run nixpkgs#hello # takes no time (maybe nixpkgs#time has done some caching?)

I can think of some effects.

  1. The mutable flake cache. nixpkgs is a mutable flakeref. It may have been downloaded the first time and cached on the second.

  2. Instantiation. Nix creates a bunch of .drv files. If they already exist, this process may be faster.

  3. Evaluation cache. This is basically per installable, and it doesn't cache any dependencies. If we make nixpkgs take its dependencies from its self, this PR might help: Cache the result of getFlake #4511
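
To make "per installable" in (3) concrete, here is a toy Python model. It is only a sketch of the behaviour described above, not how Nix implements its evaluation cache; the class ToyEvalCache, the function fake_evaluate, the fingerprint strings, and the placeholder result are all hypothetical.

```python
class ToyEvalCache:
    """Toy model: one cache entry per (flake fingerprint, attribute path)."""

    def __init__(self, evaluate):
        self._evaluate = evaluate  # stand-in for the expensive evaluation
        self._entries = {}
        self.misses = 0

    def get(self, fingerprint, attr_path):
        key = (fingerprint, attr_path)
        if key not in self._entries:
            # Nothing is shared between installables: every new key
            # pays for a full evaluation.
            self.misses += 1
            self._entries[key] = self._evaluate(attr_path)
        return self._entries[key]


def fake_evaluate(attr_path):
    # Hypothetical placeholder result; a real evaluation would yield a derivation.
    return f"<result of evaluating {attr_path}>"


if __name__ == "__main__":
    cache = ToyEvalCache(fake_evaluate)
    cache.get("rev-A", "time")   # miss: first evaluation of this installable
    cache.get("rev-A", "time")   # hit: same fingerprint, same attribute
    cache.get("rev-A", "hello")  # miss: a different installable
    cache.get("rev-B", "time")   # miss: the flake resolved to a new revision
    print(cache.misses)
```

In this toy model, running hello right after time is still a cache miss, so a speedup there would have to come from (1) or (2). It also shows why (1) matters: if a mutable flakeref resolves to a new revision, the key changes and every entry misses again.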

I expect nix to take close to no RAM/time, especially for a no-op operation.

(2) and (3) may be improved, perhaps.

Improving memory usage is by no means easy. By default, a large number of derivations must be evaluated, and cache invalidation is hard ;).
Nix expressions do tend to hold on to more values and thunks than you might expect, but these are usually needed for computations that could be performed but never are. It'd be interesting to see how this could be improved by significantly changing the interpreter, but that would be a research project, not an easy fix.
That said, we've accepted a number of performance improvements since the 2.4 release (thanks pennae!), and perhaps some fresh eyes (or more of the same eyes) could find more such incremental improvements.
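
The point about thunks keeping values alive can be illustrated outside Nix. The following Python sketch is an analogy only, not the Nix evaluator: the Thunk class, Big, and make_config are invented for the example, and the memory being freed right after forcing relies on CPython's reference counting.

```python
import weakref


class Big:
    """Stand-in for a large evaluated value, e.g. a big attribute set."""

    def __init__(self, n):
        self.data = list(range(n))


class Thunk:
    """A delayed computation that is forced at most once, then memoized."""

    def __init__(self, compute):
        self._compute = compute
        self._value = None
        self.forced = False

    def force(self):
        if not self.forced:
            self._value = self._compute()
            self._compute = None  # drop the closure once the value is known
            self.forced = True
        return self._value


def make_config():
    big = Big(100_000)
    probe = weakref.ref(big)  # lets us observe whether `big` is still alive
    config = {
        "name": "example",
        # The closure captures `big`, so `big` stays alive as long as this
        # thunk is unforced, even if nobody ever asks for "total".
        "total": Thunk(lambda: sum(big.data)),
    }
    return config, probe


if __name__ == "__main__":
    config, probe = make_config()
    print(config["name"], probe() is not None)  # only "name" was needed
    print(config["total"].force(), probe() is None)
```

Reading only "name" never needs the large value, yet it stays in memory because an unforced thunk might still need it. The interpreter cannot know in advance that "total" will never be requested, which is the difficulty described above.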

@tobiasBora
Author

Thanks a lot for the answer. I also ran some tests to see whether the problem comes from flakes or from Nix itself: when I run nixos-rebuild switch twice on a system without flakes, it also takes about 10 min both times, even with no changes. So I guess it takes time to evaluate all the NixOS modules, even if nearly all of them are disabled. Don't know if it would be possible somehow to avoid evaluating useless modules.

@roberth
Member

roberth commented Dec 11, 2022

if it would be possible somehow to avoid evaluating useless modules.

There's NixOS/rfcs#22 which I think is also good for actual modularity, and there's potential for a workaround, although that would tend to make the already complicated module system more complicated.

@stale stale bot added the stale label Jun 18, 2023
@roberth
Member

roberth commented Jul 1, 2023

#8621 is a duplicate of this, but I'm closing this one, as most of the conversation here is about specific observations that do not help with solving the problem.
Any further conversation can continue in the fresh issue.

@roberth roberth closed this as not planned Jul 1, 2023