Excessive CPU/Memory utilization evaluating cross-compiled packages causing OOMs during evaluation #338231

Open
kjeremy opened this issue Aug 29, 2024 · 7 comments
Labels
0.kind: bug (Something is broken); 6.topic: cross-compilation (Building packages on a different platform than they will be used on); 6.topic: python

Comments

@kjeremy
Contributor

kjeremy commented Aug 29, 2024

Describe the bug

Evaluating certain cross-compiled packages uses one to two orders of magnitude more CPU time and memory than evaluating their native counterparts. See the Discourse post and the subsequent Nix investigation linked below.

Steps To Reproduce

Steps to reproduce the behavior:

  1. Use this flake:
{
  inputs.nixpkgs.url = "github:nixos/nixpkgs?rev=c374d94f1536013ca8e92341b540eba4c22f9c62";

  outputs = {
    self,
    nixpkgs,
  }: let
    # pkgsSet is written in a way that it also works as a non-flake
    # default.nix without modifications, issue persists in non flake context
    pkgsSet = {
      nixpkgs ? <nixpkgs>,
      localSystem ? builtins.currentSystem,
    }: let
      # ensure we're cross compiling
      crossSystem = {
          "x86_64-linux" = "aarch64-linux";
          "aarch64-linux" = "x86_64-linux";
        }.${localSystem};

      pkgs = import nixpkgs {
        system = localSystem;
      };

      pkgsCross = import nixpkgs {
        inherit crossSystem localSystem;
      };

      mkLargePython = pkgs:
        pkgs.python3.withPackages (ps:
          builtins.attrValues {
            inherit (ps) numpy matplotlib requests pandas;
          });
    in {
      python = mkLargePython pkgs; # native, no problem
      hello = pkgsCross.hello; # small cross, no problem
      pythonCross = mkLargePython pkgsCross; # big cross, ooms during evaluation
    };
  in {
    packages = nixpkgs.lib.genAttrs ["x86_64-linux" "aarch64-linux"] (system:
      pkgsSet {
        inherit nixpkgs;
        localSystem = system;
      });
  };
}
  2. Run nix eval --no-eval-cache .#python and look at memory usage
  3. Run nix eval --no-eval-cache .#pythonCross and look at memory usage (if this doesn't OOM)

Expected behavior

Both of these should evaluate in reasonable time.

Screenshots

On my machine I have the following results:

| | python | pythonCross |
| --- | --- | --- |
| Time | 1.84 secs | 23.77 secs |
| Memory | 232 MB | 7,445 MB |

Additional context

https://discourse.nixos.org/t/unexpected-massive-memory-usage-when-evaluating-a-derivation-from-a-cross-compiling-nixpkgs/51039/2
NixOS/nix#7698 (comment)

Notify maintainers

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 6.6.45, NixOS, 24.11 (Vicuna), 24.11.20240814.c3aa7b8`
 - multi-user?: `no`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.24.2`
 - channels(root): `"nixos"`
 - channels(jkolb): `""`
 - nixpkgs: `/nix/store/sfycwi72zfjsspidinx56ajaiffpyh17-source`


@kjeremy kjeremy added 0.kind: bug Something is broken 6.topic: cross-compilation Building packages on a different platform than they will be used on labels Aug 29, 2024
@Artturin
Member

Splicing is the cause, and probably the importing of multiple sets contributes too.

@mweinelt
Member

mweinelt commented Sep 11, 2024

The cause is the override function introduced in #169475. Bypassing it makes cross evals of python apps much cheaper.

diff --git a/pkgs/development/interpreters/python/cpython/default.nix b/pkgs/development/interpreters/python/cpython/default.nix
index 56e0be3ea59c..4002c7b619bd 100644
--- a/pkgs/development/interpreters/python/cpython/default.nix
+++ b/pkgs/development/interpreters/python/cpython/default.nix
@@ -132,7 +132,7 @@ let
     # When we override the interpreter we also need to override the spliced versions of the interpreter
     # bluez is excluded manually to break an infinite recursion.
     inputs' = lib.filterAttrs (n: v: n != "bluez" && n != "passthruFun" && ! lib.isDerivation v) inputs;
-    override = attr: let python = attr.override (inputs' // { self = python; }); in python;
+    override = attr: attr;
   in passthruFun rec {
     inherit self sourceVersion packageOverrides;
     implementation = "cpython";
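
For anyone who wants to try this without maintaining a nixpkgs fork, one possible approach is to patch the nixpkgs source at evaluation time. The following is a minimal sketch, not a tested recipe: it assumes the diff above has been saved as `./python-cross-override.patch` (a hypothetical file name), and the `crossSystem` default is only an illustration. Note that it relies on import-from-derivation, which some setups disallow.

```nix
# Sketch only: apply the workaround diff to nixpkgs before importing it.
# ./python-cross-override.patch is a hypothetical file containing the diff above.
{
  nixpkgs ? <nixpkgs>,
  localSystem ? builtins.currentSystem,
  crossSystem ? "aarch64-linux", # illustrative default, adjust as needed
}: let
  # Unpatched native package set, used only to obtain `applyPatches`.
  bootstrap = import nixpkgs { system = localSystem; };

  # Copy of the nixpkgs source tree with the patch applied.
  patchedNixpkgs = bootstrap.applyPatches {
    name = "nixpkgs-python-cross-workaround";
    src = nixpkgs;
    patches = [ ./python-cross-override.patch ];
  };
in
  # Importing a derivation output forces a build during evaluation (IFD).
  import patchedNixpkgs { inherit localSystem crossSystem; }
```

The resulting package set could then stand in for `pkgsCross` in the reproduction flake from the issue description.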

@NickCao
Member

NickCao commented Sep 28, 2024

> The cause is the override function introduced in #169475. Bypassing it makes cross evals of python apps much cheaper.

Ah this is the culprit! Applying the patch fixes the unsolved mystery of evaluation failures plaguing my hydra for an entire year and counting.

@colemickens
Member

colemickens commented Sep 29, 2024

@NickCao am I missing something? what patch?

edit: I'm not smart, you were likely referring to Martin's comment above.

@kjeremy
Contributor Author

kjeremy commented Sep 30, 2024

> > The cause is the override function introduced in #169475. Bypassing it makes cross evals of python apps much cheaper.
>
> Ah this is the culprit! Applying the patch fixes the unsolved mystery of evaluation failures plaguing my hydra for an entire year and counting.

But doesn't that break cross?

@NickCao
Member

NickCao commented Sep 30, 2024

> But doesn't that break cross?

If the interpreter is overridden, yes. But normally that's not the case?
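
To make "overridden" concrete, here is a small sketch of the kind of expression that would presumably be affected. The comment in the diff says the spliced versions of the interpreter need to be overridden along with the interpreter itself; with `override = attr: attr` they would no longer receive the same arguments. The name `myPython` and the choice of `requests` are illustrative assumptions, not taken from this thread.

```nix
# Sketch only: a cross interpreter customised through `override`.
# `myPython` and the overridden package are hypothetical examples.
{ pkgsCross }: let
  myPython = pkgsCross.python3.override {
    # Keep the package set self-referential, as the nixpkgs manual suggests.
    self = myPython;
    # packageOverrides changes the Python package set of this interpreter.
    packageOverrides = final: prev: {
      requests = prev.requests.overridePythonAttrs (old: {
        doCheck = false;
      });
    };
  };
in
  myPython.withPackages (ps: [ ps.requests ])
```

A plain `pkgsCross.python3.withPackages (...)`, as in the reproduction flake, does not call `override` on the interpreter and so should not hit this case.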

@SFrijters
Member

SFrijters commented Oct 5, 2024

FWIW, I just ran into this issue with nix build .#pkgsCross.aarch64-multiplatform.azure-cli (which is a Python package). It got to 30GB+ memory usage with no visible progress before I killed it and started looking at the issue tracker. Patching out the override function indeed bypasses this issue for me.

Edit: Not sure if this is related (or related to the workaround), but although eval is "fixed", the memory usage blows up completely during the installPhase of azure-cli in this cross configuration - default-builder.sh is calling itself recursively (and maybe even exponentially - note the downward line below - that one eventually spawns even more subtrees).

$ ps faux | grep default
nixbld1   255888  1.0  0.0   6020  5148 ?        Ss   23:15   0:00      \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256114  0.0  0.0   5968  4256 ?        S    23:16   0:00          \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256127  0.0  0.0   5968  4236 ?        S    23:16   0:00          |   \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256134  0.0  0.0   5968  4236 ?        S    23:16   0:00          |       \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256146  0.0  0.0   5968  4236 ?        S    23:16   0:00          |           \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256154  0.0  0.0   5968  4236 ?        S    23:16   0:00          |               \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256163  0.0  0.0   5968  4236 ?        S    23:16   0:00          |                   \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256171  0.0  0.0   5968  4236 ?        S    23:16   0:00          |                       \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256178  0.0  0.0   5968  4236 ?        S    23:16   0:00          |                           \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256186  0.0  0.0   5968  4236 ?        S    23:16   0:00          |                               \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256196  0.0  0.0   5968  4236 ?        S    23:16   0:00          |                                   \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256205  0.0  0.0   6104  4236 ?        S    23:16   0:00          |                                       \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256214  0.0  0.0   6104  4268 ?        S    23:16   0:00          |                                           \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256223  0.0  0.0   6104  4208 ?        S    23:16   0:00          |                                               \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256231  0.0  0.0   6104  4208 ?        S    23:16   0:00          |                                                   \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256242  0.0  0.0   6240  4364 ?        S    23:16   0:00          |                                                       \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256251  0.0  0.0   6240  4396 ?        S    23:16   0:00          |                                                           \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256260  0.0  0.0   6240  4344 ?        S    23:16   0:00          |                                                               \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256269  0.0  0.0   6240  4360 ?        S    23:16   0:00          |                                                                   \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256278  0.0  0.0   6376  4496 ?        S    23:16   0:00          |                                                                       \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256287  0.0  0.0   6376  4524 ?        S    23:16   0:00          |                                                                           \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256296  0.0  0.0   6376  4480 ?        S    23:16   0:00          |                                                                               \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
nixbld1   256305  0.0  0.0   6376  4488 ?        S    23:16   0:00          |                                                                                   \_ bash -e /nix/store/v6x3cs394jgqfbi0a42pam708flxaphh-default-builder.sh
[...]

Edit2: This was actually related to #346715; it doesn't explode like this with only the workaround applied on master.
