ucx: enableCuda in an overlay causes infinite recursion #239182
Comments
Will take a look shortly -- that's odd! I was able to build blender just now, will try the others soon.
Looks like nixpkgs/pkgs/development/compilers/cudatoolkit/redist/manifests/redistrib_11.8.0.json Lines 38 to 48 in 437d2a8
It’s also worth noting that it’s new to 11.8, maybe to help support older CUDA code on the ARM-based Grace Hopper chip? That’s why we didn’t see this failure before.
Will be updated as I continue to build them. For packages which aren't available on
Flake reproducer (to be used with
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/e5e5c5e2035f07dd73d0da1afe16a3ee22e35d6e";
  nixConfig = {
    extra-substituters = [
      "https://cantcache.me"
      "https://cuda-maintainers.cachix.org"
    ];
    extra-trusted-substituters = [
      "https://cantcache.me"
      "https://cuda-maintainers.cachix.org"
    ];
    extra-trusted-public-keys = [
      "cantcache.me:Y+FHAKfx7S0pBkBMKpNMQtGKpILAfhmqUSnr5oNwNMs="
      "cuda-maintainers.cachix.org-1:0dq3bujKpuEPMCX6U4WylrUDZ9JyUG0VpVZa7CNfq5E="
    ];
  };
  outputs = inputs: let
    system = "x86_64-linux";
    config = {
      allowUnfree = true;
      cudaSupport = true;
    };
    pkgs = import inputs.nixpkgs {inherit system config;};
  in {
    checks.${system} = inputs.self.packages.${system};
    packages.${system} = {
      inherit
        (pkgs)
        blender
        colmapWithCuda
        tts
        ;
      # Both cuda_compat and libcudla are only available on `aarch64-linux`.
      inherit
        (pkgs.cudaPackages)
        cutensor
        ;
      inherit
        (pkgs.python3Packages)
        jax
        jaxlib
        tensorflowWithCuda
        torch
        torchvision
        ;
    };
    formatter.${system} = pkgs.alejandra;
  };
}
@SomeoneSerge I think I figured out the PyTorch failure. Not sure about the others/unable to reproduce. I suspect that your CI managed to grab a commit in-between when CUDA 11.8 was made the default (which broke dynamically linked Magma due to the binary size increase) and when I made Magma default to static builds for CUDA.
Ok, it seems that the original issue does not belong in nixpkgs, but I don't know how to move issues between repos.
nixpkgs/pkgs/development/compilers/cudatoolkit/common.nix Lines 127 to 131 in d409d42
This is exactly what happens at
Temporary work-around is to update the overlays like so:
Mid-term, we can update
@ConnorBaker thank you a lot for all the tests you've run!
Describe the bug
A regression in the config.cudaSupport = true package set. Affected attributes:
0: "blender"
1: "colmapWithCuda"
2: "cudaPackages.cuda_compat"
3: "cudaPackages.cutensor"
4: "cudaPackages.libcudla"
5: "python3Packages.jax"
6: "python3Packages.jaxlib"
7: "python3Packages.tensorflowWithCuda"
8: "python3Packages.torch"
9: "python3Packages.torchvision"
10: "tts"
Cf. https://hercules-ci.com/github/SomeoneSerge/nixpkgs-cuda-ci/jobs/4832 for logs
Probably caused by the CUDA 11.7 -> 11.8 update.
Notify maintainers
@NixOS/cuda-maintainers