Python 3.10 closure contains incompatible versions of CUDA and g++, causing build failures #217878

Closed
bcdarwin opened this issue Feb 23, 2023 · 2 comments · Fixed by #218035
Labels: 0.kind: bug (Something is broken), 6.topic: cuda (Parallel computing platform and API), 6.topic: python

bcdarwin commented Feb 23, 2023

Describe the bug

As per the title, the CUDA and stdenv (g++) versions currently used to build torchWithCuda appear to be incompatible. This leads to build failures that report the mismatch, specifically in torchvision (see below), but likely in other packages as well.

Likely introduced in #214887 (the stdenv bump itself). Because the failure is dynamic, raised only when a dependent package builds its extension, it is not trivial to add an assert to torch that would catch similar errors in future.

Steps To Reproduce

  1. Build python310Packages.torchvision with a CUDA-enabled torch (e.g. set cudaSupport = true; in config.nix, or override torch = torchWithCuda in torchvision).
  2. Observe that the build fails with the following output:
  File "/nix/store/v9qczq3z12j33pnkzdfmpv7mq6zw9yvm-python3.10-setuptools-65.6.3/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
    self.build_extensions()
  File "/nix/store/z8y26ilyi7qgi9ai3m3dcn7iz6b24b2n-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 499, in build_extensions
    _check_cuda_version(compiler_name, compiler_version)
  File "/nix/store/z8y26ilyi7qgi9ai3m3dcn7iz6b24b2n-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 416, in _check_cuda_version
    raise RuntimeError(
RuntimeError: The current installed version of g++ (12.2.0) is greater than the maximum required version by CUDA 11.7 (11.5.0). Please make sure to use an adequate version of g++ (>=6.0.0, <=11.5.0).
/nix/store/b09v23lirgvci3wzszh22mbkdfj0h0yq-stdenv-linux/setup: line 1582: pop_var_context: head of shell_variables not a function context
error: builder for '/nix/store/yv47nwpr15gggbfh7wjb5hjd6xn4pcxq-python3.10-torchvision-0.14.1.drv' failed with exit code 1
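
For readers unfamiliar with this kind of check, the following is a minimal sketch of a compiler-version bounds check that would produce a similar failure. It is not torch's actual implementation: the names `GXX_BOUNDS_FOR_CUDA`, `parse_version`, and `check_gxx_for_cuda` are hypothetical, and the only bounds included are those quoted in the error message above for CUDA 11.7.

```python
from typing import Dict, Tuple

Version = Tuple[int, ...]

# Assumption: bounds copied from the error message above (CUDA 11.7 only).
# Other CUDA releases have their own bounds, which are not listed here.
GXX_BOUNDS_FOR_CUDA: Dict[str, Tuple[Version, Version]] = {
    "11.7": ((6, 0, 0), (11, 5, 0)),
}


def parse_version(text: str) -> Version:
    """Turn a dotted version string such as '12.2.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def check_gxx_for_cuda(gxx_version: str, cuda_version: str) -> None:
    """Raise RuntimeError if g++ falls outside the range known for this CUDA."""
    minimum, maximum = GXX_BOUNDS_FOR_CUDA[cuda_version]
    found = parse_version(gxx_version)
    # Tuples compare element by element, so (12, 2, 0) > (11, 5, 0).
    if not (minimum <= found <= maximum):
        raise RuntimeError(
            f"g++ {gxx_version} is outside the range supported by "
            f"CUDA {cuda_version}"
        )


if __name__ == "__main__":
    check_gxx_for_cuda("11.3.0", "11.7")  # within bounds, no error
    try:
        check_gxx_for_cuda("12.2.0", "11.7")  # the combination in this issue
    except RuntimeError as err:
        print(err)
```

The sketch also illustrates why the failure is only seen late: the comparison needs the compiler version actually present when the extension is built.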

Notify maintainers

@teh @thoughtpolice @tscholak @NickCao

Metadata

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 4.15.0-169-generic, Ubuntu, 18.04.6 LTS (Bionic Beaver), nobuild`
 - multi-user?: `no`
 - sandbox: `no`
 - version: `nix-env (Nix) 2.14.0pre20230222_4a921ba`
 - channels(bcdarwin): `"home-manager, nixpkgs"`
 - nixpkgs: `/home/bcdarwin/.nix-defexpr/channels/nixpkgs`

samuela commented Feb 28, 2023

cc @NixOS/cuda-maintainers

@github-project-automation github-project-automation bot moved this to 🆕 New in CUDA Team Mar 8, 2023
@ConnorBaker ConnorBaker moved this from 🆕 New to 📋 Backlog in CUDA Team Mar 8, 2023
@ConnorBaker ConnorBaker self-assigned this Mar 8, 2023
@ConnorBaker

This should be fixed when #218035 is merged.

@ConnorBaker ConnorBaker moved this from 📋 Backlog to 👀 In review in CUDA Team Mar 9, 2023
@github-project-automation github-project-automation bot moved this from 👀 In review to ✅ Done in CUDA Team Mar 9, 2023
samuela added a commit that referenced this issue Mar 9, 2023
…compilers

torchvision: fix #217878; migrate to cudaPackages