cudaPackages.cudnn: hash correction and 8.9.6.50 -> 8.9.7.29 #273604

Merged
jonringer merged 2 commits into NixOS:master from feat/cudnn-8_9_7 on Dec 13, 2023

Conversation

ConnorBaker
Contributor

@ConnorBaker ConnorBaker commented Dec 11, 2023

Description of changes

  • Fixes the bad hash for the x86_64-linux CUDA 10.2 release (v7.6.5.32), which was caused by a bad copy-paste
  • Updates 8.9.6.50 -> 8.9.7.29

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@ConnorBaker ConnorBaker added the 6.topic: cuda and backport release-23.11 labels on Dec 11, 2023
@ConnorBaker ConnorBaker self-assigned this Dec 11, 2023
@SomeoneSerge
Contributor

I don't think fq7IA5osMKsLx1jTA1iHZ2k972v0myJIWiwAvy4TbLN had anything to do with bad copy-pasting, by the way. It's the same hash I see in the git blame, so either we broke it much earlier or NVIDIA really has swapped the blob. Either way, we should avoid editing these files manually. What we want instead is a deterministic and idempotent update script that we can run pre-commit style, rather than relying on review: if two users run the updater and get exactly the same output, the hashes pass review. We should probably also wire this into CI.
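
For example, re-fetching the blob and comparing hashes is cheap enough to do whenever something looks off (placeholder URL, not the actual cuDNN download link; needs the experimental nix CLI):

```bash
# Re-derive the upstream hash to check whether the served blob still matches
# what nixpkgs records. The URL is a placeholder, not the real cuDNN link.
nix store prefetch-file \
  "https://developer.download.nvidia.com/compute/.../cudnn-10.2-linux-x64-v7.6.5.32.tgz"
# Compare the printed SRI hash against the one recorded in nixpkgs' cuDNN release manifest.
```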

@ConnorBaker
Contributor Author

ConnorBaker commented Dec 11, 2023

Note

Template nixpkgs-review command:

PR=273604; \
SYSTEM="aarch64-linux"; \
CUDA_SUPPORT="true"; \
CUDA_CAPABILITIES='[ "7.5" ]'; \
nixpkgs-review pr "$PR" \
  --system "$SYSTEM" \
  --no-shell \
  --checkout commit \
  --allow aliases \
  --build-args "--max-jobs 1" \
  --extra-nixpkgs-config "{
    allowUnfree = true;
    allowBroken = false;
    cudaSupport = ${CUDA_SUPPORT:-false};
    cudaCapabilities = ${CUDA_CAPABILITIES:-[]};
  }"

aarch64-darwin

Result of nixpkgs-review pr 273604 --extra-nixpkgs-config '{ allowUnfree = true; allowBroken = false; cudaSupport = false; cudaCapabilities = []; }' run on aarch64-darwin

x86_64-darwin

Result of nixpkgs-review pr 273604 --extra-nixpkgs-config '{ allowUnfree = true; allowBroken = false; cudaSupport = false; cudaCapabilities = []; }' run on x86_64-darwin

aarch64-linux

Jetson

Result of nixpkgs-review pr 273604 --extra-nixpkgs-config '{ allowUnfree = true; allowBroken = false; cudaSupport = true; cudaCapabilities = [ "7.2" ]; }' run on aarch64-linux

4 packages marked as broken and skipped:
  • python310Packages.theanoWithCuda
  • python310Packages.theanoWithCuda.dist
  • python311Packages.theanoWithCuda
  • python311Packages.theanoWithCuda.dist
10 packages built:
  • python310Packages.torchWithCuda
  • python310Packages.torchWithCuda.cxxdev
  • python310Packages.torchWithCuda.dev
  • python310Packages.torchWithCuda.dist
  • python310Packages.torchWithCuda.lib
  • python311Packages.torchWithCuda
  • python311Packages.torchWithCuda.cxxdev
  • python311Packages.torchWithCuda.dev
  • python311Packages.torchWithCuda.dist
  • python311Packages.torchWithCuda.lib

Non-Jetson

Result of nixpkgs-review pr 273604 --extra-nixpkgs-config '{ allowUnfree = true; allowBroken = false; cudaSupport = true; cudaCapabilities = [ "7.5" ]; }' run on aarch64-linux

4 packages marked as broken and skipped:
  • python310Packages.theanoWithCuda
  • python310Packages.theanoWithCuda.dist
  • python311Packages.theanoWithCuda
  • python311Packages.theanoWithCuda.dist
10 packages built:
  • python310Packages.torchWithCuda
  • python310Packages.torchWithCuda.cxxdev
  • python310Packages.torchWithCuda.dev
  • python310Packages.torchWithCuda.dist
  • python310Packages.torchWithCuda.lib
  • python311Packages.torchWithCuda
  • python311Packages.torchWithCuda.cxxdev
  • python311Packages.torchWithCuda.dev
  • python311Packages.torchWithCuda.dist
  • python311Packages.torchWithCuda.lib

Non-CUDA

Result of nixpkgs-review pr 273604 --extra-nixpkgs-config '{ allowUnfree = true; allowBroken = false; cudaSupport = false; cudaCapabilities = []; }' run on aarch64-linux

4 packages marked as broken and skipped:
  • python310Packages.theanoWithCuda
  • python310Packages.theanoWithCuda.dist
  • python311Packages.theanoWithCuda
  • python311Packages.theanoWithCuda.dist
10 packages built:
  • python310Packages.torchWithCuda
  • python310Packages.torchWithCuda.cxxdev
  • python310Packages.torchWithCuda.dev
  • python310Packages.torchWithCuda.dist
  • python310Packages.torchWithCuda.lib
  • python311Packages.torchWithCuda
  • python311Packages.torchWithCuda.cxxdev
  • python311Packages.torchWithCuda.dev
  • python311Packages.torchWithCuda.dist
  • python311Packages.torchWithCuda.lib

x86_64-linux

Non-Jetson

Result of nixpkgs-review pr 273604 --extra-nixpkgs-config '{ allowUnfree = true; allowBroken = false; cudaSupport = true; cudaCapabilities = [ "7.5" ]; }' run on x86_64-linux

25 packages marked as broken and skipped:
  • cudaPackages.tensorrt_8_5
  • cudaPackages.tensorrt_8_5.bin
  • cudaPackages.tensorrt_8_5.dev
  • cudaPackages.tensorrt_8_5.lib
  • cudaPackages.tensorrt_8_5.python
  • cudaPackages.tensorrt_8_5.sample
  • cudaPackages.tensorrt_8_5.static
  • cudaPackagesGoogle.tensorrt_8_5
  • cudaPackagesGoogle.tensorrt_8_5.bin
  • cudaPackagesGoogle.tensorrt_8_5.dev
  • cudaPackagesGoogle.tensorrt_8_5.lib
  • cudaPackagesGoogle.tensorrt_8_5.python
  • cudaPackagesGoogle.tensorrt_8_5.sample
  • cudaPackagesGoogle.tensorrt_8_5.static
  • cudaPackages_11.tensorrt_8_5
  • cudaPackages_11.tensorrt_8_5.bin
  • cudaPackages_11.tensorrt_8_5.dev
  • cudaPackages_11.tensorrt_8_5.lib
  • cudaPackages_11.tensorrt_8_5.python
  • cudaPackages_11.tensorrt_8_5.sample
  • cudaPackages_11.tensorrt_8_5.static
  • python310Packages.theanoWithCuda
  • python310Packages.theanoWithCuda.dist
  • python311Packages.theanoWithCuda
  • python311Packages.theanoWithCuda.dist
13 packages failed to build:
  • katagoWithCuda
  • python310Packages.cupy
  • python310Packages.cupy.dist
  • python310Packages.jaxlibWithCuda
  • python310Packages.jaxlibWithCuda.dist
  • python310Packages.tensorrt
  • python310Packages.tensorrt.dist
  • python311Packages.cupy
  • python311Packages.cupy.dist
  • python311Packages.jaxlibWithCuda
  • python311Packages.jaxlibWithCuda.dist
  • python311Packages.tensorrt
  • python311Packages.tensorrt.dist
16 packages built:
  • cudaPackages.cudnn (cudaPackages.cudnn.dev ,cudaPackages.cudnn.lib ,cudaPackages.cudnn.static ,cudaPackages.cudnn_8_9 ,cudaPackages.cudnn_8_9.dev ,cudaPackages.cudnn_8_9.lib ,cudaPackages.cudnn_8_9.static ,cudaPackagesGoogle.cudnn ,cudaPackagesGoogle.cudnn.dev ,cudaPackagesGoogle.cudnn.lib ,cudaPackagesGoogle.cudnn.static ,cudaPackagesGoogle.cudnn_8_9 ,cudaPackagesGoogle.cudnn_8_9.dev ,cudaPackagesGoogle.cudnn_8_9.lib ,cudaPackagesGoogle.cudnn_8_9.static ,cudaPackages_11.cudnn ,cudaPackages_11.cudnn.dev ,cudaPackages_11.cudnn.lib ,cudaPackages_11.cudnn.static ,cudaPackages_11.cudnn_8_9 ,cudaPackages_11.cudnn_8_9.dev ,cudaPackages_11.cudnn_8_9.lib ,cudaPackages_11.cudnn_8_9.static)
  • cudaPackages.tensorrt (cudaPackages.tensorrt.bin ,cudaPackages.tensorrt.dev ,cudaPackages.tensorrt.lib ,cudaPackages.tensorrt.python ,cudaPackages.tensorrt.sample ,cudaPackages.tensorrt.static ,cudaPackages.tensorrt_8_6 ,cudaPackages.tensorrt_8_6.bin ,cudaPackages.tensorrt_8_6.dev ,cudaPackages.tensorrt_8_6.lib ,cudaPackages.tensorrt_8_6.python ,cudaPackages.tensorrt_8_6.sample ,cudaPackages.tensorrt_8_6.static ,cudaPackagesGoogle.tensorrt ,cudaPackagesGoogle.tensorrt.bin ,cudaPackagesGoogle.tensorrt.dev ,cudaPackagesGoogle.tensorrt.lib ,cudaPackagesGoogle.tensorrt.python ,cudaPackagesGoogle.tensorrt.sample ,cudaPackagesGoogle.tensorrt.static ,cudaPackagesGoogle.tensorrt_8_6 ,cudaPackagesGoogle.tensorrt_8_6.bin ,cudaPackagesGoogle.tensorrt_8_6.dev ,cudaPackagesGoogle.tensorrt_8_6.lib ,cudaPackagesGoogle.tensorrt_8_6.python ,cudaPackagesGoogle.tensorrt_8_6.sample ,cudaPackagesGoogle.tensorrt_8_6.static ,cudaPackages_11.tensorrt ,cudaPackages_11.tensorrt.bin ,cudaPackages_11.tensorrt.dev ,cudaPackages_11.tensorrt.lib ,cudaPackages_11.tensorrt.python ,cudaPackages_11.tensorrt.sample ,cudaPackages_11.tensorrt.static ,cudaPackages_11.tensorrt_8_6 ,cudaPackages_11.tensorrt_8_6.bin ,cudaPackages_11.tensorrt_8_6.dev ,cudaPackages_11.tensorrt_8_6.lib ,cudaPackages_11.tensorrt_8_6.python ,cudaPackages_11.tensorrt_8_6.sample ,cudaPackages_11.tensorrt_8_6.static)
  • cudaPackages_10.cudnn_7_6 (cudaPackages_10.cudnn_7_6.dev ,cudaPackages_10.cudnn_7_6.lib ,cudaPackages_10.cudnn_7_6.static)
  • cudaPackages_12.cudnn (cudaPackages_12.cudnn.dev ,cudaPackages_12.cudnn.lib ,cudaPackages_12.cudnn.static ,cudaPackages_12.cudnn_8_9 ,cudaPackages_12.cudnn_8_9.dev ,cudaPackages_12.cudnn_8_9.lib ,cudaPackages_12.cudnn_8_9.static)
  • cudaPackages_12.tensorrt (cudaPackages_12.tensorrt.bin ,cudaPackages_12.tensorrt.dev ,cudaPackages_12.tensorrt.lib ,cudaPackages_12.tensorrt.python ,cudaPackages_12.tensorrt.sample ,cudaPackages_12.tensorrt.static ,cudaPackages_12.tensorrt_8_6 ,cudaPackages_12.tensorrt_8_6.bin ,cudaPackages_12.tensorrt_8_6.dev ,cudaPackages_12.tensorrt_8_6.lib ,cudaPackages_12.tensorrt_8_6.python ,cudaPackages_12.tensorrt_8_6.sample ,cudaPackages_12.tensorrt_8_6.static)
  • katagoTensorRT
  • python310Packages.torchWithCuda
  • python310Packages.torchWithCuda.cxxdev
  • python310Packages.torchWithCuda.dev
  • python310Packages.torchWithCuda.dist
  • python310Packages.torchWithCuda.lib
  • python311Packages.torchWithCuda
  • python311Packages.torchWithCuda.cxxdev
  • python311Packages.torchWithCuda.dev
  • python311Packages.torchWithCuda.dist
  • python311Packages.torchWithCuda.lib

Non-CUDA

Result of nixpkgs-review pr 273604 --extra-nixpkgs-config '{ allowUnfree = true; allowBroken = false; cudaSupport = false; cudaCapabilities = []; }' run on x86_64-linux

13 packages failed to build:
  • katagoWithCuda
  • python310Packages.cupy
  • python310Packages.cupy.dist
  • python310Packages.jaxlibWithCuda
  • python310Packages.jaxlibWithCuda.dist
  • python310Packages.tensorrt
  • python310Packages.tensorrt.dist
  • python311Packages.cupy
  • python311Packages.cupy.dist
  • python311Packages.jaxlibWithCuda
  • python311Packages.jaxlibWithCuda.dist
  • python311Packages.tensorrt
  • python311Packages.tensorrt.dist
16 packages built:
  • cudaPackages.cudnn (cudaPackages.cudnn.dev ,cudaPackages.cudnn.lib ,cudaPackages.cudnn.static)
  • cudaPackages.tensorrt (cudaPackages.tensorrt.bin ,cudaPackages.tensorrt.dev ,cudaPackages.tensorrt.lib ,cudaPackages.tensorrt.python ,cudaPackages.tensorrt.sample ,cudaPackages.tensorrt.static)
  • cudaPackages_10.cudnn_7_6 (cudaPackages_10.cudnn_7_6.dev ,cudaPackages_10.cudnn_7_6.lib ,cudaPackages_10.cudnn_7_6.static)
  • cudaPackages_12.cudnn (cudaPackages_12.cudnn.dev ,cudaPackages_12.cudnn.lib ,cudaPackages_12.cudnn.static)
  • cudaPackages_12.tensorrt (cudaPackages_12.tensorrt.bin ,cudaPackages_12.tensorrt.dev ,cudaPackages_12.tensorrt.lib ,cudaPackages_12.tensorrt.python ,cudaPackages_12.tensorrt.sample ,cudaPackages_12.tensorrt.static)
  • katagoTensorRT
  • python310Packages.torchWithCuda (python310Packages.pytorchWithCuda)
  • python310Packages.torchWithCuda.cxxdev (python310Packages.pytorchWithCuda.cxxdev)
  • python310Packages.torchWithCuda.dev (python310Packages.pytorchWithCuda.dev)
  • python310Packages.torchWithCuda.dist (python310Packages.pytorchWithCuda.dist)
  • python310Packages.torchWithCuda.lib (python310Packages.pytorchWithCuda.lib)
  • python311Packages.torchWithCuda (python311Packages.pytorchWithCuda)
  • python311Packages.torchWithCuda.cxxdev (python311Packages.pytorchWithCuda.cxxdev)
  • python311Packages.torchWithCuda.dev (python311Packages.pytorchWithCuda.dev)
  • python311Packages.torchWithCuda.dist (python311Packages.pytorchWithCuda.dist)
  • python311Packages.torchWithCuda.lib (python311Packages.pytorchWithCuda.lib)

@ofborg ofborg bot added the 10.rebuild-darwin: 0 and 10.rebuild-linux: 11-100 labels on Dec 11, 2023
@ConnorBaker
Copy link
Contributor Author

> I don't think fq7IA5osMKsLx1jTA1iHZ2k972v0myJIWiwAvy4TbLN had anything to do with bad copy-pasting, by the way. It's the same hash I see in the git blame, so either we broke it much earlier or NVIDIA really has swapped the blob. Either way, we should avoid editing these files manually. What we want instead is a deterministic and idempotent update script that we can run pre-commit style, rather than relying on review: if two users run the updater and get exactly the same output, the hashes pass review. We should probably also wire this into CI.

I agree, but I don't have the bandwidth at the moment to work on that.

I imagine such a script could be implemented as a DFS traversal of the pages at the cuDNN redist URL: if upstream has a version we don't package yet, add it; if there's a newer patch release of a major/minor line we already package, replace ours with it; otherwise, skip it. Roughly along the lines of the sketch below.
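
Untested sketch, not the real updater — I'm assuming the redist root is a plain HTML index linking redistrib_<version>.json manifests that already carry the per-platform sha256 hashes:

```bash
#!/usr/bin/env bash
# Rough sketch of the idempotent updater idea, not the actual nixpkgs script.
# Assumptions: the redist root serves an HTML index whose links contain
# redistrib_<version>.json manifests, and those manifests already list the
# per-platform sha256 hashes.
set -euo pipefail
root="https://developer.download.nvidia.com/compute/cudnn/redist/"

# 1. Collect every manifest version advertised upstream.
versions=$(curl -fsSL "$root" \
  | grep -oE 'redistrib_[0-9]+(\.[0-9]+)+\.json' \
  | sed -E 's/redistrib_(.+)\.json/\1/' \
  | sort -uV)

# 2. Keep only the newest patch release of each major.minor line
#    (input is sorted ascending, so the last entry per key wins).
newest=$(awk -F. '{latest[$1"."$2] = $0} END {for (k in latest) print latest[k]}' <<<"$versions" | sort -V)

# 3. Print the upstream manifests verbatim. Two runs against the same upstream
#    state give byte-identical output, so reviewers can re-run and diff instead
#    of eyeballing hashes.
for v in $newest; do
  curl -fsSL "${root}redistrib_${v}.json"
done
```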

@ConnorBaker
Contributor Author

@SomeoneSerge can you confirm that jaxlib and cupy are broken on master?

@SomeoneSerge
Contributor

SomeoneSerge commented Dec 12, 2023

@ConnorBaker yes https://hercules-ci.com/github/SomeoneSerge/nixpkgs-cuda-ci/jobs/7448

> I agree, but I don't have the bandwidth at the moment to work on that.

Yeah, this is definitely not the top priority.

Contributor

@jonringer jonringer left a comment


otherwise seems fine

@jonringer jonringer merged commit 987845b into NixOS:master Dec 13, 2023
30 of 31 checks passed
@ConnorBaker ConnorBaker deleted the feat/cudnn-8_9_7 branch December 13, 2023 02:46