Update Slurm-GCP to 6.8.2 #3132

Merged
tpdownes merged 1 commit into GoogleCloudPlatform:develop from update_slurm_gcp on Oct 15, 2024

tpdownes
Member

Brings in the new default NVIDIA driver, 550.90.12, which resolves several known issues, including NCCL timeout errors.

https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-550-90-12/index.html

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

tpdownes requested a review from mr0re1 on October 15, 2024 18:33
tpdownes added the release-version-updates label (Added to release notes under the "Version Updates" heading.) on Oct 15, 2024
mr0re1 assigned tpdownes and unassigned mr0re1 on Oct 15, 2024
tpdownes merged commit d1b6fbb into GoogleCloudPlatform:develop on Oct 15, 2024
11 of 55 checks passed
tpdownes deleted the update_slurm_gcp branch on October 15, 2024 20:02
harshthakkar01 mentioned this pull request on Oct 24, 2024
2 participants