v2.10.6
arnaudfroidmont released this 17 May 16:55
What's Changed
- Add health checks for GPU nodes in Slurm (on idle nodes and at job start)
- Scratch from NVMe is no longer the default
- OCI Algo Tuner example for A100
- GPU and RDMA monitoring turned on
- New images
- Bug fixes