
{lib}[gcccuda/2020b] NCCL v2.8.4-1 #12183

Closed


@branfosj (Member) commented Feb 17, 2021

(created using eb --new-pr)

easyblock in easybuilders/easybuild-easyblocks#2337 and minor fix in easybuilders/easybuild-easyblocks#2460

Fixes #12180

@branfosj branfosj marked this pull request as draft February 17, 2021 10:35
@Micket (Contributor) commented Feb 17, 2021

Should we limit it to the cuda compute capabilities that we have?

> By default, NCCL is compiled for all supported architectures. To accelerate the compilation and reduce the binary size, consider redefining NVCC_GENCODE (defined in makefiles/common.mk) to only include the architecture of the target platform:
>
> $ make -j src.build NVCC_GENCODE="-gencode=arch=compute_70,code=sm_70"

@Flamefire (Contributor)

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
taurusml3 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), Python 2.7.5
See https://gist.github.com/ce8638c4a91f592497eef0bb085cb915 for a full test report.

@Flamefire (Contributor)

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
taurusi8019 - Linux centos linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor (zen2), Python 2.7.5
See https://gist.github.com/711ee113b01079ec87b6c5aff0ebfa6f for a full test report.

@boegel (Member) commented Feb 17, 2021

> Should we limit it to the cuda compute capabilities that we have?

Yes, but ideally that's done in a custom easyblock; it may be difficult to do cleanly in an easyconfig (i.e. handling the case when --cuda-compute-capabilities is not set).

@branfosj (Member, Author)

> > Should we limit it to the cuda compute capabilities that we have?
>
> Yes, but ideally that's done in a custom easyblock, may be difficult to do it cleanly in an easyconfig (i.e. handle the case when --cuda-compute-capabilities is not set)

I think we can just fall back to passing nothing to the NCCL build and let NCCL build the fat binary.
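The fallback described here can be sketched as follows. This is only an illustration of the logic, not EasyBuild's or the easyblock's actual API; the function name and the capability-list format are assumptions:

```python
def nccl_gencode_opts(cuda_compute_capabilities):
    """Build an NVCC_GENCODE make option from a list of CUDA compute
    capabilities (e.g. ['7.0', '8.0']). Returns an empty string when no
    capabilities are configured, so NCCL falls back to its default fat
    binary covering all supported architectures."""
    gencodes = []
    for cc in cuda_compute_capabilities or []:
        sm = cc.replace('.', '')  # e.g. '7.0' -> '70'
        gencodes.append('-gencode=arch=compute_%s,code=sm_%s' % (sm, sm))
    if not gencodes:
        return ''  # pass nothing: NCCL builds the fat binary
    return 'NVCC_GENCODE="%s"' % ' '.join(gencodes)

# The result would be appended to the build command, e.g.:
#   make -j src.build NVCC_GENCODE="-gencode=arch=compute_70,code=sm_70"
print(nccl_gencode_opts(['7.0']))
```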

@branfosj branfosj marked this pull request as ready for review February 18, 2021 17:07
@branfosj branfosj added this to the 4.x milestone Feb 19, 2021
@verdurin (Member)

Test report by @verdurin
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2337
SUCCESS
Build succeeded for 4 out of 4 (1 easyconfigs in total)
easybuild-c7.novalocal - Linux centos linux 7.9.2009, x86_64, Intel Xeon Processor (Skylake, IBRS), Python 3.6.8
See https://gist.github.com/c685c3d2b31a200a6c4275273a4c4fbf for a full test report.

@branfosj (Member, Author) commented Jun 7, 2021

Test report by @branfosj
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2337
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0212u15b.bear.cluster - Linux RHEL 8.3, x86_64, Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz (broadwell), Python 3.6.8
See https://gist.github.com/3417e645d85399498e9a6bad3bab4974 for a full test report.

@branfosj (Member, Author) commented Jun 7, 2021

Test report by @branfosj
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2337
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0212u15b.bear.cluster - Linux RHEL 8.3, x86_64, Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz (broadwell), Python 3.6.8
See https://gist.github.com/78ba58131ace34827f40bfd84ed77838 for a full test report.

@branfosj (Member, Author) commented Jun 7, 2021

Test report by @branfosj
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2337
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0212u15b.bear.cluster - Linux RHEL 8.3, x86_64, Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz (broadwell), Python 3.6.8
See https://gist.github.com/7e7d884dc2fa2453291de987dcf031e5 for a full test report.

description = """The NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multi-node collective
communication primitives that are performance optimized for NVIDIA GPUs."""

toolchain = {'name': 'gcccuda', 'version': '2020b'}
Review comment by a Contributor on the toolchain line:
I'm thinking about the toolchain for this. Previously we had SYSTEM-CUDA; now it's (full) gcccuda, which means it can't be used with Intel toolchains, can it?
So maybe GCCcore with a CUDA dependency and a version suffix?
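The suggested alternative might look like the following easyconfig fragment. This is a sketch only: the GCCcore and CUDA versions are illustrative guesses, not taken from this PR, and `SYSTEM` is EasyBuild's system-toolchain constant:

```python
# Sketch of the GCCcore-plus-CUDA-dependency layout suggested above;
# versions here are illustrative, not from this PR.
name = 'NCCL'
version = '2.8.4'
versionsuffix = '-CUDA-%(cudaver)s'

toolchain = {'name': 'GCCcore', 'version': '10.2.0'}

# CUDA as a dependency rather than part of the toolchain, so the module
# can be combined with non-GCC compiler toolchains built on GCCcore.
dependencies = [('CUDA', '11.1.1', '', SYSTEM)]
```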

@easybuilders easybuilders deleted a comment from boegelbot Jun 8, 2021
@easybuilders easybuilders deleted a comment from boegelbot Jun 8, 2021
@branfosj (Member, Author) commented Jun 9, 2021

Closing this. Using #13071 instead.

@branfosj branfosj closed this Jun 9, 2021
@branfosj branfosj deleted the 20210217103451_new_pr_NCCL2841 branch August 19, 2021 18:02
Linked issue (may be closed by this PR): Switch to building NCCL
5 participants