From d1bfd0984c5e0cf986e509aeba75371e5c6e5e85 Mon Sep 17 00:00:00 2001
From: Max Katz
Date: Wed, 29 Jan 2020 14:05:40 -0800
Subject: [PATCH 1/2] Note Spectrum MPI + Nsight Compute incompatibility

---
 systems/summit_user_guide.rst | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/systems/summit_user_guide.rst b/systems/summit_user_guide.rst
index 0a12fc4b..53b60d76 100644
--- a/systems/summit_user_guide.rst
+++ b/systems/summit_user_guide.rst
@@ -3169,6 +3169,41 @@ Last Updated: 04 December 2019
 Open Issues
 -----------
 
+Nsight Compute cannot be used with MPI programs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When profiling an MPI application using NVIDIA Nsight Compute, like the following,
+you may see an error message in Spectrum MPI that aborts the program:
+
+::
+
+    jsrun -n 1 -a 1 -g 1 nv-nsight-cu-cli ./a.out
+
+    Error: common_pami.c:1049 - ompi_common_pami_init() Unable to create PAMI client (rc=1)
+    --------------------------------------------------------------------------
+    No components were able to be opened in the pml framework.
+
+    This typically means that either no components of this type were
+    installed, or none of the installed components can be loaded.
+    Sometimes this means that shared libraries required by these
+    components are unable to be found/loaded.
+
+    Host:
+    Framework: pml
+    --------------------------------------------------------------------------
+    PML pami cannot be selected
+
+This is due to an incompatibility in the 2019.x versions of Nsight Compute with
+Spectrum MPI. As a workaround, you can disable CUDA hooks in Spectrum MPI using
+
+::
+
+    jsrun -n 1 -a 1 -g 1 --smpiargs="-disable_gpu_hooks" nv-nsight-cu-cli ./a.out
+
+Unfortunately, this is incompatible with using CUDA-aware MPI in your application.
+
+This will be resolved in a future release of CUDA.
+
 CUDA hook error when program uses CUDA without first calling MPI_Init()
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 

From c9c0b61ca53fb46f0595c1e4058b4ada45a086c4 Mon Sep 17 00:00:00 2001
From: Max Katz
Date: Sat, 1 Feb 2020 11:10:41 -0800
Subject: [PATCH 2/2] Note new last updated date

---
 systems/summit_user_guide.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/systems/summit_user_guide.rst b/systems/summit_user_guide.rst
index 53b60d76..1c6a3818 100644
--- a/systems/summit_user_guide.rst
+++ b/systems/summit_user_guide.rst
@@ -3164,7 +3164,7 @@ please see the `Vampir Software Page