Skip to content

Latest commit

 

History

History
239 lines (170 loc) · 10.3 KB

spack_instructions.md

File metadata and controls

239 lines (170 loc) · 10.3 KB

Spack Instructions

Table of contents

OpenMP

  • There are 3 offloading options for OpenMP: NVIDIA, AMD and Intel.
  • If a user provides a value for cuda_arch, the execution will be automatically offloaded to NVIDIA.
  • If a user provides a value for amdgpu_target, the operation will be offloaded to AMD.
  • In the absence of cuda_arch and amdgpu_target, the execution will be offloaded to Intel.
Flag Definition
cuda_arch - List of supported compute capabilities are provided here
- Useful link for matching CUDA gencodes with NVIDIA architectures
amdgpu_target List of supported architectures are provided here
# Example 1: for Intel offload
 $ spack install babelstream%oneapi +omp 

# Example 2: for Nvidia GPU for Volta (sm_70) 
 $ spack install babelstream +omp cuda_arch=70 
 
# Example 3: for AMD GPU gfx701 
 $ spack install babelstream +omp amdgpu_target=gfx701 

OpenCL

  • No need to specify amdgpu_target or cuda_arch here since we are using AMD and CUDA as backend respectively.
Flag Definition
backend 4 different backend options:
- cuda
- amd
- intel
- pocl
# Example 1:  CUDA backend
 $ spack install babelstream%gcc +ocl backend=cuda

# Example 2:  AMD backend 
 $ spack install babelstream%gcc +ocl backend=amd
 
# Example 3:  Intel backend
 $ spack install babelstream%gcc +ocl backend=intel

# Example 4:  POCL backend
 $ spack install babelstream%gcc +ocl backend=pocl

STD

  • Minimum GCC version requirement 10.1.0
  • NVHPC Offload will be added in the future release
# Example 1:  data 
 $ spack install babelstream +stddata

# Example 2:  ranges
 $ spack install babelstream +stdranges
 
# Example 3:  indices
 $ spack install babelstream +stdindices

HIP(ROCM)

  • amdgpu_target and flags are optional here.
Flag Definition
amdgpu_target List of supported architectures are provided here
flags Extra flags to pass
# Example 1:  ROCM default
 $ spack install babelstream +rocm

# Example 2:  ROCM with GPU target
 $ spack install babelstream +rocm amdgpu_target=<gfx701>
 
# Example 3:  ROCM with extra flags option
 $ spack install babelstream +rocm flags=<xxx>

# Example 4:  ROCM with GPU target and extra flags
 $ spack install babelstream +rocm amdgpu_target=<gfx701> flags=<xxx>

CUDA

  • The cuda_arch value is mandatory here.
  • If a user provides a value for mem, device memory mode will be chosen accordingly
  • If a user provides a value for flags, additional CUDA flags will be passed to NVCC
  • In the absence of mem and flags, the execution will choose DEFAULT for device memory mode and no additional flags will be passed
Flag Definition
cuda_arch - List of supported compute capabilities are provided here
- Useful link for matching CUDA gencodes with NVIDIA architectures
mem Device memory mode:
- DEFAULT allocate host and device memory pointers.
- MANAGED use CUDA Managed Memory.
- PAGEFAULT shared memory, only host pointers allocated
flags Extra flags to pass
# Example 1: CUDA no mem and flags specified
 $ spack install babelstream +cuda cuda_arch=<70>

# Example 2: for Nvidia GPU for Volta (sm_70) 
 $ spack install babelstream +cuda cuda_arch=<70> mem=<managed>
 
# Example 3: CUDA with mem and flags specified
 $ spack install babelstream +cuda cuda_arch=<70> mem=<managed> flags=<CUDA_EXTRA_FLAGS> 

Kokkos

  • Kokkos implementation requires kokkos source folder to be provided because it builds it from the scratch
Flag Definition
dir Download the kokkos release from github repository ( https://github.com/kokkos/kokkos ) and extract the zip file to a directory you want and target this directory with dir flag
backend 2 different backend options:
- cuda
- omp
cuda_arch - List of supported compute capabilities are provided here
- Useful link for matching CUDA gencodes with NVIDIA architectures
# Example 1:  No Backend option specified
 $ spack install babelstream +kokkos dir=</home/user/Downloads/kokkos-x.x.xx>

# Example 2:  CUDA backend 
 $ spack install babelstream +kokkos backend=cuda cuda_arch=70 dir=</home/user/Downloads/kokkos-x.x.xx>
 
# Example 3:  OMP backend
 $ spack install babelstream +kokkos  backend=omp dir=</home/user/Downloads/kokkos-x.x.xx>

SYCL2020

  • Instructions for installing the intel compilers are provided here
Flag Definition
implementation 3 different implementation options:
- OneAPI-ICPX
- OneAPI-DPCPP
- Compute-CPP
# Example 1:  No implementation option specified (build for OneAPI-ICPX)
 $ spack install babelstream%oneapi +sycl2020

# Example 2:  OneAPI-DPCPP implementation 
 $ spack install babelstream +sycl2020 implementation=ONEAPI-DPCPP

SYCL

Flag Definition
implementation 2 different implementation options:
- OneAPI-DPCPP
- Compute-CPP
# Example 1:  OneAPI-DPCPP implementation 
 $ spack install babelstream +sycl2020 implementation=ONEAPI-DPCPP

ACC

  • Target device selection process is automatic with 2 options:
    • gpu : Globally set the target device to an NVIDIA GPU automatically if cuda_arch is specified
    • multicore : Globally set the target device to the host CPU automatically if cpu_arch is specified
Flag Definition
cuda_arch - List of supported compute capabilities are provided here
- Useful link for matching CUDA gencodes with NVIDIA architectures
CPU_ARCH This sets the -tp (target processor) flag, possible values are:
px - Generic x86 Processor
bulldozer - AMD Bulldozer processor
piledriver - AMD Piledriver processor
zen - AMD Zen architecture (Epyc, Ryzen)
zen2 - AMD Zen 2 architecture (Ryzen 2)
sandybridge - Intel SandyBridge processor
haswell - Intel Haswell processor
knl - Intel Knights Landing processor
skylake - Intel Skylake Xeon processor
host - Link native version of HPC SDK cpu math library
native - Alias for -tp host
# Example 1:  For GPU Run 
 $ spack install babelstream +acc cuda_arch=<70>

# Example 2:  For Multicore CPU Run 
 $ spack install babelstream +acc cpu_arch=<bulldozer>

RAJA

  • RAJA implementation requires RAJA source folder to be provided because it builds it from the scratch
Flag Definition
dir Download the Raja release from github repository and extract the zip file to a directory you want and target this directory with dir flag
backend 2 different backend options:
- cuda
- omp
offload Choose offloading platform offload= [cpu]/[nvidia]
# Example 1:  For CPU offload with backend OMP 
 $ spack install babelstream +raja offload=cpu backend=omp dir=/home/dir/raja

TBB

# Example: 
 $ spack install babelstream +tbb

THRUST

Flag Definition
implementation Choose one of the implementation for Thrust. Options are cuda and rocm
backend CUDA's Thrust implementation supports the following backends:- CUDA- OMP - TBB
cuda_arch - List of supported compute capabilities are provided here
- Useful link for matching CUDA gencodes with NVIDIA architectures
flags Additional CUDA flags passed to nvcc, this is appended after CUDA_ARCH
# Example1: CUDA implementation
$ spack install babelstream +thrust implementation=cuda backend=cuda cuda_arch=<70> flags=<option>
$ spack install babelstream +thrust implementation=cuda backend=omp cuda_arch=<70> flags=<option>
$ spack install babelstream +thrust implementation=cuda backend=tbb cuda_arch=<70> flags=<option>
# Example1: ROCM implementation
*  spack install babelstream +thrust implementation=rocm backend=<option>