This is from Chapter 10 of Parallel and High Performance Computing by Robey and Zamora (Manning Publications), available at http://manning.com
The book may be obtained at http://www.manning.com/?a_aid=ParallelComputingRobey
Copyright 2019-2020 Robert Robey, Yuliana Zamora, and Manning Publications
Emails: [email protected], [email protected]
See License.txt for licensing information.
Build with make:
cd lambda && make
Run with:
./lambda
Build with make:
cd BabelStream && export EXTRA_FLAGS='-Xptxas="-v"' && make -f CUDA.make
Look at the compiler output for the number of registers each kernel uses.
https://docs.nvidia.com/cuda/cuda-occupancy-calculator/index.html
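The register count reported by the compiler feeds into an occupancy estimate. As a rough sketch of that arithmetic (not the calculator's full method), the script below estimates register-limited occupancy. Every number in it is an assumption for illustration: the register count, block size, and per-SM limits are hypothetical placeholders to replace with the values for your kernel and GPU. It ignores shared memory, register allocation granularity, and the per-SM block limit, so treat the result as an approximation.

```sh
#!/bin/sh
# Rough register-limited occupancy estimate.
# All values below are assumptions for illustration only.
regs_per_thread=40        # hypothetical: register count from the ptxas -v output
threads_per_block=1024    # hypothetical: threads per block used by the kernel
regs_per_sm=65536         # assumed: 32-bit registers available per SM
max_threads_per_sm=2048   # assumed: maximum resident threads per SM

# Registers needed by one block, and how many blocks fit by each limit
regs_per_block=$((regs_per_thread * threads_per_block))
blocks_by_regs=$((regs_per_sm / regs_per_block))
blocks_by_threads=$((max_threads_per_sm / threads_per_block))

# The tighter of the two limits decides how many blocks are resident
if [ "$blocks_by_regs" -lt "$blocks_by_threads" ]; then
    blocks=$blocks_by_regs
else
    blocks=$blocks_by_threads
fi

# Occupancy = resident threads / maximum resident threads, as a percentage
occupancy_pct=$((100 * blocks * threads_per_block / max_threads_per_sm))

echo "resident blocks per SM: $blocks"
echo "estimated occupancy: ${occupancy_pct}%"
```

With these assumed values, registers allow only one 1024-thread block per SM where the thread limit would allow two, so the estimate is 50%. Lowering the register count or the block size changes which limit applies.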
cd BabelStream
make -f OpenCL.make
rcprof -p -O -o `pwd`/Session1.csv ./ocl-stream
rcprof --occupancydisplay Session1.occupancy --occupancyindex 4 -o occupancy4.html