Testing

clSPARSE Tests

The clSPARSE test suite is written with the googletest framework. The root of the test sources is located in ./src/tests/. Several test executables are built as part of the test project, and each test executable focuses on a different domain. Currently, the domains under test are:

*  test-blas1                Tests focusing on clSPARSE's level 1 BLAS routines (vector x vector)
*  test-blas2                Tests focusing on clSPARSE's level 2 BLAS routines (matrix x vector)
*  test-blas3                Tests focusing on clSPARSE's level 3 BLAS routines (matrix x matrix)
*  test-clsparse-utils       Tests focusing on clSPARSE's library interface (setup, teardown, C++ interfaces)
*  test-conversion           Tests focusing on clSPARSE's matrix format conversion routines
*  test-solvers              Tests focusing on clSPARSE's solver routines (various conjugate gradient routines and preconditioners)

Setting up tests to run

Every target of the build system is compiled and placed into a 'staging' directory, in both GUI and command line based build environments. This is a convenience for library developers: because the executables and the libraries are copied into the same directory, dependencies are trivially resolved. To start testing clSPARSE, open a terminal window (Windows, macOS or Linux), change into the project's staging directory and run the test program of interest.

The clSPARSE test executables currently use Boost uBLAS as the reference for our results. For debug builds, we have found uBLAS to be slow at times, because expression templates are disabled and because of the extra checking it performs (see the end of this FAQ). We recommend testing with release builds to validate the correctness of the library. Debug builds of the test programs are only needed when you have to step through the test code. Below is an example of the help message given by the test-blas2 program. The test program takes various parameters that control the scope of the test cases to be executed.

test-blas2.exe --help
Parsing command line options...
Error: the option '--path' is required but missing
Allowed options:
  -h [ --help ]                Produce this message.
  -p [ --path ] arg            Path to matrix in mtx format.
  -l [ --platform ] arg (=AMD) OpenCL platform: AMD or NVIDIA.
  -d [ --device ] arg (=0)     Device id within platform.
  -a [ --alpha ] arg (=1)      Alpha parameter for eq:
                               y = alpha * M * x + beta * y
  -b [ --beta ] arg (=1)       Beta parameter for eq:
                               y = alpha * M * x + beta * y
  -e [ --extended ]            Use compensated summation to improve accuracy by
                               emulating extended precision.

Since the test framework is based on googletest, gtest filters can be applied to narrow testing to only a fraction of the overall tests. The gtest_filter flag takes a wildcard pattern ('*' matches any string, '?' matches any single character) that it matches against the full test name, and each test name is unique. Several patterns can be combined by separating them with the ':' character. Negative patterns, which filter tests out, are listed after a '-' character. A complicated expression can be created using both positive and negative patterns. This is just standard googletest filter notation.
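To make the filter notation concrete, here is a minimal Python sketch of how a googletest filter string selects test names. It is only a model of the notation, not googletest's own code; the function names `wildcard_match` and `is_selected` are hypothetical, and the test names are the ones printed by test-blas2 in the runs on this page.

```python
import re

def wildcard_match(pattern, name):
    """Match a googletest-style pattern: '*' is any string, '?' is any one character."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), name) is not None

def is_selected(name, gtest_filter):
    """Model of the filter notation: positive patterns, then '-' and negative patterns."""
    positive, _, negative = gtest_filter.partition("-")
    # A filter that starts with '-' has no positive part, which means "everything".
    positive_patterns = positive.split(":") if positive else ["*"]
    negative_patterns = negative.split(":") if negative else []
    matches_positive = any(wildcard_match(p, name) for p in positive_patterns)
    matches_negative = any(wildcard_match(p, name) for p in negative_patterns)
    return matches_positive and not matches_negative

# Test names as printed by test-blas2.
TEST_NAMES = [
    "Blas2/0.csrmv_adaptive",
    "Blas2/0.csrmv_vector",
    "Blas2/1.csrmv_adaptive",
    "Blas2/1.csrmv_vector",
]

def select(gtest_filter):
    return [name for name in TEST_NAMES if is_selected(name, gtest_filter)]

if __name__ == "__main__":
    for f in ["Blas2/0.*", "*csrmv_adaptive", "Blas2/*-*csrmv_vector", "-Blas2/1.*"]:
        print(f, "->", select(f))
```

On the command line, the same filter string is passed with the standard googletest flag, for example --gtest_filter=Blas2/0.* to run only the float tests.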

One requirement of running the clSPARSE tests is a set of sparse matrix data files; currently the test programs cannot generate random sparse matrices on the fly. A good source of pre-baked Matrix Market sparse matrix data files is the University of Florida Sparse Matrix collection. To save some time, the clSPARSE build system automates the downloading of select sparse matrices, namely the ones that we use in our benchmark runs. The process of using the clSPARSE superbuild to download sparse matrix files is documented in our build wiki page. Once a Matrix Market sparse data file is present, the test programs expect the path to the sparse matrix file to be passed through the '--path' command line option.

test-blas2.exe -p "C:\src-bin\MTX\Bell_Garland\cant\cant.mtx"
AMD Accelerated Parallel Processing 0
Using device Hawaii
key: [control/control] -cl-kernel-arg-info -cl-std=CL1.2 -DWG_SIZE=256 hash = 174490369
kernel not found in cache: 174490369
Matrix: C:\src-bin\MTX\Bell_Garland\cant\cant.mtx [nRow: 62451] [nCol: 62451] [nNZ: 4069834]
[validateMemObject] Buffer size: 32,558,672 bytes. Required size: 32,059,064 bytes.
[validateMemObject] Buffer size: 16,279,336 bytes. Required size: 16,029,532 bytes.
[validateMemObject] Buffer size: 249,808 bytes. Required size: 249,808 bytes.
[==========] Running 4 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 2 tests from Blas2/0, where TypeParam = float
[ RUN      ] Blas2/0.csrmv_adaptive
key: [csrmv_adaptive/csrmv_adaptive] -cl-kernel-arg-info -cl-std=CL1.2 -DROWBITS=32 -DWGBITS=24 -DVALUE_TYPE=float -DWG_SIZE=256 -DBLOCKSIZE=1024 -DBLOCK_MULTIPLIER=3 -DROWS_FOR_VECTOR=1 -DINDEX_TYPE=uint hash = 4124998496
kernel not found in cache: 4124998496
[       OK ] Blas2/0.csrmv_adaptive (281 ms)
[ RUN      ] Blas2/0.csrmv_vector
key: [csrmv_general/csrmv_general] -cl-kernel-arg-info -cl-std=CL1.2 -DVALUE_TYPE=float -DSIZE_TYPE=ulong -DWG_SIZE=256 -DWAVE_SIZE=64 -DSUBWAVE_SIZE=64 -DINDEX_TYPE=uint hash = 2083389547
kernel not found in cache: 2083389547
[       OK ] Blas2/0.csrmv_vector (240 ms)
[----------] 2 tests from Blas2/0 (521 ms total)

[----------] 2 tests from Blas2/1, where TypeParam = double
[ RUN      ] Blas2/1.csrmv_adaptive
key: [csrmv_adaptive/csrmv_adaptive] -cl-kernel-arg-info -cl-std=CL1.2 -DROWBITS=32 -DWGBITS=24 -DVALUE_TYPE=double -DWG_SIZE=256 -DBLOCKSIZE=1024 -DBLOCK_MULTIPLIER=3 -DROWS_FOR_VECTOR=1 -DINDEX_TYPE=uint -DDOUBLE hash = 2456098779
kernel not found in cache: 2456098779
[       OK ] Blas2/1.csrmv_adaptive (311 ms)
[ RUN      ] Blas2/1.csrmv_vector
key: [csrmv_general/csrmv_general] -cl-kernel-arg-info -cl-std=CL1.2 -DVALUE_TYPE=double -DSIZE_TYPE=ulong -DWG_SIZE=256 -DWAVE_SIZE=64 -DSUBWAVE_SIZE=64 -DINDEX_TYPE=uint -DDOUBLE hash = 1221582318
kernel not found in cache: 1221582318
[       OK ] Blas2/1.csrmv_vector (267 ms)
[----------] 2 tests from Blas2/1 (578 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 2 test cases ran. (1102 ms total)
[  PASSED  ] 4 tests.

NOTE: Compensated summation currently only applies to SpM-dV

If a particular sparse matrix file appears to fail the reference check against the CPU, but the error delta looks small, try rerunning the test with the '--extended' flag. This tells the library to run a special SpM-dV kernel that uses compensated summation to improve GPU accuracy, at the cost of a modest performance loss. OpenCL device results can differ from results computed on CPU devices for many reasons, including parallel reordering of math operations (which treats additions and multiplications as associative, although floating point arithmetic is not associative) and differences in device support for floating point denormals.
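The effect of summation order, and what compensated summation buys back, can be seen in a small Python sketch. This is not the clSPARSE kernel: it uses Kahan summation on double precision values as a stand-in for whatever compensated technique the library applies, and the input values are chosen purely for illustration.

```python
def naive_sum(values):
    """Plain left-to-right accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total

def kahan_sum(values):
    """Kahan compensated summation (illustrative stand-in, not the library kernel)."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost by earlier additions
    for v in values:
        y = v - comp            # re-inject the error lost so far
        t = total + y           # low-order bits of y may be rounded away here
        comp = (t - total) - y  # recover what was just rounded away
        total = t
    return total

# One large value followed by many values too small to change it one at a time:
# 2**-54 is a quarter of the spacing between doubles just above 1.0.
tiny = 2.0 ** -54
values = [1.0] + [tiny] * 1024

forward = naive_sum(values)              # every tiny term is rounded away
reordered = naive_sum(reversed(values))  # tiny terms accumulate first, then 1.0 is added
compensated = kahan_sum(values)          # same order as 'forward', with compensation

if __name__ == "__main__":
    print("forward    :", repr(forward))
    print("reordered  :", repr(reordered))
    print("compensated:", repr(compensated))
```

Both naive sums add the same numbers and still disagree, which is the kind of difference a parallel reduction can introduce relative to a sequential CPU reference. The compensated sum matches the exact result in the original order.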

Failing accuracy tests with pwtk because of GPU accuracy

test-blas2.exe -p "C:\src-bin\MTX\Bell_Garland\pwtk\pwtk.mtx"
AMD Accelerated Parallel Processing 0
Using device Hawaii
key: [control/control] -cl-kernel-arg-info -cl-std=CL1.2 -DWG_SIZE=256 hash = 174490369
kernel not found in cache: 174490369
Matrix: C:\src-bin\MTX\Bell_Garland\pwtk\pwtk.mtx [nRow: 217918] [nCol: 217918] [nNZ: 11852342]
[validateMemObject] Buffer size: 94,818,736 bytes. Required size: 93,075,392 bytes.
[validateMemObject] Buffer size: 47,409,368 bytes. Required size: 46,537,696 bytes.
[validateMemObject] Buffer size: 871,676 bytes. Required size: 871,676 bytes.
[==========] Running 4 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 2 tests from Blas2/0, where TypeParam = float
[ RUN      ] Blas2/0.csrmv_adaptive
key: [csrmv_adaptive/csrmv_adaptive] -cl-kernel-arg-info -cl-std=CL1.2 -DROWBITS=32 -DWGBITS=24 -DVALUE_TYPE=float -DWG_SIZE=256 -DBLOCKSIZE=1024 -DBLOCK_MULTIPLIER=3 -DROWS_FOR_VECTOR=1 -DINDEX_TYPE=uint hash = 4124998496
kernel not found in cache: 4124998496
C:\src\github\kvaragan\clSPARSE\src\tests\test-blas2.cpp(216): error: The difference between hY[i] and host_result[i] is 0.042902544140815735, which exceeds compare_val, where
hY[i] evaluates to 0.21477754414081573,
host_result[i] evaluates to 0.171875, and
compare_val evaluates to 0.021477754414081576.
[  FAILED  ] Blas2/0.csrmv_adaptive, where TypeParam = float (610 ms)
[ RUN      ] Blas2/0.csrmv_vector
key: [csrmv_general/csrmv_general] -cl-kernel-arg-info -cl-std=CL1.2 -DVALUE_TYPE=float -DSIZE_TYPE=ulong -DWG_SIZE=256 -DWAVE_SIZE=64 -DSUBWAVE_SIZE=32 -DINDEX_TYPE=uint hash = 65959904
kernel not found in cache: 65959904
C:\src\github\kvaragan\clSPARSE\src\tests\test-blas2.cpp(216): error: The difference between hY[i] and host_result[i] is 0.03910769522190094, which exceeds compare_val, where
hY[i] evaluates to -0.24223269522190094,
host_result[i] evaluates to -0.203125, and
compare_val evaluates to 0.024223269522190095.
[  FAILED  ] Blas2/0.csrmv_vector, where TypeParam = float (558 ms)
[----------] 2 tests from Blas2/0 (1171 ms total)

[----------] 2 tests from Blas2/1, where TypeParam = double
[ RUN      ] Blas2/1.csrmv_adaptive
key: [csrmv_adaptive/csrmv_adaptive] -cl-kernel-arg-info -cl-std=CL1.2 -DROWBITS=32 -DWGBITS=24 -DVALUE_TYPE=double -DWG_SIZE=256 -DBLOCKSIZE=1024 -DBLOCK_MULTIPLIER=3 -DROWS_FOR_VECTOR=1 -DINDEX_TYPE=uint -DDOUBLE hash = 2456098779
kernel not found in cache: 2456098779
[       OK ] Blas2/1.csrmv_adaptive (741 ms)
[ RUN      ] Blas2/1.csrmv_vector
key: [csrmv_general/csrmv_general] -cl-kernel-arg-info -cl-std=CL1.2 -DVALUE_TYPE=double -DSIZE_TYPE=ulong -DWG_SIZE=256 -DWAVE_SIZE=64 -DSUBWAVE_SIZE=32 -DINDEX_TYPE=uint -DDOUBLE hash = 1815015837
kernel not found in cache: 1815015837
[       OK ] Blas2/1.csrmv_vector (696 ms)
[----------] 2 tests from Blas2/1 (1439 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 2 test cases ran. (2612 ms total)
[  PASSED  ] 2 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] Blas2/0.csrmv_adaptive, where TypeParam = float
[  FAILED  ] Blas2/0.csrmv_vector, where TypeParam = float

 2 FAILED TESTS
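The failure messages above can be checked by hand. The Python sketch below redoes the arithmetic for the two failing float tests. The 0.1 relative factor is an assumption inferred from the logged numbers (compare_val is one tenth of |hY[i]| in both messages), not something taken from the test source, and the helper name `check_near` is hypothetical.

```python
import math

def check_near(hy, host_result, rel_factor=0.1):
    """Return (passed, difference, compare_val) for one vector element.

    rel_factor=0.1 is inferred from the log output, not read from test-blas2.cpp.
    """
    compare_val = rel_factor * abs(hy)
    difference = abs(hy - host_result)
    return difference <= compare_val, difference, compare_val

# Values copied from the two failure messages in the pwtk run.
adaptive = check_near(0.21477754414081573, 0.171875)
vector = check_near(-0.24223269522190094, -0.203125)

if __name__ == "__main__":
    for label, (passed, difference, compare_val) in [("csrmv_adaptive", adaptive),
                                                     ("csrmv_vector", vector)]:
        print(label, "passed" if passed else "FAILED",
              "difference =", difference, "compare_val =", compare_val)
```

In both cases the difference is roughly twice the allowed tolerance, so the elements fail the comparison even though the absolute error is only a few hundredths.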

Passing accuracy tests with pwtk because of GPU kernel using compensated summation

test-blas2.exe -p "C:\src-bin\MTX\Bell_Garland\pwtk\pwtk.mtx" -e
AMD Accelerated Parallel Processing 0
Using device Hawaii
key: [control/control] -cl-kernel-arg-info -cl-std=CL1.2 -DWG_SIZE=256 hash = 174490369
kernel not found in cache: 174490369
Matrix: C:\src-bin\MTX\Bell_Garland\pwtk\pwtk.mtx [nRow: 217918] [nCol: 217918] [nNZ: 11852342]
[validateMemObject] Buffer size: 94,818,736 bytes. Required size: 93,075,392 bytes.
[validateMemObject] Buffer size: 47,409,368 bytes. Required size: 46,537,696 bytes.
[validateMemObject] Buffer size: 871,676 bytes. Required size: 871,676 bytes.
[==========] Running 4 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 2 tests from Blas2/0, where TypeParam = float
[ RUN      ] Blas2/0.csrmv_adaptive
key: [csrmv_adaptive/csrmv_adaptive] -cl-kernel-arg-info -cl-std=CL1.2 -DROWBITS=32 -DWGBITS=24 -DVALUE_TYPE=float -DWG_SIZE=256 -DBLOCKSIZE=1024 -DBLOCK_MULTIPLIER=3 -DROWS_FOR_VECTOR=1 -DINDEX_TYPE=uint -DEXTENDED_PRECISION hash = 3734194111
kernel not found in cache: 3734194111
Float Min ulps: 0
Float Max ulps: 11
Float Total ulps: 232
Float Average ulps: 0.00106462 (Size: 217918)
[       OK ] Blas2/0.csrmv_adaptive (596 ms)
[ RUN      ] Blas2/0.csrmv_vector
key: [csrmv_general/csrmv_general] -cl-kernel-arg-info -cl-std=CL1.2 -DVALUE_TYPE=float -DSIZE_TYPE=ulong -DWG_SIZE=256 -DWAVE_SIZE=64 -DSUBWAVE_SIZE=32 -DINDEX_TYPE=uint -DEXTENDED_PRECISION hash = 4172172175
kernel not found in cache: 4172172175
Float Min ulps: 0
Float Max ulps: 8
Float Total ulps: 321
Float Average ulps: 0.00147303 (Size: 217918)
[       OK ] Blas2/0.csrmv_vector (521 ms)
[----------] 2 tests from Blas2/0 (1118 ms total)

[----------] 2 tests from Blas2/1, where TypeParam = double
[ RUN      ] Blas2/1.csrmv_adaptive
key: [csrmv_adaptive/csrmv_adaptive] -cl-kernel-arg-info -cl-std=CL1.2 -DROWBITS=32 -DWGBITS=24 -DVALUE_TYPE=double -DWG_SIZE=256 -DBLOCKSIZE=1024 -DBLOCK_MULTIPLIER=3 -DROWS_FOR_VECTOR=1 -DINDEX_TYPE=uint -DDOUBLE -DEXTENDED_PRECISION hash = 1699454612
kernel not found in cache: 1699454612
Double Min ulps: 0
Double Max ulps: 0
Double Total ulps: 0
Double Average ulps: 0 (Size: 217918)
[       OK ] Blas2/1.csrmv_adaptive (688 ms)
[ RUN      ] Blas2/1.csrmv_vector
key: [csrmv_general/csrmv_general] -cl-kernel-arg-info -cl-std=CL1.2 -DVALUE_TYPE=double -DSIZE_TYPE=ulong -DWG_SIZE=256 -DWAVE_SIZE=64 -DSUBWAVE_SIZE=32 -DINDEX_TYPE=uint -DDOUBLE -DEXTENDED_PRECISION hash = 1976306226
kernel not found in cache: 1976306226
Double Min ulps: 0
Double Max ulps: 0
Double Total ulps: 0
Double Average ulps: 0 (Size: 217918)
[       OK ] Blas2/1.csrmv_vector (619 ms)
[----------] 2 tests from Blas2/1 (1308 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 2 test cases ran. (2430 ms total)
[  PASSED  ] 4 tests.
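The ulps statistics printed in the extended run count how many representable floating point values lie between the device result and the reference. The Python sketch below shows one common way to compute that distance for single precision values and to summarize it. How the test programs themselves compute ulps is not described on this page, so treat the definition and the names `float32_key`, `ulp_distance` and `summarize` as assumptions.

```python
import struct

def float32_key(x):
    """Map a value, rounded to float32, to an integer that preserves float ordering."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    if bits & 0x80000000:            # sign bit set: negative values count downwards
        return -(bits & 0x7FFFFFFF)
    return bits

def ulp_distance(a, b):
    """Number of float32 steps between a and b (0 means bit-identical, or +0 vs -0)."""
    return abs(float32_key(a) - float32_key(b))

def summarize(device_values, reference_values):
    """Min, max, total and average ulps, mirroring the fields printed in the log."""
    ulps = [ulp_distance(d, r) for d, r in zip(device_values, reference_values)]
    total = sum(ulps)
    return {"min": min(ulps), "max": max(ulps), "total": total,
            "average": total / len(ulps), "size": len(ulps)}

if __name__ == "__main__":
    # 1.0 + 2**-23 is the next float32 after 1.0, so it is exactly one ulp away.
    print(summarize([1.0, 1.0 + 2.0 ** -23, -0.0], [1.0, 1.0, 0.0]))
    # The logged average is the total divided by the vector size.
    print("%.6g" % (232 / 217918))
```

Read this way, the float results in the extended run differ from the reference by at most 11 ulps for csrmv_adaptive and 8 ulps for csrmv_vector, and the double results match exactly.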
