solver: use bsrsm2 by default for cuda 12
bd4 committed Jan 3, 2023
1 parent 067feeb commit cff2152
Showing 1 changed file with 10 additions and 1 deletion.
11 changes: 10 additions & 1 deletion include/gt-solver/solver.h
@@ -8,10 +8,19 @@

 #ifdef GTENSOR_DEVICE_CUDA
 #if CUDART_VERSION >= 12000
-// new generic MPI; available since 11.3.1, but performance
+#ifdef GTENSOR_DEVICE_CUDA_CUSPARSE_GENERIC
+// New generic API; available since 11.3.1, but performance
 // is worse than old csrsm2 API in many cases pre-12.
 #include "gt-solver/backend/cuda-generic.h"
+#else
+// bsrsm2 API, which can work with csr format by setting
+// block size 1. In CUDA 12, appears to use less memory and
+// often be faster than generic API. Exists even in 8.0,
+// but not clear it has advantage over csrsm2 API for older
+// cuda versions where csrsm2 is still available.
+#include "gt-solver/backend/cuda-bsrsm2.h"
+#endif // GTENSOR_DEVICE_CUDA_CUSPARSE_GENERIC
 #else
 // legacy API, deprecated since 11.3.1 but still supported until 12
 #include "gt-solver/backend/cuda-csrsm2.h"
 #endif // CUDA_VERSION
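
The backend selection in this hunk can be restated as an ordinary function, which may make the three cases easier to check at a glance. This is only an illustrative sketch: `Backend` and `select_backend` are hypothetical names that do not exist in gtensor, and the boolean parameter stands in for whether `GTENSOR_DEVICE_CUDA_CUSPARSE_GENERIC` is defined. It assumes the usual `CUDART_VERSION` encoding of major * 1000 + minor * 10 (so 12000 is CUDA 12.0).

```cpp
// Hypothetical mirror of the preprocessor logic in solver.h (illustration
// only; these names are not part of gtensor).
enum class Backend
{
  csrsm2,  // legacy API, used for CUDA < 12
  bsrsm2,  // default for CUDA >= 12 after this commit
  generic  // opt-in for CUDA >= 12 via GTENSOR_DEVICE_CUDA_CUSPARSE_GENERIC
};

// cudart_version: value of CUDART_VERSION, e.g. 11080 for 11.8, 12000 for 12.0.
// generic_defined: true if GTENSOR_DEVICE_CUDA_CUSPARSE_GENERIC is defined.
constexpr Backend select_backend(int cudart_version, bool generic_defined)
{
  if (cudart_version >= 12000) {
    return generic_defined ? Backend::generic : Backend::bsrsm2;
  }
  // Below CUDA 12 the macro is never consulted; csrsm2 is always used.
  return Backend::csrsm2;
}

// Compile-time checks of the three branches.
static_assert(select_backend(11080, false) == Backend::csrsm2, "pre-12");
static_assert(select_backend(12000, false) == Backend::bsrsm2, "12 default");
static_assert(select_backend(12000, true) == Backend::generic, "12 opt-in");
```

Under this reading, a build against CUDA 12 would get the bsrsm2 backend unless the generic macro is defined before `gt-solver/solver.h` is included, for example with `-DGTENSOR_DEVICE_CUDA_CUSPARSE_GENERIC` on the compiler command line.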