Accelerating Sparse Matrix-Matrix Multiplication with GPU Tensor Cores

In this repository we provide the source code of our accelerated Sparse Matrix-Matrix multiplication (SpGEMM) implementation, which we desrcribe in "Accelerating Sparse Matrix-Matrix Multiplication with GPU Tensor Cores" [1].

How to build

Cmake and CUDA are required. Modification to the CMakeLists.txt files may be necessary, e.g. to change GPU architecture.

Instructions:

Download the source code into a folder e.g. tSparse-src.
Create a folder for building the executable in the same directory as tSparse-src, e.g tSparse-release.
Inside tSparse-release and call cmake:

cmake ../tSparse-src

Finally, call make:

make -j{number of CPU cores}

How to run

In order to test SPGEMM run "spmm" executable. spmm accepts 1 or 2 arguments. In case of 1 argument it performs matrix squaring (A*A). In case of 2 arguments it performs the matrix multiplication (A*B).

examples

./spmm A.mtx
./spmm A.mtx B.mtx

Troubleshooting

The first compilation after running cmake may give an error similar to : "error: undefined reference to '__cudaRegisterLinkedBinary_12_spmm_cpp1_ii_handle'". This error is related to the CUDA library that is used by CUDA dynamic parallelism. Running make a second time solves this issue.
The latest multiplication and counting kernels do not support Volta. The reason is that we use direct access to fragments (instead of through shared memory) for performance reasons.

Contact data

Orestis Zachariadis ([email protected])

References

[1] O. Zachariadis, N. Satpute, J. Gómez-Luna, and J. Olivares, “Accelerating sparse matrix–matrix multiplication with GPU Tensor Cores,” Computers & Electrical Engineering, vol. 88, p. 106848, Dec. 2020, doi: 10.1016/j.compeleceng.2020.106848.
Also in arXiv.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
external		external
src		src
CMakeLists.txt		CMakeLists.txt
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Accelerating Sparse Matrix-Matrix Multiplication with GPU Tensor Cores

How to build

How to run

examples

Troubleshooting

Contact data

References

About

Releases

Packages

Languages

License

oresths/tSparse

Folders and files

Latest commit

History

Repository files navigation

Accelerating Sparse Matrix-Matrix Multiplication with GPU Tensor Cores

How to build

How to run

examples

Troubleshooting

Contact data

References

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages