Compare I/O of sparse matrix libraries.
Some benchmarks intentionally include matrix construction time. These timings can be affected by the sort order of the entries in the Matrix Market file.
- fast_matrix_market
  - Matrix Market read/write
- PIGO
  - Matrix Market read
  - proprietary binary write
  - ASCII format write (like Matrix Market body only)
- GraphBLAS
  - Reads include matrix construction time
  - Matrix Market read/write using fast_matrix_market's GraphBLAS binding. This includes matrix construction time, which depends heavily on whether the values are already sorted.
- LAGraph
  - Reads include matrix construction time
  - Matrix Market read/write
- Eigen
  - Reads include matrix construction time
  - Matrix Market read/write (library native)
  - Matrix Market read/write using fast_matrix_market's Eigen binding.
- Polars
  - Parquet read/write
- Pandas
  - Parquet read/write
Libraries are fetched from their main branches on GitHub. To pin a version modify the appropriate file in cmake/.
CMake will pull in all dependencies.
The exception is GraphBLAS: its benchmark is skipped if GraphBLAS is not found. Installing GraphBLAS is up to you; brew install suite-sparse works on macOS.
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
This builds everything into the build subdirectory.
In a virtual environment:
pip install -r requirements.txt
The benchmarks look for any *.mtx Matrix Market files in the current directory and benchmark against these. For benchmarks of non-Matrix Market formats, the data structure is first populated from the Matrix Market file and then written to the tested format.
Use any method you wish to create the .mtx files.
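To check which files a run would pick up, the discovery step can be approximated in a few lines of Python. This is only a sketch: the function name `find_matrix_market_files` is hypothetical, and the name-sorted ordering is an assumption, not necessarily the order the benchmarks use.

```python
from pathlib import Path


def find_matrix_market_files(directory="."):
    """Return (name, size_in_bytes) for every *.mtx file in `directory`.

    Sorting by name is an assumption made for reproducibility; the real
    benchmarks may enumerate files in a different order.
    """
    paths = sorted(Path(directory).glob("*.mtx"))
    return [(p.name, p.stat().st_size) for p in paths if p.is_file()]


if __name__ == "__main__":
    for name, size in find_matrix_market_files():
        print(f"{name}: {size / 2**20:.1f} MiB")
```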
Generate randomized matrix market files of a specified size (in megabytes):
build/generate_matrix_market 1024
creates a file named 1024MiB.mtx in the current directory that is 1 GiB in size.
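For small test inputs, a similar file can be produced without the C++ tool. The sketch below is not how generate_matrix_market works internally; it is a minimal Python stand-in that assumes a square real coordinate matrix, and the parameters `dim` and `seed` are hypothetical. It keeps all lines in memory, so it only suits small sizes.

```python
import random


def generate_matrix_market(path, target_bytes, dim=1_000_000, seed=0):
    """Write a random coordinate Matrix Market file whose body is at least
    `target_bytes` long. Returns the number of entries written.

    Assumptions: real-valued, general symmetry, 1-based indices drawn
    uniformly from [1, dim]. Duplicate (row, col) pairs are possible but
    unlikely when dim is large relative to the entry count.
    """
    rng = random.Random(seed)
    body = []
    size = 0
    while size < target_bytes:
        line = f"{rng.randint(1, dim)} {rng.randint(1, dim)} {rng.random():.6f}\n"
        body.append(line)
        size += len(line)  # ASCII only, so characters == bytes

    with open(path, "w", newline="\n") as f:
        f.write("%%MatrixMarket matrix coordinate real general\n")
        f.write(f"{dim} {dim} {len(body)}\n")  # size line needs the final count
        f.writelines(body)
    return len(body)
```

Because the size line must state the entry count, the body is generated before anything is written.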
Some benchmarks, like GraphBLAS, perform much better if the indices are sorted. Use sort_matrix_market to create a sorted copy of a .mtx file:
build/sort_matrix_market 1024MiB.mtx
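The effect of sorting can be illustrated on a small file in Python. This is a sketch of the idea (order entries by row, then by column), not the implementation of sort_matrix_market. It assumes a coordinate-format file that fits in memory, and the function signature is hypothetical.

```python
def sort_matrix_market(src, dst):
    """Copy a coordinate Matrix Market file with entries sorted by
    (row, column). The banner, comments, and size line are kept as-is."""
    with open(src) as f:
        lines = f.read().splitlines()

    # The header is the banner plus any '%' comment lines, followed by
    # the size line ("rows cols nnz").
    i = 0
    while lines[i].startswith("%"):
        i += 1
    head = lines[: i + 1]
    body = [ln for ln in lines[i + 1:] if ln.strip()]

    # Compare indices numerically, not as strings.
    body.sort(key=lambda ln: tuple(int(tok) for tok in ln.split()[:2]))

    with open(dst, "w", newline="\n") as f:
        f.write("\n".join(head + body) + "\n")
```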
Run all benchmarks:
build/fmm
build/PIGO
build/graphblas_fmm
Or use Google Benchmark's filter option to run only some benchmarks:
build/fmm '--benchmark_filter=.*read.*'
build/PIGO '--benchmark_filter=.*read.*'
build/graphblas_fmm '--benchmark_filter=.*read.*'
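To preview which benchmarks a filter would select, the pattern can be tried against the benchmark names seen in the results. The sketch below uses Python's `re` module as an approximation: Google Benchmark uses its own regex engine, so behaviour may differ for more exotic patterns. The `select` helper is hypothetical.

```python
import re

# Benchmark names as they appear in the bench_fmm output.
NAMES = [
    "op:read/impl:FMM/format:MatrixMarket/problem:0/p:8/iterations:1/real_time",
    "op:write/impl:FMM/format:MatrixMarket/problem:0/p:8/iterations:1/real_time",
    "op:write/impl:FMM/format:MatrixMarket(pattern)/problem:0/p:8/iterations:1/real_time",
]


def select(names, pattern):
    """Return the names in which `pattern` matches anywhere."""
    rx = re.compile(pattern)
    return [n for n in names if rx.search(n)]


if __name__ == "__main__":
    for name in select(NAMES, ".*read.*"):
        print(name)
```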
The benchmarks report the end-to-end time, as that is what the end user ultimately cares about.
This includes overheads and any data structure construction time. For example, the GraphBLAS benchmark may include the time for GrB_Matrix_build in addition to the I/O time. This is intentional.
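What "end-to-end" means can be shown with a toy reader in Python. This is a sketch only: the parser and the `build_rows` step are hypothetical stand-ins for a real library's read and matrix construction calls, and the timer deliberately covers both.

```python
import time


def read_coordinate_body(path):
    """Parse 'row col value' entries, skipping comments and the size line."""
    entries = []
    with open(path) as f:
        lines = (ln for ln in f if ln.strip() and not ln.startswith("%"))
        next(lines)  # size line: rows cols nnz
        for ln in lines:
            r, c, v = ln.split()
            entries.append((int(r), int(c), float(v)))
    return entries


def build_rows(entries):
    """Stand-in for matrix construction: sort entries and group them by row."""
    rows = {}
    for r, c, v in sorted(entries):
        rows.setdefault(r, []).append((c, v))
    return rows


def timed_end_to_end(path):
    """Time parsing and construction together, as one measurement."""
    start = time.perf_counter()
    matrix = build_rows(read_coordinate_body(path))
    elapsed = time.perf_counter() - start
    return matrix, elapsed
```

The sort inside `build_rows` is part of the measured time, which is why already-sorted input can look faster in benchmarks that construct a matrix.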
In addition to the runtime in seconds, each benchmark divides the file size by this time and reports an effective read or write speed in bytes/second. This normalized value is very informative:
- Directly comparable to other benchmarked files, which are almost certainly of different sizes.
- Shows at a glance whether performance varies by file size or not.
- Directly comparable to system I/O capabilities.
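As a worked example of this normalization, a 1 GiB file read in 0.494 s comes out at about 2.02 GiB/s. The sketch below reproduces that arithmetic. The helper names are hypothetical, and treating the G and M prefixes as binary (2^30 and 2^20) is an inference from the reported numbers, not something the benchmarks document here.

```python
def effective_speed(file_bytes, seconds):
    """Bytes per second: file size divided by elapsed wall-clock time."""
    return file_bytes / seconds


def human(bytes_per_second):
    """Format with binary prefixes (assumed: G = 2**30, M = 2**20, k = 2**10)."""
    for unit, scale in (("G", 2**30), ("M", 2**20), ("k", 2**10)):
        if bytes_per_second >= scale:
            return f"{bytes_per_second / scale:.3f}{unit}/s"
    return f"{bytes_per_second:.3f}/s"


if __name__ == "__main__":
    # 1 GiB file, 0.494 s measured read time (rounded, as printed in the results)
    print(human(effective_speed(2**30, 0.494)))
```

The printed time is rounded to three digits, so a value recomputed from it will differ slightly from the reported bytes_per_second.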
M1 MacBook Pro with 16 GiB RAM, 6 performance and 2 efficiency cores (ARM).
Input data is a random 1 GiB file, generated by generate_matrix_market as above, and the same file sorted (by row then column index) by sort_matrix_market as above.
bench_fmm:
----------------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
----------------------------------------------------------------------------------------------------------------------------------------------
op:read/impl:FMM/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 0.494 s 0.214 s 1 bytes_per_second=2.02622G/s problem_name=1024MiB.mtx
op:read/impl:FMM/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 0.491 s 0.201 s 1 bytes_per_second=2.03837G/s problem_name=1024MiB.sorted.mtx
op:write/impl:FMM/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 1.26 s 0.227 s 1 bytes_per_second=876.407M/s problem_name=1024MiB.mtx
op:write/impl:FMM/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 1.25 s 0.231 s 1 bytes_per_second=877.677M/s problem_name=1024MiB.sorted.mtx
op:write/impl:FMM/format:MatrixMarket(pattern)/problem:0/p:8/iterations:1/real_time 0.815 s 0.187 s 1 bytes_per_second=804.211M/s problem_name=1024MiB.mtx
op:write/impl:FMM/format:MatrixMarket(pattern)/problem:1/p:8/iterations:1/real_time 0.824 s 0.185 s 1 bytes_per_second=795.726M/s problem_name=1024MiB.sorted.mtx
10GiB file (note machine has 16GiB RAM):
----------------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
----------------------------------------------------------------------------------------------------------------------------------------------
op:read/impl:FMM/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 6.72 s 3.23 s 1 bytes_per_second=1.48919G/s problem_name=10240MiB.mtx
op:write/impl:FMM/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 14.7 s 2.77 s 1 bytes_per_second=746.948M/s problem_name=10240MiB.mtx
op:write/impl:FMM/format:MatrixMarket(pattern)/problem:0/p:8/iterations:1/real_time 10.0 s 1.94 s 1 bytes_per_second=653.106M/s problem_name=10240MiB.mtx
bench_pigo:
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
----------------------------------------------------------------------------------------------------------------------------------------------------------------
op:read/impl:PIGO/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 0.351 s 0.314 s 1 bytes_per_second=2.84972G/s problem_name=1024MiB.mtx
op:read/impl:PIGO/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 0.391 s 0.299 s 1 bytes_per_second=2.55914G/s problem_name=1024MiB.sorted.mtx
op:write/impl:PIGO/format:binary/problem:0/p:8/iterations:1/real_time 0.922 s 0.356 s 1 bytes_per_second=1066.59M/s problem_name=1024MiB.mtx
op:write/impl:PIGO/format:binary/problem:1/p:8/iterations:1/real_time 0.718 s 0.306 s 1 bytes_per_second=1.33625G/s problem_name=1024MiB.sorted.mtx
op:write/impl:PIGO/format:ASCII(MatrixMarket_body_only)/problem:0/p:8/iterations:1/real_time 16.4 s 14.8 s 1 bytes_per_second=62.5738M/s problem_name=1024MiB.mtx
op:write/impl:PIGO/format:ASCII(MatrixMarket_body_only)/problem:1/p:8/iterations:1/real_time 16.4 s 14.8 s 1 bytes_per_second=62.4265M/s problem_name=1024MiB.sorted.mtx
op:write/impl:PIGO/format:ASCII(MatrixMarket_body_only(pattern))/problem:0/p:8/iterations:1/real_time 0.604 s 0.316 s 1 bytes_per_second=1085.65M/s problem_name=1024MiB.mtx
op:write/impl:PIGO/format:ASCII(MatrixMarket_body_only(pattern))/problem:1/p:8/iterations:1/real_time 0.587 s 0.337 s 1 bytes_per_second=1116.13M/s problem_name=1024MiB.sorted.mtx
10GiB file (note machine has 16GiB RAM):
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
----------------------------------------------------------------------------------------------------------------------------------------------------------------
op:read/impl:PIGO/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 21.5 s 4.62 s 1 bytes_per_second=475.754M/s problem_name=10240MiB.mtx
op:write/impl:PIGO/format:binary/problem:0/p:8/iterations:1/real_time 57.5 s 10.7 s 1 bytes_per_second=170.985M/s problem_name=10240MiB.mtx
op:write/impl:PIGO/format:ASCII(MatrixMarket_body_only)/problem:0/p:8/iterations:1/real_time 206 s 141 s 1 bytes_per_second=49.6292M/s problem_name=10240MiB.mtx
op:write/impl:PIGO/format:ASCII(MatrixMarket_body_only(pattern))/problem:0/p:8/iterations:1/real_time 44.0 s 8.20 s 1 bytes_per_second=148.965M/s problem_name=10240MiB.mtx
Reads include matrix construction time
bench_graphblas_fmm:
-----------------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
-----------------------------------------------------------------------------------------------------------------------------------------------
op:read/impl:GraphBLAS_FMM/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 5.40 s 5.14 s 1 bytes_per_second=189.543M/s problem_name=1024MiB.mtx
op:read/impl:GraphBLAS_FMM/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 0.925 s 0.676 s 1 bytes_per_second=1106.64M/s problem_name=1024MiB.sorted.mtx
op:write/impl:GraphBLAS_FMM/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 1.27 s 0.200 s 1 bytes_per_second=864.388M/s problem_name=1024MiB.mtx
op:write/impl:GraphBLAS_FMM/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 1.16 s 0.206 s 1 bytes_per_second=951.295M/s problem_name=1024MiB.sorted.mtx
bench_lagraph:
-----------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
-----------------------------------------------------------------------------------------------------------------------------------------
op:read/impl:LAGraph/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 18.4 s 18.1 s 1 bytes_per_second=55.5359M/s problem_name=1024MiB.mtx
op:read/impl:LAGraph/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 12.0 s 12.0 s 1 bytes_per_second=85.0187M/s problem_name=1024MiB.sorted.mtx
op:write/impl:LAGraph/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 26.1 s 25.5 s 1 bytes_per_second=37.6224M/s problem_name=1024MiB.mtx
op:write/impl:LAGraph/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 26.3 s 25.4 s 1 bytes_per_second=37.3481M/s problem_name=1024MiB.sorted.mtx
Reads include matrix construction time
bench_eigen:
---------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------------------------------
op:read/impl:Eigen/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 14.0 s 13.8 s 1 bytes_per_second=73.1673M/s problem_name=1024MiB.mtx
op:read/impl:Eigen/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 11.7 s 11.6 s 1 bytes_per_second=87.3179M/s problem_name=1024MiB.sorted.mtx
op:write/impl:Eigen/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 24.6 s 24.2 s 1 bytes_per_second=66.6896M/s problem_name=1024MiB.mtx
op:write/impl:Eigen/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 24.4 s 24.2 s 1 bytes_per_second=67.1702M/s problem_name=1024MiB.sorted.mtx
bench_eigen_fmm:
-------------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
-------------------------------------------------------------------------------------------------------------------------------------------
op:read/impl:Eigen_FMM/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 2.58 s 2.37 s 1 bytes_per_second=396.515M/s problem_name=1024MiB.mtx
op:read/impl:Eigen_FMM/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 1.83 s 1.61 s 1 bytes_per_second=558.58M/s problem_name=1024MiB.sorted.mtx
op:write/impl:Eigen_FMM/format:MatrixMarket/problem:0/p:8/iterations:1/real_time 1.36 s 1.18 s 1 bytes_per_second=808.776M/s problem_name=1024MiB.mtx
op:write/impl:Eigen_FMM/format:MatrixMarket/problem:1/p:8/iterations:1/real_time 1.35 s 1.19 s 1 bytes_per_second=816.8M/s problem_name=1024MiB.sorted.mtx
python bench_polars.py
-----------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
-----------------------------------------------------------------------------------------------------------------------
op:read/impl:Polars/format:Parquet/0/iterations:1/real_time 0.272 s 0.003 s 1 1024MiB.mtx=0 MM_equivalent_bytes_per_second=3.94835G/s bytes_per_second=1.97559G/s
op:read/impl:Polars/format:Parquet/1/iterations:1/real_time 0.208 s 0.002 s 1 1024MiB.sorted.mtx=1 MM_equivalent_bytes_per_second=5.16654G/s bytes_per_second=1.99667G/s
op:write/impl:Polars/format:Parquet/0/iterations:1/real_time 2.75 s 2.73 s 1 1024MiB.mtx=0 MM_equivalent_bytes_per_second=390.834M/s bytes_per_second=200.25M/s
op:write/impl:Polars/format:Parquet/1/iterations:1/real_time 2.70 s 2.69 s 1 1024MiB.sorted.mtx=1 MM_equivalent_bytes_per_second=398.328M/s bytes_per_second=157.633M/s
10GiB file (note machine has 16GiB RAM):
-----------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
-----------------------------------------------------------------------------------------------------------------------
op:read/impl:Polars/format:Parquet/0/iterations:1/real_time 27.7 s 0.053 s 1 10240MiB.mtx=0 MM_equivalent_bytes_per_second=387.431M/s bytes_per_second=198.52M/s
op:write/impl:Polars/format:Parquet/0/iterations:1/real_time 37.3 s 30.0 s 1 10240MiB.mtx=0 MM_equivalent_bytes_per_second=287.603M/s bytes_per_second=147.368M/s