Fast and full-featured Matrix Market I/O.
C++:
Ready-to-use bindings for
GraphBLAS,
Eigen,
CXSparse,
Blaze,
Armadillo,
and std::vector
-like types.
Easy to integrate with any datastructure.
Python:
Python bindings. Included as scipy.io.mmread
and mmwrite
in SciPy 1.12.
R:
Third party bindings for R in fastMatMR.
Matrix Market is a simple, human-readable, and widely used sparse matrix file format that looks like this:
%%MatrixMarket matrix coordinate real general
% 3-by-3 identity matrix
3 3 3
1 1 1
2 2 1
3 3 1
Most sparse matrix libraries include Matrix Market read and write routines.
However, these are typically written using scanf
, IOStreams, or equivalent in their language. These routines work,
but are too slow to saturate even a low-end mechanical hard drive, nevermind today's fast storage devices.
This means working with large multi-gigabyte datasets is far more painful than it needs to be.
Additionally, many loaders only support a subset of the Matrix Market format. Typically only coordinate matrices and often just a subset of that. This means loading data from external sources like the SuiteSparse Matrix Collection becomes hit or miss as some valid matrices cannot be loaded.
Use this library to fix these shortcomings.
fast_matrix_market
takes the fastest available sequential methods and parallelizes them.
Reach >1GB/s read and write speeds on a laptop (about 25x improvement over IOStreams).
Note: IOStreams benchmark is sequential because IOStreams get slower with additional parallelism due to internal locking on the locale.
Loaders using IOStreams or fscanf
are both slow and do not parallelize.
See parse_bench for more.
The majority of fast_matrix_market
's sequential improvement comes from using C++17's <charconv>
.
If floating-point versions are not available then fall back on fast_float
and Dragonbox. These methods are then parallelized by chunking the input stream and parsing each chunk in parallel.
Use the run_benchmarks.sh
script to see for yourself on your own system. The script builds, runs, and saves benchmark data.
Then simply run all the cells in the benchmark_plots/plot.ipynb Jupyter notebook.
-
coordinate
andarray
, each readable into both sparse and dense structures. -
All
field
types supported:integer
,real
,complex
,pattern
.-
Support all C++ types.
float
,double
,long double
,std::complex<>
, integer types,bool
.- Arbitrary types.
std::string
comes bundled. See implementation, example usage - C++23 fixed width floating point types like
std::float32_t
.
-
Automatic
std::complex
up-cast. For example,real
files can be read intostd::complex<double>
arrays. -
Read and write
pattern
files. Read just the indices or supply a default value.
-
-
Both
matrix
andvector
files. Most loaders crash onvector
files.- A vector data structure (sparse doublet or dense) will accept either a
vector
file, an M-by-1matrix
file or a 1-by-Nmatrix
file. - A matrix data structure will read a
vector
file as if it was a M-by-1 column matrix.
- A vector data structure (sparse doublet or dense) will accept either a
-
Ability to read just the header (useful for metadata collection).
-
Read and write Matrix Market header comments.
-
Read and write all symmetries:
general
,symmetric
,skew-symmetric
,hermitian
. -
Optional (on by default) automatic symmetry generalization. If your code cannot make use of the symmetry but the file specifies one, the loader can emit the symmetric entries for you.
fast_matrix_market
provides both ready-to-use methods for common data structures and building blocks for your own. See examples/ for complete code.
In general, the read_matrix_market_*
methods accept a std::istream
and a datastructure to read into. This datastructure can be sparse or dense, and will accept any Matrix Market file.
The write_matrix_market_*
methods accept a std::ostream
and a datastructure. If the datastructure is sparse then a coordinate
Matrix Market file is written, else an array
file is written.
The methods also accept an optional header
argument that can be used to read and write file metadata, such as the comment or whether the matrix is a pattern
.
Important: Open output file streams in binary mode. Text mode on Windows will naturally emit files with CRLF line endings. FMM can read such files on any platform, but that is not always true of other MatrixMarket loaders.
Matrix composed of row and column index vectors and a value vector. Any vector class that can be resized and iterated like std::vector
will work.
#include <fast_matrix_market/fast_matrix_market.hpp>
struct triplet_matrix {
int64_t nrows = 0, ncols = 0;
std::vector<int64_t> rows, cols;
std::vector<double> vals; // or int64_t, float, std::complex<double>, etc.
} mat;
fast_matrix_market::read_matrix_market_triplet(
input_stream,
mat.nrows, mat.ncols,
mat.rows, mat.cols, mat.vals);
Doublet sparse vectors, composed of index and value vectors, are supported in a similar way by read_matrix_market_doublet()
.
CSC and CSR matrices composed of indptr
, indices
, and values
arrays can be written directly with write_matrix_market_csc()
.
Any vector class that can be resized and iterated like std::vector
will work.
Be mindful of whether your code expects row or column major ordering.
#include <fast_matrix_market/fast_matrix_market.hpp>
struct array_matrix {
int64_t nrows = 0, ncols = 0;
std::vector<double> vals; // or int64_t, float, std::complex<double>, etc.
} mat;
fast_matrix_market::read_matrix_market_array(
input_stream,
mat.nrows, mat.ncols,
mat.vals,
fast_matrix_market::row_major);
1D dense vectors supported by the same method.
GrB_Matrix
and GrB_Vector
s are supported, with zero-copy where possible. See GraphBLAS README.
#include <fast_matrix_market/app/GraphBLAS.hpp>
GrB_Matrix A;
fast_matrix_market::read_matrix_market_graphblas(input_stream, &A);
Sparse and dense matrices and vectors are supported. See Eigen README.
#include <fast_matrix_market/app/Eigen.hpp>
Eigen::SparseMatrix<double> mat;
fast_matrix_market::read_matrix_market_eigen(input_stream, mat);
cs_xx
structures (in both COO and CSC modes) are supported. See CXSparse README.
#include <fast_matrix_market/app/CXSparse.hpp>
cs_dl *A;
fast_matrix_market::read_matrix_market_cxsparse(input_stream, &A, cs_dl_spalloc);
Blaze sparse and dense matrices and vectors are supported. See Blaze README.
#include <fast_matrix_market/app/Blaze.hpp>
blaze::CompressedMatrix<double> A;
fast_matrix_market::read_matrix_market_blaze(input_stream, A);
Armadillo sparse and dense matrices are supported. See Armadillo README.
#include <fast_matrix_market/app/Armadillo.hpp>
arma::SpMat<double> A;
fast_matrix_market::read_matrix_market_arma(input_stream, A);
Use the provided methods to read or write the header.
Next read or write the body. You'll mostly just need to provide parse_handler
(reads) and formatter
(writes) classes to read and write from/to your datastructure, respectively. The class you need is likely already in the library, though subtle differences between datastructures mean each one tends to need some customization.
Follow the example of the triplet and array implementations in include/fast_matrix_market/app/.
The fast_matrix_market
write mechanism can write procedurally generated data as well as materialized datastructures.
See generator README.
For example, write a 10-by-10 identity matrix to output_stream
:
#include <fast_matrix_market/app/generator.hpp>
fast_matrix_market::write_matrix_market_generated_triplet<int64_t, double>(
output_stream, {10, 10}, 10,
[](auto coo_index, auto& row, auto& col, auto& value) {
row = coo_index;
col = coo_index;
value = 1;
});
fast_matrix_market
is written in C++17. Parallelism uses C++11 threads. Header-only if optional dependencies are disabled.
include(FetchContent)
FetchContent_Declare(
fast_matrix_market
GIT_REPOSITORY https://github.com/alugowski/fast_matrix_market
GIT_TAG main
GIT_SHALLOW TRUE
)
FetchContent_MakeAvailable(fast_matrix_market)
target_link_libraries(YOUR_TARGET fast_matrix_market::fast_matrix_market)
Alternatively copy or checkout the repo into your project and:
add_subdirectory(fast_matrix_market)
See examples/ for what parts of the repo are needed.
You may also copy include/fast_matrix_market
into your project's include
directory.
See also dependencies.