perf: Tile 8x8 covariance matrix multiplication
Currently, we multiply an 8x8 covariance matrix with an 8x8 transport matrix. Eigen fails to optimize this product properly, because it calls a generalized GEMM method rather than an optimized small-matrix method. To resolve this, we change the code to use a tiled multiplication, which splits each matrix into 4x4 sub-matrices that are multiplied and summed to produce the same result. This has two advantages:

1. It allows Eigen to use its hand-rolled, optimized 4x4 matrix multiplication methods.
2. It allows us to exploit matrix identities to reduce the number of floating point operations.
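The tiling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this commit: it uses plain `std::array` storage instead of Eigen, and the names `Mat4`, `Mat8`, `tile`, `mul_add4`, `tiled_mul8` and `naive_mul8` are hypothetical. In Eigen, the tiles would presumably be fixed-size sub-blocks such as `block<4, 4>(i, j)` rather than copies. The sketch covers only the plain block product and does not attempt the flop-saving matrix identities mentioned in the message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size matrix types, stored row-major.
using Mat4 = std::array<std::array<double, 4>, 4>;
using Mat8 = std::array<std::array<double, 8>, 8>;

// Copy the 4x4 tile whose top-left corner is (r0, c0) out of an 8x8 matrix.
Mat4 tile(const Mat8& m, std::size_t r0, std::size_t c0) {
  assert(r0 + 4 <= 8 && c0 + 4 <= 8);
  Mat4 t{};
  for (std::size_t i = 0; i < 4; ++i)
    for (std::size_t j = 0; j < 4; ++j)
      t[i][j] = m[r0 + i][c0 + j];
  return t;
}

// acc += a * b for 4x4 tiles. This stands in for an optimized
// fixed-size 4x4 product kernel.
void mul_add4(const Mat4& a, const Mat4& b, Mat4& acc) {
  for (std::size_t i = 0; i < 4; ++i)
    for (std::size_t j = 0; j < 4; ++j)
      for (std::size_t k = 0; k < 4; ++k)
        acc[i][j] += a[i][k] * b[k][j];
}

// Tiled 8x8 product. With A and B split into 2x2 grids of 4x4 tiles,
// each output tile is C_ij = A_i0 * B_0j + A_i1 * B_1j.
Mat8 tiled_mul8(const Mat8& a, const Mat8& b) {
  Mat8 c{};
  for (std::size_t bi = 0; bi < 8; bi += 4) {
    for (std::size_t bj = 0; bj < 8; bj += 4) {
      Mat4 acc{};  // zero-initialized accumulator for one output tile
      for (std::size_t bk = 0; bk < 8; bk += 4)
        mul_add4(tile(a, bi, bk), tile(b, bk, bj), acc);
      // Write the finished tile back into the 8x8 result.
      for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
          c[bi + i][bj + j] = acc[i][j];
    }
  }
  return c;
}

// Reference triple loop, used only to check the tiled version.
Mat8 naive_mul8(const Mat8& a, const Mat8& b) {
  Mat8 c{};
  for (std::size_t i = 0; i < 8; ++i)
    for (std::size_t j = 0; j < 8; ++j)
      for (std::size_t k = 0; k < 8; ++k)
        c[i][j] += a[i][k] * b[k][j];
  return c;
}
```

The full product takes eight 4x4 multiplications, two per output tile. Any savings from matrix identities would come from skipping or reusing some of those tile products, which this sketch does not do.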