perf: Tile 8×8 covariance matrix multiplication #1181
Conversation
Force-pushed from 2097897 to b290c25
Codecov Report
Attention: Patch coverage is

```
@@           Coverage Diff            @@
##            main    #1181      +/-  ##
=========================================
- Coverage   48.69%   48.67%   -0.03%
=========================================
  Files         493      493
  Lines       28992    29004      +12
  Branches    13804    13816      +12
=========================================
  Hits        14117    14117
- Misses       4946     4947       +1
- Partials     9929     9940      +11
```

☔ View full report in Codecov by Sentry.
This commit optimizes some of the Eigen usage in the covariance engine, specifically in the critical path for the propagation examples. The first optimization is a tiled matrix multiplication method, which takes 2i×2j matrices and performs four i×j multiplications instead, which Eigen can optimize far more easily. Secondly, we reduce the number of floating point operations performed by working with smaller submatrices wherever possible. On my machine, the propagation example runs at 53.555595 ms/event before this patch and at 43.750143 ms/event after it. This performance gain is independent of the performance gain of acts-project#1181.
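The tiling described above can be sketched in plain C++ without Eigen. This is an illustrative reimplementation, not the actual ACTS code: the names `Mat8`, `mac4x4`, and `tiledMultiply` are invented for this sketch, which splits an 8×8 product into four 4×4 output tiles, each a sum of two 4×4 products.

```cpp
#include <array>
#include <cstddef>

// Hypothetical sketch of the tiled multiply: split two 8x8 row-major
// matrices into 4x4 tiles, so C_(ti,tj) = sum_tk A_(ti,tk) * B_(tk,tj).
using Mat8 = std::array<double, 64>;

// Multiply-accumulate one 4x4 tile product: C tile += A tile * B tile.
// (ar, ac), (br, bc), (cr, cc) are the top-left corners of the tiles.
static void mac4x4(const Mat8& A, std::size_t ar, std::size_t ac,
                   const Mat8& B, std::size_t br, std::size_t bc,
                   Mat8& C, std::size_t cr, std::size_t cc) {
  for (std::size_t i = 0; i < 4; ++i)
    for (std::size_t j = 0; j < 4; ++j) {
      double acc = 0.0;
      for (std::size_t k = 0; k < 4; ++k)
        acc += A[(ar + i) * 8 + (ac + k)] * B[(br + k) * 8 + (bc + j)];
      C[(cr + i) * 8 + (cc + j)] += acc;
    }
}

// C = A * B via four 4x4 output tiles, eight 4x4 tile products in total.
Mat8 tiledMultiply(const Mat8& A, const Mat8& B) {
  Mat8 C{};  // zero-initialized
  for (std::size_t ti = 0; ti < 2; ++ti)      // tile row of C
    for (std::size_t tj = 0; tj < 2; ++tj)    // tile column of C
      for (std::size_t tk = 0; tk < 2; ++tk)  // inner contraction index
        mac4x4(A, 4 * ti, 4 * tk, B, 4 * tk, 4 * tj, C, 4 * ti, 4 * tj);
  return C;
}
```

In the real code the same decomposition would be expressed with Eigen's fixed-size block views, which is what lets Eigen dispatch to its small-matrix kernels instead of the generic GEMM path.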
Force-pushed from b290c25 to a19f558
With both this and #1183, my performance for the Eigen stepper becomes 36.595292 ms/event on the generic propagation example, while the performance for the ATLAS stepper is 29.353479 ms/event. Getting closer!
Force-pushed from 8732133 to 56bc920
Honestly, miffed as to why this fails.
Force-pushed from 56bc920 to 0309ea8
Okay, I really think this is just harmless numerical errors causing hash changes.
Force-pushed from 0309ea8 to 61fb806
Force-pushed from 61fb806 to 2309f35
Force-pushed from 2309f35 to d12cf00
Force-pushed from 47ebcd7 to ffc6256
Force-pushed from ffc6256 to 51e2ff1
Force-pushed from 51e2ff1 to ee2fed6
The physmon coverage and configuration are much more robust than they were a year or even half a year ago. I don't see any significant changes other than the ROOT file hashes; everything is green. Should we just merge this at this stage?
Looks like the physmon diff is gone.
physmon is passing; results here: #1181 (comment). The hashes are changing, which means it does at least something 😄 Should we put this in, @paulgessinger?
Did a quick performance measurement with https://github.com/andiwand/cern-scripts/blob/main/tmp/full_chain_perf.py

- Fatras on average
- CKF on average
All green, let's go!
Just to try if this also returns clean outputs, I'm downgrading the `if` added in #1181 to an `assert`. Let's see what the CI says here.
Currently, we are multiplying an 8x8 covariance matrix with an 8x8 transport matrix, and Eigen fails to optimize this properly because it calls a generalized GEMM method rather than an optimized small-matrix method. To resolve this, we change the code to use a tiled multiplication method which splits the matrices into 4x4 sub-matrices that can be multiplied and added to achieve the same result. This has two advantages:

1. It allows Eigen to use its hand-rolled, optimized 4x4 matrix multiplication methods.
2. It allows us to exploit matrix identities to reduce the number of floating point operations.

Co-authored-by: Andreas Stefl <[email protected]>
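The second advantage mentioned above can be illustrated with a small self-contained sketch: once the product is expressed over 4x4 tiles, any tile that is known to be zero lets us skip whole 4x4 tile products. The zero pattern and the names `Mat8` and `tiledMultiplySkipZeros` below are assumptions for illustration only, not a claim about the actual structure of the ACTS transport matrix.

```cpp
#include <array>
#include <cstddef>

using Mat8 = std::array<double, 64>;  // 8x8 row-major

// C = A * B over 4x4 tiles, skipping every tile product that touches a
// tile of B flagged as all-zero. zeroTile[r][c] marks B's tile (r, c).
// Each skipped product removes 64 multiplies and 64 adds.
Mat8 tiledMultiplySkipZeros(const Mat8& A, const Mat8& B,
                            const bool (&zeroTile)[2][2]) {
  Mat8 C{};  // zero-initialized
  for (std::size_t ti = 0; ti < 2; ++ti)
    for (std::size_t tj = 0; tj < 2; ++tj)
      for (std::size_t tk = 0; tk < 2; ++tk) {
        if (zeroTile[tk][tj]) continue;  // A_(ti,tk) * 0 contributes nothing
        for (std::size_t i = 0; i < 4; ++i)
          for (std::size_t j = 0; j < 4; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < 4; ++k)
              acc += A[(4 * ti + i) * 8 + (4 * tk + k)] *
                     B[(4 * tk + k) * 8 + (4 * tj + j)];
            C[(4 * ti + i) * 8 + (4 * tj + j)] += acc;
          }
      }
  return C;
}
```

If, say, one off-diagonal tile of the transport matrix were structurally zero, two of the eight tile products would vanish, a 25% reduction in multiply work for that product before Eigen's kernel-level gains are even counted.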