Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel solving tridiagonal matrix #1066

Open
antobazir opened this issue Oct 18, 2024 · 2 comments
Open

Parallel solving tridiagonal matrix #1066

antobazir opened this issue Oct 18, 2024 · 2 comments

Comments

@antobazir
Copy link

Hello,

I'm working on a diffusion reaction problem for modelling and I would ideally like to compute diffusion over a 300x300x300 grid.
Reading on the subject oriented me towards locally one-dimension method which means I have to solve 3 tridiagonal systems back-to-back.

I used LAPACK/OpenBLAS to do that. However, the dgtsv function of the LAPACK package (both the C and Fortran implementation found in the OpenBLAS repositories) are not parallelized. I wish to use a 24 cores machine for acceleration and wondered if there was an OpenMP implementation of the dgtsv that can be parallelized in order to take full advantage of the computing power.

I tried to use dgesv(which is parallelized by OpenBLAS) but it seems the gain from parallellizing is negated by the heavier calculations on the full matrix.

Someone has suggested that I use the GETRF + GETRS combination (as they are parallelized by OpenBLAS) but those take full matrices as arguments and my understanding of it tells me it would essentially be the same as just using dgesv (which I suppose wraps those two steps in one function). Am I correct or can I expect better performance with GETRF+GETRS vs DGESV ?

Thanks a lot

@Michael-J-Hennnebry
Copy link

Michael-J-Hennnebry commented Oct 31, 2024

Splitting the recurrence in straight forward fashion produces
5/3 as many multiplications and twice as many subtractions.
If multiplication-bound, that is a speedup factor of about 14.
If each core has SIMD instructions, a larger speedup-is possible.
Precisely how to split the recurrence likely requires cache effects.

Added:
I'm a novice at lapack and a complete newbie at openmp,
but I think it is something I can do.

@Michael-J-Hennnebry
Copy link

For some reason, when I wrote my previous comment, I was thinking of the formula for the determinant of the matrix.
That said, I think I could do solve the equations, but not with the API of dgtsv.
Even if the tridiagonal matrix is invertible, a badly chosen sub-matrix might not be.
Redoing the choice would require keeping the original data.

Since there has not been any comment in the two weeks it took me to get back on github (grrr),
I'm guessing antobazir has decded to live without.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants