You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm working on a diffusion reaction problem for modelling and I would ideally like to compute diffusion over a 300x300x300 grid.
Reading on the subject oriented me towards locally one-dimension method which means I have to solve 3 tridiagonal systems back-to-back.
I used LAPACK/OpenBLAS to do that. However, the dgtsv function of the LAPACK package (both the C and Fortran implementation found in the OpenBLAS repositories) are not parallelized. I wish to use a 24 cores machine for acceleration and wondered if there was an OpenMP implementation of the dgtsv that can be parallelized in order to take full advantage of the computing power.
I tried to use dgesv(which is parallelized by OpenBLAS) but it seems the gain from parallellizing is negated by the heavier calculations on the full matrix.
Someone has suggested that I use the GETRF + GETRS combination (as they are parallelized by OpenBLAS) but those take full matrices as arguments and my understanding of it tells me it would essentially be the same as just using dgesv (which I suppose wraps those two steps in one function). Am I correct or can I expect better performance with GETRF+GETRS vs DGESV ?
Thanks a lot
The text was updated successfully, but these errors were encountered:
Splitting the recurrence in straight forward fashion produces
5/3 as many multiplications and twice as many subtractions.
If multiplication-bound, that is a speedup factor of about 14.
If each core has SIMD instructions, a larger speedup-is possible.
Precisely how to split the recurrence likely requires cache effects.
Added:
I'm a novice at lapack and a complete newbie at openmp,
but I think it is something I can do.
For some reason, when I wrote my previous comment, I was thinking of the formula for the determinant of the matrix.
That said, I think I could do solve the equations, but not with the API of dgtsv.
Even if the tridiagonal matrix is invertible, a badly chosen sub-matrix might not be.
Redoing the choice would require keeping the original data.
Since there has not been any comment in the two weeks it took me to get back on github (grrr),
I'm guessing antobazir has decded to live without.
Hello,
I'm working on a diffusion reaction problem for modelling and I would ideally like to compute diffusion over a 300x300x300 grid.
Reading on the subject oriented me towards locally one-dimension method which means I have to solve 3 tridiagonal systems back-to-back.
I used LAPACK/OpenBLAS to do that. However, the dgtsv function of the LAPACK package (both the C and Fortran implementation found in the OpenBLAS repositories) are not parallelized. I wish to use a 24 cores machine for acceleration and wondered if there was an OpenMP implementation of the dgtsv that can be parallelized in order to take full advantage of the computing power.
I tried to use dgesv(which is parallelized by OpenBLAS) but it seems the gain from parallellizing is negated by the heavier calculations on the full matrix.
Someone has suggested that I use the GETRF + GETRS combination (as they are parallelized by OpenBLAS) but those take full matrices as arguments and my understanding of it tells me it would essentially be the same as just using dgesv (which I suppose wraps those two steps in one function). Am I correct or can I expect better performance with GETRF+GETRS vs DGESV ?
Thanks a lot
The text was updated successfully, but these errors were encountered: