
Rework symmetric generalized eigen/eigvals #49673

Merged (54 commits) on Jun 28, 2023

Conversation

@aravindh-krishnamoorthy (Contributor) commented on May 7, 2023

This PR contains a fix for JuliaLang/LinearAlgebra.jl#1002.

Summary

  1. No methods are defined for eigvals!(A, B) when one of A or B is not Hermitian (or symmetric).
  2. When either of A or B is not Hermitian (or symmetric), eigen!(A, B) computes the solution based on a Cholesky decomposition, which is valid only for a positive definite matrix B. Hence, the function fails when B is Hermitian or symmetric but not positive definite.
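
For illustration, here is a sketch of a session on a pre-PR build reproducing both failure modes (exact error text may differ by Julia version):

julia> using LinearAlgebra

julia> A = randn(3,3); B = Symmetric(randn(3,3));  # B is symmetric but (almost surely) indefinite

julia> eigvals!(copy(A), B)  # case 1: no matching method
ERROR: MethodError: no method matching eigvals!(::Matrix{Float64}, ::Symmetric{Float64, Matrix{Float64}})

julia> eigen(A, B)  # case 2: the Cholesky-based path fails for indefinite B
ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.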

Changes

  1. Include definitions for function eigvals!(A::StridedMatrix{T}, B::Union{RealHermSymComplexHerm{T},Diagonal{T}}; sortby::Union{Function,Nothing}=nothing) where {T<:Number} and function eigvals!(A::RealHermSymComplexHerm{T,<:StridedMatrix}, B::AbstractMatrix{T}; sortby::Union{Function,Nothing}=nothing) where {T<:Number}, which utilise the generic (dense) matrix variants.

  2. Change the definitions of function eigen!(A::RealHermSymComplexHerm{T,<:StridedMatrix}, B::AbstractMatrix{T}; sortby::Union{Function,Nothing}=nothing) where {T<:Number} and function eigen!(A::StridedMatrix{T}, B::Union{RealHermSymComplexHerm{T},Diagonal{T}}; sortby::Union{Function,Nothing}=nothing) where {T<:Number} to utilise the generic (dense) matrix variants.

  3. Define a new function eigvals(A::AbstractMatrix{T}, C::Cholesky{T, <:AbstractMatrix}; sortby::Union{Function,Nothing}=nothing) where {T<:Number}, which can take advantage of the Cholesky decomposition to reduce the computational complexity if B is positive definite.

  4. Define a new function eigvals(A::AbstractMatrix{T}, B::BunchKaufman{T,<:StridedMatrix}; sortby::Union{Function,Nothing}=nothing) where {T<:Number}, which can take advantage of the Bunch-Kaufman (LDLT) decomposition to reduce the computational complexity if B is Hermitian or symmetric. (To be moved to another PR.)

Plan

  • function eigen!(A::RealHermSymComplexHerm{T,<:StridedMatrix}, B::AbstractMatrix{T}; sortby::Union{Function,Nothing}=nothing) where {T<:Number} is forwarded to eigen!(Matrix{T}(A), B; sortby) (see the sketch after this list)

  • function eigen!(A::StridedMatrix{T}, B::Union{RealHermSymComplexHerm{T},Diagonal{T}}; sortby::Union{Function,Nothing}=nothing) where {T<:Number} is forwarded to eigen!(A, Matrix{T}(B); sortby)

  • function eigvals!(A::StridedMatrix{T}, B::Union{RealHermSymComplexHerm{T},Diagonal{T}}; sortby::Union{Function,Nothing}=nothing) where {T<:Number} is forwarded to eigvals!(A, Matrix{T}(B); sortby)

  • function eigvals!(A::RealHermSymComplexHerm{T,<:StridedMatrix}, B::AbstractMatrix{T}; sortby::Union{Function,Nothing}=nothing) where {T<:Number} is forwarded to eigvals!(Matrix{T}(A), B; sortby)

  • Define a new function eigvals(A::AbstractMatrix{T}, C::Cholesky{T, <:AbstractMatrix}; sortby::Union{Function,Nothing}=nothing) where {T<:Number}, which can take advantage of the Cholesky decomposition to reduce the computational complexity if B is positive definite.

  • Define a new function eigvals(A::AbstractMatrix{T}, B::BunchKaufman{T,<:StridedMatrix}; sortby::Union{Function,Nothing}=nothing) where {T<:Number}, which can take advantage of the Bunch-Kaufman (LDLT) decomposition to reduce the computational complexity if B is Hermitian or symmetric. (To be moved to another PR.)

  • Add unit tests.
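
A minimal sketch of the forwarding pattern listed above (simplified; RealHermSymComplexHerm is LinearAlgebra's internal union alias for the real-Symmetric/Hermitian wrappers, and the actual PR diff may differ):

using LinearAlgebra
using LinearAlgebra: RealHermSymComplexHerm  # internal alias, not exported
import LinearAlgebra: eigen!, eigvals!

# Materialise the structured wrapper into a dense Matrix and forward to the
# generic (dense) methods in eigen.jl (LAPACK GGEV for BLAS element types).
eigen!(A::RealHermSymComplexHerm{T,<:StridedMatrix}, B::AbstractMatrix{T};
       sortby::Union{Function,Nothing}=nothing) where {T<:Number} =
    eigen!(Matrix{T}(A), B; sortby)

eigvals!(A::StridedMatrix{T}, B::Union{RealHermSymComplexHerm{T},Diagonal{T}};
         sortby::Union{Function,Nothing}=nothing) where {T<:Number} =
    eigvals!(A, Matrix{T}(B); sortby)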

Notes

  1. The Cholesky- and Bunch-Kaufman-decomposition-based versions are only efficient for matrix dimensions $N > 1000$. For smaller matrices, the generic (dense) variant is better. See below for benchmarks.

TODO

  • [ISSUE-1] File an issue on tridiagonal.jl to support HermTridiagonal and ldlt for HermTridiagonal.
  • [ISSUE-2] File an issue on bunchkaufman to return SymTridiagonal or HermTridiagonal instead of Tridiagonal. Then, ldlt can be exploited to speed up Bunchkaufman-based implementation.
  • [ISSUE-3] File an issue to resolve the inconsistency between the outputs of the two-generic-matrices and BK/Cholesky-based eigen/eigvals for a low-rank $B$.
  • [ISSUE-4] eigen(A::HermOrSym,B::HermOrSym) and eigvals(A::HermOrSym,B::HermOrSym) fail for Hermitian/Symmetric matrices if B is not additionally positive definite.
  • [FR-1] Implement Bunchkaufman based eigen and eigvals in another PR.

aravindh-krishnamoorthy and others added 9 commits April 27, 2023 12:05
1. Fix output permutation when A.P is not the identity matrix.
2. Fix the application of the elementary transformation (correct complex conjugation) to the solution as per LAPACK zlarzb https://netlib.org/lapack/explore-html/d5/ddd/zlarzb_8f_source.html
3. Align the final permutation with LAPACK's ZGELSY at https://netlib.org/lapack/explore-html/d8/d6e/zgelsy_8f_source.html
4. Fix a complex conjugate bug in the original solution for #48911, align with LAPACK zlarzb, https://netlib.org/lapack/explore-html/d5/ddd/zlarzb_8f_source.html
5. Improve permutation performance
…n! functions for generalised Normal+Symmetric matrices and forward to the generic functions in eigen.jl
@jishnub (Contributor) commented on May 7, 2023

Regarding your questions, I would say that having more tests is certainly a good idea, and allowing different element types is perhaps unnecessary, since (from what I recall) the LAPACK functions require identical element types. Users may promote matrices to the same element type, if this is necessary.

@aravindh-krishnamoorthy (Contributor, Author)

Done with tests for Hermitian/symmetric matrices.

Note that I've retained sortby=nothing, which is the default sort in symmetriceigen.jl, whereas most of eigen.jl uses eigsortby, defined as eigsortby(λ::Real) = λ; eigsortby(λ::Complex) = (real(λ),imag(λ)), as the default sort.

Therefore, I had to add this sortby manually in the tests; see the example below.
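
For example, a call mirroring the tests might look like this (illustrative; eigsortby is eigen.jl's internal, non-exported comparator):

julia> using LinearAlgebra

julia> using LinearAlgebra: eigsortby

julia> A = Symmetric(randn(4,4)); B = randn(4,4) ;

julia> eigvals(Matrix(A), B; sortby=eigsortby) ;  # impose eigen.jl's default ordering explicitly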

@dkarrasch added the "linear algebra" label on May 7, 2023
@dkarrasch (Member) commented on May 7, 2023

This looks like a very rough approach, honestly. So the file was created in #39301, with certain conditions on A and B in mind. Now, I agree that these conditions are not necessarily encoded in the type, but I feel that we should still respect those cases, and keep the choleigen! methodology for them, but certainly compare the runtimes between master and this PR. I think one solution could be to compute the cholesky decomposition first, check if it succeeded, and if it did, send it down the _choleigen! path (unless the proposal in this PR is better computationally anyway). If it doesn't succeed, then we can still turn one or the other into a regular Matrix. The point is that spd B is a very common case, and we should be careful about its performance.

Addendum: I only now read the original issue. Seems like I agree with @jishnub. @aravindh-krishnamoorthy You can see in PDMats.jl (see also the PositiveDefiniteMatrices package) that there is no way to simply wrap a matrix and say it's spd; the proof of spd is performing a successful cholesky decomposition.

As I think about it, another option could be to provide an eigen interface where B::Cholesky, which then goes down the _choleigen path. This, however, requires clear communication that the user needs to provide the cholesky decomposition themselves and upfront (which is kind of fair, because it gets computed anyway in _choleigen and may be handy to have elsewhere), or else a generic eigen path is taken. How about that?
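
For concreteness, usage under this proposal would look roughly as follows (a sketch of the interface the PR ends up adding; values are illustrative):

julia> using LinearAlgebra

julia> A = [1.0 2.0; 2.0 1.0] ;

julia> B = [4.0 1.0; 1.0 3.0] ;  # symmetric positive definite by construction

julia> C = cholesky(Hermitian(B)) ;  # user proves spd-ness up front; throws PosDefException otherwise

julia> λ = eigvals(A, C) ;  # dispatches to the Cholesky-based (_choleigen-style) path

julia> F = eigen(A, C) ;  # likewise, with eigenvectors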

We should reach out to @schneiderfelipe and @andreasnoack who were discussing these issues initially in #39301.

@aravindh-krishnamoorthy (Contributor, Author)

@dkarrasch Thank you for your detailed response and your concerns. Below, I will try to address them point by point.

This looks like a very rough approach, honestly. So the file was created in #39301, with certain conditions on A and B in mind. Now, I agree that these conditions are not necessarily encoded in the type, but I feel that we should still respect those cases, and keep the choleigen! methodology for them, but certainly compare the runtimes between master and this PR. I think one solution could be to compute the cholesky decomposition first, check if it succeeded, and if it did, send it down the _choleigen! path (unless the proposal in this PR is better computationally anyway). If it doesn't succeed, then we can still turn one or the other into a regular Matrix. The point is that spd B is a very common case, and we should be careful about its performance.
Addendum: I only now read the original issue. Seems like I agree with @jishnub. @aravindh-krishnamoorthy You can see in PDMats.jl (see also the PositiveDefiniteMatrices package) that there is no way to simply wrap a matrix and say it's spd; the proof of spd is performing a successful cholesky decomposition.

Indeed, the solution in the PR is a "basic" solution, where the computation is simply forwarded to the generic methods in eigen.jl. The only performance penalty incurred is due to Matrix{T}(...), which is negligible.

As noted in the issue, the present solution makes additional assumptions about the inputs, which are not encoded in the type signatures of the functions. Julia's Cholesky decomposition only works for positive definite (PD) matrices, so the present functions fail for positive semi-definite (PSD, the most common case) and indefinite matrices. Making additional (failing) assumptions about input types is not a good thing for stdlib functions.

Furthermore, the suggestion to check for positive definiteness and dispatch to the respective algorithm (Cholesky or LAPACK) does not seem elegant. This is because the switching would depend on a numerical algorithm (checking for positive definiteness), which not only incurs computational cost, but may also be influenced by numerical errors. In my experience, avoiding such nondeterminism (in this case, numerical algorithm chaining) in stdlib is best. As a toy example, imagine if the generic functions in eigen.jl again chained to another algorithm based on a numerical check.

Moreover, as you're probably aware, the probability that an arbitrary matrix is PD is zero. Most PD matrices have a structure known to the user, or are computed as $B = X X^{\mathrm{H}},$ in which case it is also known to the user that the matrix is PD. Hence, it may be better for the user to choose the right function than for the function to determine whether a matrix is PD, especially in high-performance computing scenarios.

Regarding performance, this was evaluated quickly in JuliaLang/LinearAlgebra.jl#1002. The preliminary outcome was that, indeed, for matrices of order around $1000,$ Cholesky decomposition has lower complexity than the corresponding generic LAPACK functions. However, given the above, I feel this is not a strong reason to go for a Cholesky-decomposition-based solution in this PR.

As I think about it, another option could be to provide an eigen interface where B::Cholesky, which then goes down the _choleigen path. This, however, requires clear communication that the user needs to provide the cholesky decomposition themselves and upfront (which is kind of fair, because it gets computed anyway in _choleigen and may be handy to have elsewhere), or else a generic eigen path is taken. How about that?

Indeed, this is a good proposal for using Cholesky.

Alternatively (or even alongside), we could implement ldlt(...) decomposition for ::Hermitian and ::Symmetric (currently only for symmetric tridiagonal matrices), which is only slightly more computationally expensive than cholesky. This way, we can redo the existing Cholesky-based algorithm in symmetriceigen.jl in terms of LDLT, which would be a relatively minor change. With the change, the algorithm can handle all (indefinite and large) Hermitian and symmetric matrices.

I feel this might be a good long term solution. Furthermore, we could implement a switch based on the matrix order n, which will lead to a deterministic algorithm given n.
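
To make the reduction concrete, here is a hedged sketch of how an LDLT/Bunch-Kaufman factorisation of B turns the generalised problem into a standard one (eigvals_via_bk is a hypothetical helper, not part of this PR's diff):

using LinearAlgebra

# With B[p,p] = L*D*L' (Bunch-Kaufman), A*v = λ*B*v reduces to the standard
# eigenproblem (D \ (L \ A[p,p] / L')) * z = λ*z, where z = L'*(v permuted by p).
function eigvals_via_bk(A::AbstractMatrix, B::AbstractMatrix)
    F = bunchkaufman(Hermitian(B))    # requires B Hermitian and nonsingular
    p = F.p                           # pivoting permutation
    M = F.D \ (F.L \ A[p, p] / F.L')  # similar to the pencil (A, B)
    return eigvals(M)
end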

We should reach out to @schneiderfelipe and @andreasnoack who were discussing these issues initially in #39301.

Indeed. It will be interesting to understand why they went for Cholesky decomposition for this case.

I look forward to your and others' opinions, given the above.

@jishnub (Contributor) commented on May 9, 2023

The LDLt idea sounds interesting. Could you check what the performance is like with that? I would imagine that it's comparable.

@dkarrasch (Member)

I think it's called bunchkaufman in the dense case. We have ldlt only for SymTridiagonal and for sparse matrices.

@aravindh-krishnamoorthy (Contributor, Author) commented on May 10, 2023

@jishnub @dkarrasch Thank you very much for the inputs. I'll take a look at the function soon and update this PR with two additional definitions for eigen and friends: (1) one which accepts ::Cholesky, and (2) one which uses the BK factorisation (bunchkaufman) for Hermitian matrices. Unfortunately, due to the additional numerical stability checks and the possibility of obtaining a block-diagonal matrix, using the BK factorisation may be a bit slower than the conventional LDLT, but hopefully better than LU. I am also interested in comparing the results -- both from a numerical and from a computational-complexity perspective.

Presently, I'm a bit caught up with other work, but I will get back to you at the earliest, at least in parts.

@dkarrasch (Member) left a review comment:

A few more comments. Thanks for your persistent work!

Review threads on stdlib/LinearAlgebra/src/symmetriceigen.jl (outdated; resolved)
@dkarrasch (Member) left a review comment:

Last small comment. After this, I think this is ready to go. Are the timings still the same as they used to be?

Review thread on stdlib/LinearAlgebra/src/symmetriceigen.jl (outdated; resolved)
@dkarrasch (Member)

I checked the discussion, and found that there was an issue with Infs and NaNs in some cases. Is that triggered by the changes here (due to the multiplication with the Cholesky/BK factors) or is that an existing issue? If it's the latter, then we should have an issue that clearly states inconsistencies (say, between the "two generic matrices" and "two Hermitian matrices" or "the Cholesky approach"), and then we move on with this PR.

Commit by @dkarrasch: Change StridedMatrix to AbstractMatrix in function eigen(A,B)

Co-authored-by: Daniel Karrasch <[email protected]>
@aravindh-krishnamoorthy (Contributor, Author)

I checked the discussion, and found that there was an issue with Infs and NaNs in some cases. Is that triggered by the changes here (due to the multiplication with the Cholesky/BK factors) or is that an existing issue? If it's the latter, then we should have an issue that clearly states inconsistencies (say, between the "two generic matrices" and "two Hermitian matrices" or "the Cholesky approach"), and then we move on with this PR.

@dkarrasch Thank you for your suggestions and comments. Indeed, I was also planning to provide the benchmark figures after incorporating all review comments. Please note: the output below can be obtained by copy-pasting the contents of the attached file JT.txt into a fresh Julia prompt in the LinearAlgebra/test directory, with the branch https://github.com/aravindh-krishnamoorthy/julia/tree/49533 checked out.

Summary

  1. From the benchmarking results below, we note similar numbers and an analogous pattern to the earlier results: for small and moderate matrices, the generic functions (which use LAPACK for generic matrices) are better. For large matrices, the bunchkaufman and cholesky methods are better. Break-even is around $N=1000$.

  2. For low-rank $B,$ e.g., $\lambda_N \approx 10^{-16}$ in the test, the LAPACK routines return NaN + NaN*im. On the other hand, the bunchkaufman and cholesky routines, which perform intermediate steps in Julia, return results around $10^{+16}.$ Furthermore, R, MATLAB, and Mathematica consistently return Inf. Hence, as you suggest, I have added a TODO item to raise an issue to resolve this inconsistency between the two-generic-matrices and the BK/Cholesky methods. In fact, I think we also need to decide whether NaN + NaN*im is acceptable or whether an Inf output is better. For this reason, I chose to leave the results as-is in this PR instead of conforming to NaN + NaN*im.

Benchmarking

julia> # Packages
       using Revise
julia> using LinearAlgebra
julia> Revise.track(LinearAlgebra)
julia> using BenchmarkTools
julia> # Test definitions
       function mm(N)
                 A = complex.(randn(N,N), randn(N,N))
                 B = complex.(randn(N,N), randn(N,N))
                 return A, (B+B')/2
             end
mm (generic function with 1 method)

julia> # Tests for size N=4
       A,B = mm(4) ;
julia> @btime eigvals(A,B) ;
  7.038 μs (12 allocations: 68.95 KiB)
julia> @btime eigvals(A,bunchkaufman(B)) ;
  9.183 μs (36 allocations: 10.41 KiB)
julia> @btime eigen(A,B) ;
  8.494 μs (15 allocations: 69.41 KiB)
julia> @btime eigen(A,bunchkaufman(B)) ;
  14.998 μs (53 allocations: 18.48 KiB)
julia> BP = B*B' ;
julia> @btime eigvals(A,BP) ;
  6.566 μs (12 allocations: 68.95 KiB)
julia> @btime eigvals(A,cholesky(BP)) ;
  7.176 μs (15 allocations: 3.72 KiB)
julia> @btime eigen(A,BP) ;
  8.118 μs (15 allocations: 69.41 KiB)
julia> @btime eigen(A,cholesky(BP)) ;
  12.009 μs (28 allocations: 11.03 KiB)

julia> # Tests for size N=1000
       A,B = mm(1000) ;
julia> @btime eigvals(A,B) ;
  5.212 s (16 allocations: 33.62 MiB)
julia> @btime eigvals(A,bunchkaufman(B)) ;
  1.106 s (41 allocations: 78.00 MiB)
julia> @btime eigen(A,B) ;
  6.471 s (19 allocations: 48.87 MiB)
julia> @btime eigen(A,bunchkaufman(B)) ;
  1.681 s (62 allocations: 140.53 MiB)
julia> BP = B*B' ;
julia> @btime eigvals(A,BP) ;
  5.056 s (16 allocations: 33.62 MiB)
julia> @btime eigvals(A,cholesky(BP)) ;
  1.040 s (17 allocations: 31.08 MiB)
julia> @btime eigen(A,BP) ;
  6.503 s (19 allocations: 48.87 MiB)
julia> @btime eigen(A,cholesky(BP)) ;
  1.721 s (32 allocations: 63.08 MiB)
julia> pwd()
"/home/ark/julia/julia/stdlib/LinearAlgebra/test"
julia> include("symmetriceigen.jl")
Test Summary:   | Pass  Total  Time
bk-eigen-eigals |   12     12  3.8s
Test Summary:      | Pass  Total  Time
chol-eigen-eigvals |    4      4  1.5s
Test Summary: | Pass  Total  Time
issue JuliaLang/LinearAlgebra.jl#1002  |    8      8  0.8s
Main.TestSymmetricEigen

Inconsistency in the output for low-rank $B$:

julia> # Tests for return value inconsistency for low-rank matrices
       A,B = mm(4) ;
julia> eb,vb = eigen(B) ;
julia> BL = Hermitian(vb[:,1:3]*Diagonal(eb[1:3])*vb[:,1:3]') ;
julia> eigvals(BL)
4-element Vector{Float64}:
 -1.7928438224057546
  3.629908873481469e-16
  1.1217275857698366
  1.9404879734178766
julia> eigvals(A,BL)
4-element Vector{ComplexF64}:
 -1.5122973004623377 - 1.8070289261802528im
  1.2489459023691214 + 0.9580436524788615im
  0.0837744393810742 + 0.20398156152125624im
                 NaN + NaN*im
julia> eigvals(A,bunchkaufman(BL))
4-element Vector{ComplexF64}:
 -8.607522398243418e16 + 8.71140170912962e16im
    -1.512297300462337 - 1.8070289261802532im
    1.2489459023691214 + 0.9580436524788604im
   0.08377443938107355 + 0.2039815615212566im
julia> eigvals(A,cholesky(BL*BL'))
4-element Vector{ComplexF64}:
  0.3188575288787393 + 0.5198161999042132im
 9.50353228327252e15 - 9.618225029790684e15im
 -0.0824158180768777 - 0.7529204258209732im
 -0.6848278122147297 + 0.08971723311042369im

@dkarrasch (Member)

As I understand, the issue is only with BK, right? Because cholesky will fail in the low-rank case, as the resulting matrix is not positive-definite.

Regarding the output of NaN vs. Inf, we don't seem to "choose" anything, we just take the output of LAPACK, so I wonder where the mismatch with MATLAB and Mathematica comes from.

Given that we can't compute things in-place with the BK approach (and the consequence that we need massively more memory in the interesting case when the matrices are big) and the fake finite eigenvalues, should we remove the BK-related methods here and keep that for later? What are typical use cases for generalized eigencomputations with symmetric/Hermitian, but not necessarily positive-definite rhs's?

@aravindh-krishnamoorthy (Contributor, Author) commented on Jun 26, 2023

@dkarrasch Thank you for raising these concerns. Kindly find my response and opinion below.

As I understand, the issue is only with BK, right? Because cholesky will fail in the low-rank case, as the resulting matrix is not positive-definite.

In fact, the problem affects both the BK- and the (earlier) Cholesky-based implementations. That is, the (earlier) Cholesky-based code also returns a large value instead of the NaN + NaN*im returned by the LAPACK routines. As seen from the example in my previous comment, the Cholesky decomposition seems to be able to handle matrices with eigenvalues in the range of $10^{-16},$ even though such matrices are effectively rank deficient.

Regarding the output of NaN vs. Inf, we don't seem to "choose" anything, we just take the output of LAPACK, so I wonder where the mismatch with MATLAB and Mathematica comes from.

In eigen.jl, LAPACK GGEV's outputs a and b are divided. The reason for the NaN output is that, in Julia, 1/(0+0*im) is chosen to be NaN+NaN*im. This is not so in the other languages.
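
This is easy to check at the REPL:

julia> 1/(0.0 + 0.0im)  # complex division by zero in Julia
NaN + NaN*im

julia> 1/0.0  # real division by zero, for comparison
Inf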

Given that we can't compute things in-place with the BK approach (and the consequence that we need massively more memory in the interesting case when the matrices are big) and the fake finite eigenvalues, should we remove the BK-related methods here and keep that for later? What are typical use cases for generalized eigencomputations with symmetric/Hermitian, but not necessarily positive-definite rhs's?

This was my initial thought as well. However, during implementation, I changed my mind and felt that BK-based routines are indeed quite elegant and superior.

BK's inbuilt pivoting makes it superior to plain Cholesky for positive definite (PD) matrices. (Pivoted Cholesky based solution may be an option for this, but BK has more advantages.) Furthermore, BK can also handle PD matrices which turn out to be positive semi-definite (PSD) or indefinite due to numerical issues.

Hence, the BK solution may be a one-stop solution for all kinds of matrices: PD, PSD, and indefinite. This, coupled with the fact that BK routines have a runtime comparable with that of plain Cholesky, makes them interesting.

Note: Maybe the best solution is to implement BK-based method in LAPACK alongside GGEV and SYGVD :)

As you rightly identified, the present implementation is memory intensive compared to the generic LAPACK-based methods. However, I feel that solving the SymTridiagonal and HermTridiagonal ([ISSUE-1] and [ISSUE-2]) issues will enable us to use in-place division for the division by B.D, thereby reducing the memory load by a bit. But, it seems unlikely that we can get rid of the permutation copy A[B.p, B.p] for any pivoted scheme as the @view routines are presently much slower.

Next, regarding the inconsistency in the outputs: I feel that it's better to investigate this further in the context of the yet-to-be-filed [ISSUE-3]. Personally, I feel this is a corner case rather than a blocker. Furthermore, the output is not wrong; only the thresholding is inconsistent.

Lastly, during my investigations, I found that the older eigen(A::HermOrSym, B::HermOrSym) and eigvals(A::HermOrSym, B::HermOrSym) methods do not work if B is not additionally positive definite. This is because the routines use SYGVD. I've written down a TODO to raise another issue for this [ISSUE-4].

Based on the above, and looking at the balance between memory and speed, my suggestion is to keep the BK routine in for now and evaluate it again once we have the fix to the SymTridiagonal and HermTridiagonal issue. However, I leave the final decision to you. In case you feel it might be better to remove it, I will gladly do so.

@dkarrasch (Member)

For your BL, cholesky always throws a PosDefException for me locally. I'm not sure if we are going to add a HermTridiagonal with all the method overloads. That seems like a lot of additional code. I think we should play it safe: even without the BK-based approach, this PR fixes at least two issues. (a) It fixes the inconsistency observed in JuliaLang/LinearAlgebra.jl#1002 and thereby closes it, and (b) moves the construction of the Cholesky object out of the eigen/eigvals computation. I think the latter is much clearer. With PivotedCholesky I assume we get back to the issue that we need to allocate another huge copy of the permuted matrix. So, I suggest we remove the BK-stuff and put it in another PR that can be discussed. The remaining stuff can then be merged quickly and be included in v1.10, for which the feature freeze date is today!

@dkarrasch changed the title from "Potential fix for #49533: eigenvals!(A, Symmetric(B)) has no defined methods" to "Rework symmetric generalized eigen/eigvals" on Jun 27, 2023
@aravindh-krishnamoorthy (Contributor, Author)

For your BL, cholesky always throws a PosDefException for me locally. I'm not sure if we are going to add a HermTridiagonal with all the method overloads. That seems like a lot of additional code. I think we should play it safe: even without the BK-based approach, this PR fixes at least two issues. (a) It fixes the inconsistency observed in JuliaLang/LinearAlgebra.jl#1002 and thereby closes it, and (b) moves the construction of the Cholesky object out of the eigen/eigvals computation. I think the latter is much clearer. With PivotedCholesky I assume we get back to the issue that we need to allocate another huge copy of the permuted matrix. So, I suggest we remove the BK-stuff and put it in another PR that can be discussed. The remaining stuff can then be merged quickly and be included in v1.10, for which the feature freeze date is today!

@dkarrasch Thank you. I understand your concerns. I've removed the eigen/eigvals functions with Bunch-Kaufman decomposition and updated the tests and NEWS.md. Kindly let me know if further changes are necessary. I will move the Bunch-Kaufman version to another PR.

@dkarrasch added and then removed the "merge me" label (PR is reviewed; merge when all tests are passing) on Jun 27, 2023
@dkarrasch (Member)

Hm, I overlooked that we have a method ambiguity, and realized that we should keep the eigen stuff in diagonal.jl in sync with the changes here. I made some changes and am running local tests now to see if the ambiguity is resolved and all other tests pass.

@aravindh-krishnamoorthy (Contributor, Author)

Hm, I overlooked that we have a method ambiguity, and realized that we should keep the eigen stuff in diagonal.jl in sync with the changes here. I made some changes and am running local tests now to see if the ambiguity is resolved and all other tests pass.

Thank you. Please let me know if I can be of any help.

@dkarrasch (Member)

Please let me know if I can be of any help.

You did a great job! Thank you.

@dkarrasch added the "merge me" label (PR is reviewed; merge when all tests are passing) on Jun 27, 2023
@aravindh-krishnamoorthy (Contributor, Author) commented on Jun 27, 2023

For your BL, cholesky always throws a PosDefException for me locally. I'm not sure if we are going to add a HermTridiagonal with all the method overloads. That seems like a lot of additional code. I think we should play it safe: even without the BK-based approach, this PR fixes at least two issues. (a) It fixes the inconsistency observed in JuliaLang/LinearAlgebra.jl#1002 and thereby closes it, and (b) moves the construction of the Cholesky object out of the eigen/eigvals computation. I think the latter is much clearer. With PivotedCholesky I assume we get back to the issue that we need to allocate another huge copy of the permuted matrix. So, I suggest we remove the BK-stuff and put it in another PR that can be discussed. The remaining stuff can then be merged quickly and be included in v1.10, for which the feature freeze date is today!

@dkarrasch There's just one more point that I forgot to add earlier. HermTridiagonal is similar to SymTridiagonal via a symmetric transformation by a diagonal unitary matrix. Hence, we might just need a routine for, or directly compute, the following transformation. I'll keep this in mind for the next PR.

# Arbitrary Hermitian tridiagonal matrix
julia> d = [1+0*im, 2+0*im, 3+0*im] ;
julia> e = [1+im, 2+2im] ;
julia> A = Tridiagonal(e,d,conj(e))
3×3 Tridiagonal{Complex{Int64}, Vector{Complex{Int64}}}:
 1+0im  1-1im   ⋅
 1+1im  2+0im  2-2im
  ⋅     2+2im  3+0im

# Construct the diagonal unitary transformation
julia> D = Diagonal(exp.(im*cumsum([0;angle.(A.du)])))
3×3 Diagonal{ComplexF64, Vector{ComplexF64}}:
 1.0+0.0im           ⋅                       ⋅
      ⋅      0.707107-0.707107im             ⋅
      ⋅              ⋅            6.12323e-17-1.0im

# Apply transformation symmetrically -> results in SymTridiagonal, which allows ldlt(.)
julia> D*A*D'
3×3 Matrix{ComplexF64}:
     1.0+0.0im          1.41421-1.11022e-16im      0.0+0.0im
 1.41421+1.11022e-16im      2.0+0.0im          2.82843+4.88534e-17im
     0.0+0.0im          2.82843+0.0im              3.0+0.0im
