Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the canonical metric on the Stiefel manifold #335

Merged
merged 37 commits into from
Mar 22, 2021
Merged
Show file tree
Hide file tree
Changes from 24 commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
03ebd9f
reorganize code and sketch initial algorithms of exp and log for cano…
kellertuer Mar 6, 2021
1908eb6
update documentation.
kellertuer Mar 6, 2021
e79a1b5
undo a commented our part in docs.
kellertuer Mar 6, 2021
c993f3b
Apply suggestions from code review
kellertuer Mar 7, 2021
f36ab3e
Apply suggestions from code review
kellertuer Mar 8, 2021
edc96c9
runs formatter.
kellertuer Mar 8, 2021
e7b914e
Fix plots (for documentation and testing) to 1.10.5, since it introdu…
kellertuer Mar 8, 2021
3950004
Losen constraints for Julia 1.4
kellertuer Mar 8, 2021
d9008f3
fix latex equation.
kellertuer Mar 8, 2021
9c61049
Fix documentation.
kellertuer Mar 8, 2021
0a8aae0
changing compats again, hopefully this works?
kellertuer Mar 8, 2021
f97f37f
starts testing.
kellertuer Mar 8, 2021
cee6df2
adopts one view, such that the multiplication is well defined. Finish…
kellertuer Mar 8, 2021
018c288
Fix spacing.
kellertuer Mar 8, 2021
c89c763
fixes a few bugs, starts extended testing that includes the iterations.
kellertuer Mar 8, 2021
b062c87
runs formatter.
kellertuer Mar 8, 2021
50f83ca
removes a little bit of unnecessary lines.
kellertuer Mar 8, 2021
ec7f6f3
Merge branch 'master' into kellertuer/steifel-canonical-metric
kellertuer Mar 9, 2021
4d359b5
enforce real part of log and avoid svd! to not change V.
kellertuer Mar 11, 2021
7103ada
Update src/manifolds/StiefelCanonicalMetric.jl
kellertuer Mar 11, 2021
102018b
introduce distance and kwargs for the log, extend tests.
kellertuer Mar 11, 2021
6a7447a
Merge branch 'kellertuer/steifel-canonical-metric' of github.com:Juli…
kellertuer Mar 11, 2021
ce20bfb
remove some debug print and run formatter.
kellertuer Mar 11, 2021
94302ca
improve the test, where an approximation is actually needed.
kellertuer Mar 18, 2021
b13cfb0
Update src/manifolds/SymmetricPositiveDefinite.jl
kellertuer Mar 21, 2021
9b8e3c4
address points from code review.
kellertuer Mar 21, 2021
571776a
Merge branch 'kellertuer/steifel-canonical-metric' of github.com:Juli…
kellertuer Mar 21, 2021
456621e
Adress points from code review.
kellertuer Mar 21, 2021
7f99f8d
Merge branch 'master' into kellertuer/steifel-canonical-metric
sethaxen Mar 22, 2021
ca261fd
Multiply blockwise
sethaxen Mar 22, 2021
2b1ebd0
Use convert to avoid copy
sethaxen Mar 22, 2021
34eba33
Use log_safe!
sethaxen Mar 22, 2021
d7ce927
Avoid unnecessary copy
sethaxen Mar 22, 2021
0ceb65b
Multiply blockwise to reduce allocations
sethaxen Mar 22, 2021
296f107
Increment version number
sethaxen Mar 22, 2021
2a2d8e7
remove windows from testing temporarily.
kellertuer Mar 22, 2021
07d8598
Merge branch 'kellertuer/steifel-canonical-metric' of github.com:Juli…
kellertuer Mar 22, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 39 additions & 0 deletions docs/src/manifolds/stiefel.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,48 @@
# Stiefel

## Common and metric independent functions

```@autodocs
Modules = [Manifolds]
Pages = ["manifolds/Stiefel.jl"]
Order = [:type, :function]
```

## Default metric: the Euclidean metric

The [`EuclideanMetric`](@ref) is obtained from the embedding of the Stiefel manifold in ``ℝ^{n,k}``.

```@autodocs
Modules = [Manifolds]
Pages = ["manifolds/StiefelEuclideanMetric.jl"]
Order = [:function]
```

## The canonical metric

```@autodocs
kellertuer marked this conversation as resolved.
Show resolved Hide resolved
Modules = [Manifolds]
Pages = ["manifolds/StiefelCanonicalMetric.jl"]
Order = [:type]
```

Any ``X∈T_p\mathcal M``, ``p∈\mathcal M``, can be written as

```math
X = pA + (I_n-pp^{\mathrm{T}})B,
\quad
A ∈ ℝ^{p×p} \text{ skew-symmetric},
\quad
B ∈ ℝ^{n×p} \text{ arbitrary.}
```

In the Euclidean case, the elements from ``A`` are counted twice (i.e. weighted with a factor of 2).
The canonical metric avoids this.

```@autodocs
Modules = [Manifolds]
Pages = ["manifolds/StiefelCanonicalMetric.jl"]
Order = [:function]
```

## Literature
4 changes: 4 additions & 0 deletions src/Manifolds.jl
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ import ManifoldsBase:
is_tangent_vector,
inverse_retract,
inverse_retract!,
log, #for extension, e.g. in Stiefel.
log!,
manifold_dimension,
mid_point,
Expand Down Expand Up @@ -160,6 +161,8 @@ include("manifolds/Rotations.jl")
include("manifolds/SkewSymmetric.jl")
include("manifolds/Spectrahedron.jl")
include("manifolds/Stiefel.jl")
include("manifolds/StiefelEuclideanMetric.jl")
include("manifolds/StiefelCanonicalMetric.jl")
include("manifolds/Sphere.jl")
include("manifolds/SphereSymmetricMatrices.jl")
include("manifolds/Symmetric.jl")
Expand Down Expand Up @@ -331,6 +334,7 @@ export Metric,
MinkowskiMetric,
PowerMetric,
ProductMetric,
CanonicalMetric,
MetricManifold
export AbstractEmbeddingType, AbstractIsometricEmbeddingType
export DefaultEmbeddingType, DefaultIsometricEmbeddingType, TransparentIsometricEmbedding
Expand Down
138 changes: 0 additions & 138 deletions src/manifolds/Stiefel.jl
Original file line number Diff line number Diff line change
Expand Up @@ -123,105 +123,6 @@ end

decorated_manifold(::Stiefel{N,K,𝔽}) where {N,K,𝔽} = Euclidean(N, K; field=𝔽)

@doc raw"""
exp(M::Stiefel, p, X)

Compute the exponential map on the [`Stiefel`](@ref)`{n,k,𝔽}`() manifold `M`
emanating from `p` in tangent direction `X`.

````math
\exp_p X = \begin{pmatrix}
p\\X
\end{pmatrix}
\operatorname{Exp}
\left(
\begin{pmatrix} p^{\mathrm{H}}X & - X^{\mathrm{H}}X\\
I_n & p^{\mathrm{H}}X\end{pmatrix}
\right)
\begin{pmatrix} \exp( -p^{\mathrm{H}}X) \\ 0_n\end{pmatrix},
````

where $\operatorname{Exp}$ denotes matrix exponential,
$\cdot^{\mathrm{H}}$ denotes the complex conjugate transpose or Hermitian, and $I_k$ and
$0_k$ are the identity matrix and the zero matrix of dimension $k × k$, respectively.
"""
exp(::Stiefel, ::Any...)

function exp!(::Stiefel{n,k}, q, p, X) where {n,k}
return copyto!(
q,
[p X] *
exp([p'X -X'*X; one(zeros(eltype(p), k, k)) p'*X]) *
[exp(-p'X); zeros(eltype(p), k, k)],
)
end

@doc raw"""
get_basis(M::Stiefel{n,k,ℝ}, p, B::DefaultOrthonormalBasis) where {n,k}

Create the default basis using the parametrization for any $X ∈ T_p\mathcal M$.
Set $p_\bot \in ℝ^{n\times(n-k)}$ the matrix such that the $n\times n$ matrix of the common
columns $[p\ p_\bot]$ is an ONB.
For any skew symmetric matrix $a ∈ ℝ^{k\times k}$ and any $b ∈ ℝ^{(n-k)\times k}$ the matrix

````math
X = pa + p_\bot b ∈ T_p\mathcal M
````

and we can use the $\frac{1}{2}k(k-1) + (n-k)k = nk-\frac{1}{2}k(k+1)$ entries
of $a$ and $b$ to specify a basis for the tangent space.
using unit vectors for constructing both
the upper matrix of $a$ to build a skew symmetric matrix and the matrix b, the default
basis is constructed.

Since $[p\ p_\bot]$ is an automorphism on $ℝ^{n\times p}$ the elements of $a$ and $b$ are
orthonormal coordinates for the tangent space. To be precise exactly one element in the upper
trangular entries of $a$ is set to $1$ its symmetric entry to $-1$ and we normalize with
the factor $\frac{1}{\sqrt{2}}$ and for $b$ one can just use unit vectors reshaped to a matrix
to obtain orthonormal set of parameters.
"""
function get_basis(M::Stiefel{n,k,ℝ}, p, B::DefaultOrthonormalBasis{ℝ}) where {n,k}
V = get_vectors(M, p, B)
return CachedBasis(B, V)
end

function get_coordinates!(
M::Stiefel{n,k,ℝ},
c,
p,
X,
B::DefaultOrthonormalBasis{ℝ},
) where {n,k}
V = get_vectors(M, p, B)
c .= [inner(M, p, v, X) for v in V]
return c
end

function get_vector!(M::Stiefel{n,k,ℝ}, X, p, c, B::DefaultOrthonormalBasis{ℝ}) where {n,k}
V = get_vectors(M, p, B)
zero_tangent_vector!(M, X, p)
length(c) < length(V) && error(
"Coordinate vector too short. Excpected $(length(V)), but only got $(length(c)) entries.",
)
@inbounds for i in 1:length(V)
X .+= c[i] .* V[i]
end
return X
end

function get_vectors(::Stiefel{n,k,ℝ}, p, ::DefaultOrthonormalBasis{ℝ}) where {n,k}
p⊥ = nullspace([p zeros(n, n - k)])
an = div(k * (k - 1), 2)
bn = (n - k) * k
V = vcat(
[p * vec2skew(1 / sqrt(2) .* _euclidean_unit_vector(an, i), k) for i in 1:an],
[p⊥ * reshape(_euclidean_unit_vector(bn, j), (n - k, k)) for j in 1:bn],
)
return V
end

_euclidean_unit_vector(n, i) = [k == i ? 1.0 : 0.0 for k in 1:n]

@doc raw"""
inverse_retract(M::Stiefel, p, q, ::PolarInverseRetraction)

Expand Down Expand Up @@ -381,42 +282,6 @@ manifold_dimension(::Stiefel{n,k,ℝ}) where {n,k} = n * k - div(k * (k + 1), 2)
manifold_dimension(::Stiefel{n,k,ℂ}) where {n,k} = 2 * n * k - k * k
manifold_dimension(::Stiefel{n,k,ℍ}) where {n,k} = 4 * n * k - k * (2k - 1)

@doc raw"""
project(M::Stiefel,p)

Projects `p` from the embedding onto the [`Stiefel`](@ref) `M`, i.e. compute `q`
as the polar decomposition of $p$ such that $q^{\mathrm{H}q$ is the identity,
where $\cdot^{\mathrm{H}}$ denotes the hermitian, i.e. complex conjugate transposed.
"""
project(::Stiefel, ::Any, ::Any)

function project!(::Stiefel, q, p)
s = svd(p)
mul!(q, s.U, s.Vt)
return q
end

@doc raw"""
project(M::Stiefel, p, X)

Project `X` onto the tangent space of `p` to the [`Stiefel`](@ref) manifold `M`.
The formula reads

````math
\operatorname{proj}_{\mathcal M}(p, X) = X - p \operatorname{Sym}(p^{\mathrm{H}}X),
````

where $\operatorname{Sym}(q)$ is the symmetrization of $q$, e.g. by
$\operatorname{Sym}(q) = \frac{q^{\mathrm{H}}+q}{2}$.
"""
project(::Stiefel, ::Any...)

function project!(::Stiefel, Y, p, X)
A = p' * X
copyto!(Y, X - p * Hermitian((A + A') / 2))
return Y
end

@doc raw"""
retract(::Stiefel, p, X, ::CayleyRetraction)

Expand Down Expand Up @@ -813,6 +678,3 @@ function vector_transport_to!(
q * (UpperTriangular(qtXrf) - UpperTriangular(qtXrf)') + Xrf - q * qtXrf,
)
end
function vector_transport_to!(M::Stiefel, Y, ::Any, X, q, ::ProjectionTransport)
return project!(M, Y, q, X)
end
153 changes: 153 additions & 0 deletions src/manifolds/StiefelCanonicalMetric.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,153 @@
@doc raw"""
CanonicalMetric <: Metric

The Canonical Metric is name used on several manifolds, for example for the [`Stiefel`](@ref)
manifold, see for example[^EdelmanAriasSmith1998].

kellertuer marked this conversation as resolved.
Show resolved Hide resolved
[^EdelmanAriasSmith1998]:
> Edelman, A., Ariar, T. A., Smith, S. T.:
> _The Geometry of Algorihthms with Orthogonality Constraints_,
> SIAM Journal on Matrix Analysis and Applications (20(2), pp. 303–353, 1998.
> doi: [10.1137/S0895479895290954](https://doi.org/10.1137/S0895479895290954)
> arxiv: [9806030](https://arxiv.org/abs/physics/9806030)
"""
struct CanonicalMetric <: RiemannianMetric end

function distance(M::MetricManifold{ℝ,Stiefel{n,k,ℝ},CanonicalMetric}, q, p) where {n,k}
sethaxen marked this conversation as resolved.
Show resolved Hide resolved
return norm(M, p, log(M, p, q))
end

@doc raw"""
q = exp(M::MetricManifold{ℝ, Stiefel{n,k,ℝ}, CanonicalMetric}, p, X)
exp!(M::MetricManifold{ℝ, Stiefel{n,k,ℝ}, q, CanonicalMetric}, p, X)

Compute the exponential map on the [`Stiefel`](@ref)`(n,k)` manifold with respect to the [`CanonicalMetric`](@ref).

First, decompose The tangent vector ``X`` into its horizontal and vertical component with
respect to ``p``, i.e.

```math
X = pp^{\mathrm{T}}X + (I_n-pp^{\mathrm{T}})X,
```
where ``I_n`` is the ``n\times n`` identity matrix.
We introduce ``A=p^{\mathrm{T}}X`` and ``QR = (I_n-pp^{\mathrm{T}})X`` the `qr` decomposition
of the vertical component. Then using the matrix exponential ``\operatorname{Exp}`` we introduce ``B`` and ``C`` as

```math
\begin{pmatrix}
B\\C
\end{pmatrix}
\coloneqq
\operatorname{Exp}\left(
\begin{pmatrix}
A & -R^{\mathrm{T}}\\ R & 0
\end{pmatrix}
\right)
\begin{pmatrix}I_k\\0\end{pmatrix}
```

the exponential map reads

```math
q = \exp_p X = pC + QB.
```
For more details, see [^EdelmanAriasSmith1998][^Zimmermann2017].
"""
exp(::MetricManifold{ℝ,Stiefel{n,k,ℝ},CanonicalMetric}, ::Any...) where {n,k}

function exp!(::MetricManifold{ℝ,Stiefel{n,k,ℝ},CanonicalMetric}, q, p, X) where {n,k}
A = p' * X
kellertuer marked this conversation as resolved.
Show resolved Hide resolved
n == k && return mul!(q, p, exp(A))
QR = qr(X - p * A)
BC_ext = exp([A -QR.R'; QR.R 0*I])
q .= [p Matrix(QR.Q)] * @view(BC_ext[:, 1:k])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you blockwise multiply here, then instead of the 3 allocations here, I think you can do with none.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would I do that? I twiddled so much with these few lines of code, before and after your allocation reduction (about 2 days to find that we can not use the svd!) that I don't see directly how to do this block wise.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a stab at it.

return q
end

@doc raw"""
inner(M::MetricManifold{ℝ, Stiefel{n,k,ℝ}, X, CanonicalMetric}, p, X, Y)

compute the inner procuct on the [`Stiefel`](@ref) manifold with respect to the
[`CanonicalMetric`](@ref). The formula reads

```math
g_p(X,Y) = \operatorname{tr}\bigl( X^{\mathrm{T}}(I_n - \frac{1}{2}pp^{\mathrm{T}})Y \bigr).
```
"""
function inner(::MetricManifold{ℝ,Stiefel{n,k,ℝ},CanonicalMetric}, p, X, Y) where {n,k}
T = Base.promote_eltype(p, X, Y)
s = T(dot(X, Y))
if n != k
s -= dot(p'X, p'Y) / 2
end
return s
end

@doc raw"""
X = log(M::MetricManifold{ℝ, Stiefel{n,k,ℝ}, CanonicalMetric}, p, q; kwargs..)
log!(M::MetricManifold{ℝ, Stiefel{n,k,ℝ}, X, CanonicalMetric}, p, q; kwargs...)

Compute the logarithmic map on the [`Stiefel`](@ref)`(n,k)` manifold with respect to the [`CanonicalMetric`](@ref)
using a matrix-algebraic based approach to an iterative inversion of the formula of the
[`exp`](@ref exp(::MetricManifold{ℝ, Stiefel{n,k,ℝ}, CanonicalMetric}, ::Any...) where {n,k}).

The algorithm is derived in[^Zimmermann2017].

# Keyword arguments

Both `maxiter`(`=1e5`) and `tolerance`(`=1e-9`) can be given to the iterative process as
stoppping criteria.

[^Zimmermann2017]:
> Zimmermann, R.: _A matrix-algebraic algorithm for the Riemannian logarithm on the Stiefel manifold under the canoncial metric.
> SIAM Journal on Matrix Analysis and Applications 28(2), pp. 322-342, 2017.
> doi: [10.1137/16M1074485](https://doi.org/10.1137/16M1074485),
> arXiv: [1604.05054](https://arxiv.org/abs/1604.05054).
"""
log(::MetricManifold{ℝ,Stiefel{n,k,ℝ},CanonicalMetric}, ::Any...) where {n,k}

function log(
M::MetricManifold{ℝ,Stiefel{n,k,ℝ},CanonicalMetric},
p,
q;
maxiter=1e5,
tolerance=1e-9,
) where {n,k}
X = allocate_result(M, log, p, q)
log!(M, X, p, q; maxiter=maxiter, tolerance=tolerance)
return X
end

function log!(
::MetricManifold{ℝ,Stiefel{n,k,ℝ},CanonicalMetric},
kellertuer marked this conversation as resolved.
Show resolved Hide resolved
X,
p,
q;
maxiter=1e5,
tolerance=1e-9,
) where {n,k}
M = p' * q
QR = qr(q - p * M)
V = Matrix(qr([M; QR.R]).Q * I)
kellertuer marked this conversation as resolved.
Show resolved Hide resolved
Vcorner = @view V[(k + 1):(2 * k), (k + 1):(2k)] #bottom right corner
Vpcols = @view V[:, (k + 1):(2 * k)] #second half of the columns
S = svd(Vcorner) # preprocessing: Procrustes
mul!(Vpcols, Vpcols * S.U, S.V')
V[:, 1:k] .= [M; QR.R]
kellertuer marked this conversation as resolved.
Show resolved Hide resolved

LV = real.(log(V)) # this can be replaced by log_safe.
C = view(LV, (k + 1):(2 * k), (k + 1):(2 * k))
kellertuer marked this conversation as resolved.
Show resolved Hide resolved
expnC = exp(-C)
i = 0
while (i < maxiter) && (norm(C) > tolerance)
i = i + 1
LV .= real.(log(V))
expnC .= exp(-C)
copyto!(Vpcols, Vpcols * expnC)
kellertuer marked this conversation as resolved.
Show resolved Hide resolved
end
LV .= real.(LV)
kellertuer marked this conversation as resolved.
Show resolved Hide resolved
mul!(X, p, @view(LV[1:k, 1:k]))
# force the first - Q - to be the reduced form, not the full matrix
mul!(X, @view(QR.Q[:, 1:k]), @view(LV[(k + 1):(2 * k), 1:k]), true, true)
return X
end
Loading