Add LKJ Matrix Distribution #108
Just defining …
Thanks @mohamed82008, I defined the following:

using Turing, Bijectors, LinearAlgebra, Random
Random.seed!(666)
# generate data
sigma = [1,2,3]
Omega = [1 0.3 0.2;
0.3 1 0.1;
0.2 0.1 1]
Sigma = diagm(sigma) * Omega * diagm(sigma)
N = 100
J = 3
y = rand(MvNormal(zeros(J), Sigma), N)'
# model
@model correlation(J, N, y, Zero) = begin
    sigma ~ filldist(truncated(Cauchy(0., 5.), 0., Inf), J) # prior on the standard deviations
    Omega ~ LKJ(J, 1) # LKJ prior on the correlation matrix
    # covariance matrix
    Sigma = diagm(sigma) * Omega * diagm(sigma)
    for i in 1:N
        y[i,:] ~ MvNormal(Zero, Sigma) # sampling distribution of the observations
    end
end
Bijectors.bijector(::LKJ) = PDBijector()
# attempt to recover parameters
chain = sample(correlation(J, N, y, zeros(J)), NUTS(), 1000)

And the error: …

Perhaps I misspecified something here?
It seems like the issue is with your covariance matrix Sigma. Adding some @info statements to the model:

julia> # model
@model correlation(J, N, y, Zero) = begin
    sigma ~ filldist(truncated(Cauchy(0., 5.), 0., Inf), J) # prior on the standard deviations
    Omega ~ LKJ(J, 1) # LKJ prior on the correlation matrix
    @info isposdef(Omega)
    # covariance matrix
    Sigma = diagm(sigma) * Omega * diagm(sigma)
    @info Sigma
    @info isposdef(Sigma)
    for i in 1:N
        y[i,:] ~ MvNormal(Zero, Sigma) # sampling distribution of the observations
    end
end
DynamicPPL.ModelGen{var"###generator#300",(:J, :N, :y, :Zero),(),Tuple{}}(##generator#300, NamedTuple())
julia>
julia> m = correlation(J, N, y, zeros(J));
julia> m()
[ Info: true
[ Info: [2.43118945431723 0.8857326262243734 -0.38161465118558274; 0.8857326262243734 0.496164416668245 -0.2301838107162483; -0.38161465118558274 -0.2301838107162483 0.28572768840562973]
[ Info: true
julia> m()
[ Info: true
[ Info: [289.05898695571943 -3.900716770274498 -45.83646433466277; -3.900716770274498 0.3260931391923026 2.375071240046279; -45.83646433466277 2.3750712400462786 74.5652927508229]
[ Info: false
ERROR: PosDefException: matrix is not Hermitian; Cholesky factorization failed.

EDIT: haha, never mind, I'm stupid 🙃 The above is relevant, but I deleted the parts of my comment that were just me brain-farting like crazy.
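For context, here is my own minimal illustration (not from the thread) of how this particular exception arises: cholesky refuses a matrix that is not bit-for-bit symmetric, even when it is symmetric "on paper", while wrapping it in Symmetric (as done later in the thread) sidesteps the check by only reading the upper triangle. The example matrix and perturbation below are made up.

```julia
using LinearAlgebra

A = [1.0          0.3;
     0.3 + 1e-15  1.0]   # symmetric on paper, but not bit-for-bit symmetric

ishermitian(A)           # false
# cholesky(A)            # throws: PosDefException: matrix is not Hermitian
cholesky(Symmetric(A))   # fine: only the upper triangle is used
```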
Thanks for checking out my example @torfjelde! So the corresponding Stan model is:

data {
  int<lower=1> N; // number of observations
  int<lower=1> J; // dimension of observations
  vector[J] y[N]; // observations
  vector[J] Zero; // a vector of zeros (fixed means of observations)
}
parameters {
  corr_matrix[J] Omega;
  vector<lower=0>[J] sigma;
}
transformed parameters {
  cov_matrix[J] Sigma;
  Sigma <- quad_form_diag(Omega, sigma);
}
model {
  y ~ multi_normal(Zero, Sigma); // sampling distribution of the observations
  sigma ~ cauchy(0, 5); // prior on the standard deviations
  Omega ~ lkj_corr(1); // LKJ prior on the correlation matrix
}

The Stan docs make it seem pretty straightforward, but perhaps there's more going on than I realize.
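As a side note of my own (not part of the original comment): Stan's quad_form_diag(Omega, sigma) is just diag(sigma) * Omega * diag(sigma), so the Julia model above builds the covariance matrix the same way. The helper name and example values below are mine, purely for illustration.

```julia
using LinearAlgebra

# Hypothetical helper mirroring Stan's quad_form_diag
quad_form_diag(Omega, sigma) = Diagonal(sigma) * Omega * Diagonal(sigma)

quad_form_diag([1.0 0.3; 0.3 1.0], [1.0, 2.0])
# 2×2 Matrix{Float64}:
#  1.0  0.6
#  0.6  4.0
```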
Completely unrelated, but I think you should use Diagonal(sigma) instead of diagm(sigma):

julia> using BenchmarkTools, LinearAlgebra
julia> f(v) = Diagonal(v)
julia> g(v) = diagm(v)
julia> @btime f($(rand(100)));
  4.866 ns (1 allocation: 16 bytes)

julia> @btime g($(rand(100)));
  5.854 μs (3 allocations: 78.23 KiB)
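To spell out why the difference is so large (my own note, not from the thread): Diagonal is a lazy wrapper that stores only the diagonal and has specialized multiplication methods, while diagm allocates a dense n×n matrix that is mostly zeros. A small sketch, with sizes assumed for a length-100 Float64 vector:

```julia
using LinearAlgebra

v = rand(100)
D = Diagonal(v)   # stores just the 100 diagonal entries
B = diagm(v)      # allocates a dense 100×100 matrix

sizeof(D.diag)    # 800 bytes
sizeof(B)         # 80_000 bytes

A = rand(100, 100)
D * A             # scales the rows of A without ever materializing a dense D
```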
These issues with positive definiteness are nasty: even if a matrix is guaranteed to be positive (semi-)definite mathematically, numerical issues can easily make it fail to be positive (semi-)definite numerically (I've run into this problem multiple times when parameterizing such matrices).
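For what it's worth, here is a minimal sketch of the usual workarounds when a mathematically positive definite matrix fails the numerical check: force exact symmetry and, if that is not enough, add a tiny "jitter" to the diagonal. This is my own illustration, not from the thread; the function name, jitter value, and retry count are assumptions.

```julia
using LinearAlgebra

# Hypothetical helper: symmetrize, then nudge the diagonal until Cholesky succeeds.
function nearly_posdef(A::AbstractMatrix; jitter=1e-10, maxtries=10)
    S = Symmetric((A + A') / 2)                # remove any floating-point asymmetry
    for _ in 1:maxtries
        isposdef(S) && return S
        S = Symmetric(Matrix(S) + jitter * I)  # push eigenvalues away from zero
        jitter *= 10
    end
    return S
end
```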
@devmotion to the rescue!

julia> # model
@model correlation(J, N, y, Zero) = begin
    sigma ~ filldist(truncated(Cauchy(0., 5.), 0., Inf), J) # prior on the standard deviations
    Omega ~ LKJ(J, 1) # LKJ prior on the correlation matrix
    @info sigma
    @info Omega
    @info isposdef(Omega)
    # covariance matrix
    Sigma = Symmetric(Diagonal(sigma) * Omega * Diagonal(sigma))
    @info Sigma
    @info isposdef(Sigma)
    for i in 1:N
        y[i,:] ~ MvNormal(Zero, Sigma) # sampling distribution of the observations
    end
end
DynamicPPL.ModelGen{var"###generator#348",(:J, :N, :y, :Zero),(),Tuple{}}(##generator#348, NamedTuple())
julia> m = correlation(J, N, y, zeros(J));
julia> m()
[ Info: [6.554761110166019, 2.34510326436457, 0.6888832327192145]
[ Info: [1.0 0.47103043693813107 0.5681366777033788; 0.47103043693813107 1.0 -0.41754820600773; 0.5681366777033788 -0.41754820600773 1.0]
[ Info: true
[ Info: [42.96489321134486 7.24048754385414 2.5654012966083335; 7.24048754385414 5.499509320533363 -0.674550094605337; 2.5654012966083335 -0.674550094605337 0.4745601083216754]
[ Info: true
julia> m()
[ Info: [39.64435704629228, 0.9699293307273142, 3.1448223201647334]
[ Info: [1.0 0.051875690114511874 -0.052259251243831094; 0.051875690114511874 1.0 0.8063928903464262; -0.052259251243831094 0.8063928903464262 1.0]
[ Info: true
[ Info: [1571.675045613904 1.9947356925964466 -6.515393871749325; 1.9947356925964466 0.9407629066051357 2.4597042749565188; -6.515393871749325 2.4597042749565188 9.889907425406298]
[ Info: true
julia> m()
[ Info: [0.8388798227529882, 3.3189729374771266, 8.826661070228887]
[ Info: [1.0 0.7104582898219662 -0.4776195773069013; 0.7104582898219662 1.0 -0.07252323132333671; -0.4776195773069013 -0.07252323132333671 1.0]
[ Info: true
[ Info: [0.703719357022085 1.978071774380738 -3.536537920990547; 1.978071774380738 11.015581359705546 -2.124600640530144; -3.536537920990547 -2.124600640530144 77.90994564869416]
[ Info: true
julia> m()
[ Info: [2.052493460163989, 18.75614790558376, 3.1610610751936297]
[ Info: [1.0 -0.2180845855069199 -0.42228217488217845; -0.2180845855069199 1.0 0.09044611498906638; -0.42228217488217845 0.09044611498906638 1.0]
[ Info: true
[ Info: [4.212729404015944 -8.395574136610355 -2.73979089842532; -8.395574136610355 351.793084256134 5.362489474229928; -2.73979089842532 5.362489474229928 9.992307121104306]
[ Info: true
julia> m()
[ Info: [671.1873925742121, 0.3471925977453725, 12.611536505760812]
[ Info: [1.0 0.6777757171051204 -0.10065195388769885; 0.6777757171051204 1.0 0.6147688385377347; -0.10065195388769885 0.6147688385377347 1.0]
[ Info: true
[ Info: [450492.51595056953 157.94295267110346 -851.9890272445987; 157.94295267110346 0.12054269992918003 2.6918465834085405; -851.9890272445987 2.6918465834085405 159.05085303613762]
[ Info: true
julia> m()
[ Info: [0.8079920565082861, 4.041301143186047, 38.36495641353685]
[ Info: [1.0 0.35915995608688434 -0.19698184713779998; 0.35915995608688434 1.0 0.5033179678024089; -0.19698184713779998 0.5033179678024089 1.0]
[ Info: true
[ Info: [0.6528511633804894 1.1727790914573786 -6.1061575530419185; 1.1727790914573786 16.332114929916848 78.03660324156078; -6.1061575530419185 78.03660324156078 1471.8698806125824]
[ Info: true

I'm assuming that the reason why this works is that there exists a more numerically stable method for symmetric matrices, and that by wrapping the matrix in Symmetric that method gets used.
Unfortunately, the simple (and slightly inefficient) reason for it is that in https://github.com/JuliaStats/PDMats.jl/blob/00804c3ca96a0839c03d25782a51028fe96fa725/src/pdmat.jl#L20 a new matrix is allocated in which the upper triangle is simply mirrored to the lower one. Hence, if there was any numerical discrepancy between the two triangles, it is gone afterwards. That's also the reason why it doesn't always fix the issues (in my experience).
Maybe you could avoid that by using something like

using PDMats, LinearAlgebra

...
_Sigma = Symmetric(Diagonal(sigma) * Omega * Diagonal(sigma))
Sigma = PDMat(_Sigma, cholesky(_Sigma))
...

since it seems there exists an implementation of …
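Outside of a Turing model (so with plain Float64 matrices and no automatic differentiation involved), the suggestion would read roughly like the sketch below. This is my own hedged illustration, not code from the thread; the numeric values are made up, and I convert to a plain Matrix before constructing the PDMat as a precaution so the matrix and factorization types line up.

```julia
using Distributions, LinearAlgebra, PDMats

sigma = [1.0, 2.0, 3.0]
Omega = [1.0 0.3 0.2;
         0.3 1.0 0.1;
         0.2 0.1 1.0]

_Sigma = Symmetric(Diagonal(sigma) * Omega * Diagonal(sigma))
Sigma  = PDMat(Matrix(_Sigma), cholesky(_Sigma))  # store the matrix with its factorization

d = MvNormal(zeros(3), Sigma)  # MvNormal accepts an AbstractPDMat covariance
rand(d, 5)
```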
Thanks for all the tips, I tried out your suggestion with the … If I removed the second argument in the PDMat call:

using Turing, Distributions, LinearAlgebra, Random, Bijectors, PDMats
Bijectors.bijector(d::LKJ) = Bijectors.PDBijector()
Random.seed!(666)
# generate data
sigma = [1,2,3]
Omega = [1 0.3 0.2;
0.3 1 0.1;
0.2 0.1 1]
Sigma = Diagonal(sigma) * Omega * Diagonal(sigma)
N = 100
J = 3
y = rand(MvNormal(zeros(J), Sigma), N)'
# model
@model correlation(J, N, y, Zero) = begin
    sigma ~ filldist(truncated(Cauchy(0., 5.), 0., Inf), J) # prior on the standard deviations
    Omega ~ LKJ(J, 1) # LKJ prior on the correlation matrix
    _Sigma = Symmetric(Diagonal(sigma) * Omega * Diagonal(sigma))
    Sigma = PDMat(_Sigma)
    for i in 1:N
        y[i,:] ~ MvNormal(Zero, Sigma) # sampling distribution of the observations
    end
    return Sigma
end
chain = sample(correlation(J, N, y, zeros(J)), HMC(0.01, 5), 1000)
chain = sample(correlation(J, N, y, zeros(J)), NUTS(), 1000)
The …
Ah, then probably that's the reason why PDMats doesn't use …

BTW, probably you should also remove Zero.
Good point on the Zero vector:

using Turing, Distributions, LinearAlgebra, Random, Bijectors
Bijectors.bijector(d::LKJ) = Bijectors.PDBijector()
Random.seed!(666)
# generate data
sigma = [1,2,3]
Omega = [1 0.3 0.2;
0.3 1 0.1;
0.2 0.1 1]
Sigma = Diagonal(sigma) * Omega * Diagonal(sigma)
N = 100
J = 3
y = rand(MvNormal(Sigma), N)'
# model
@model correlation(J, N, y) = begin
    sigma ~ filldist(truncated(Cauchy(0., 5.), 0., Inf), J) # prior on the standard deviations
    Omega ~ LKJ(J, 1) # LKJ prior on the correlation matrix
    Sigma = Symmetric(Diagonal(sigma) * Omega * Diagonal(sigma))
    for i in 1:N
        y[i,:] ~ MvNormal(Sigma) # sampling distribution of the observations
    end
    return Sigma
end
chain = sample(correlation(J, N, y), HMC(0.01, 5), 1000)

With …

With …

But sometimes when I sample with …
I had originally opened an issue on another repo, where @trappmartin ended up giving me some advice on this particular issue with the LKJ prior. On the other issue Martin ended up restructuring the model specification like the following:

@model correlation(J, N, y) = begin
    sigma ~ filldist(truncated(Cauchy(0., 5.), 0., Inf), J) # prior on the standard deviations
    Omega ~ LKJ(J, 1) # LKJ prior on the correlation matrix
    L = Diagonal(sigma) * Omega
    for i in 1:N
        y[i,:] ~ MvNormal(L*L') # sampling distribution of the observations
    end
    return L*L'
end

However, even with this I'm still seeing issues with sampling the posterior distribution. Here's an example of the data sampled from the prior for this model:

sample(correlation(J, N, y), Prior(), 2000)
So the prior samples look good with the …

sample(correlation(J, N, y), HMC(0.01, 5), 2000)
Things seem to become much less stable when using …

I was hoping that there was a simple solution for using the LKJ prior here.
If there's anything you'd like me to test and report back on, I'm definitely willing to help.
Just to add to this, I tried fitting a hierarchical linear model with a multivariate prior using the LKJ distribution.

As in @joshualeond's case, sampling from the prior works fine, but NUTS() goes nuts :) with exploding numbers and a bunch of rejected proposals due to numerical errors.
I can't see why PDBijector should work for LKJ. As you may notice, a sample from LKJ is a correlation matrix (unit diagonal), and PDBijector is a transform for general positive definite matrices. It's the result of the LKJ density: p(Omega | eta) ∝ det(Omega)^(eta - 1), which is defined only over correlation matrices, so the unconstrained dimension and the log-determinant correction used by PDBijector don't match it. Obviously, this mismatch doesn't show up when you only draw from the prior. The illusion that it works breaks down once NUTS relies on the transformed log density and its gradient. So we must define a bijector dedicated for LKJ.
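To make the idea of a dedicated correlation-matrix transform concrete, here is a hedged sketch of my own (not the eventual Bijectors.jl implementation): the kind of construction Stan's correlation-matrix transform uses. An unconstrained vector of length K(K-1)/2 is mapped through tanh to canonical partial correlations in (-1, 1) and assembled into a lower-triangular factor with unit-norm rows, so the product is a valid correlation matrix. The function name is hypothetical.

```julia
using LinearAlgebra

function corr_from_unconstrained(y::AbstractVector, K::Integer)
    @assert length(y) == K * (K - 1) ÷ 2
    z = tanh.(y)                       # canonical partial correlations in (-1, 1)
    w = zeros(eltype(z), K, K)         # lower-triangular factor
    w[1, 1] = 1
    idx = 1
    for i in 2:K
        remaining = one(eltype(z))     # 1 minus the squared norm of row i so far
        for j in 1:(i - 1)
            w[i, j] = z[idx] * sqrt(remaining)
            remaining -= w[i, j]^2
            idx += 1
        end
        w[i, i] = sqrt(remaining)      # ensures row i has unit norm
    end
    return w * w'                      # correlation matrix
end

Omega = corr_from_unconstrained(randn(3), 3)   # random 3×3 correlation matrix
diag(Omega)                                    # ≈ [1.0, 1.0, 1.0]
isposdef(Symmetric(Omega))                     # true
```

The inverse of this map and its log-determinant Jacobian are what a proper LKJ bijector would also need, so that the density adjustment matches the LKJ support exactly.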
@yiyuezhuo There's another Julia repo outside of the Turing org that may have some helpful code for reference: https://github.com/tpapp/TransformVariables.jl. Specifically the …
His implementation looks fine, but doesn't support …
I'm currently trying to build a model in Turing that utilizes the newly added LKJ distribution from Distributions.jl. I ran into issues when I attempted to sample the model:
It appears that the LKJ distribution maybe needs to be added here in Bijectors: Bijectors.jl/src/Bijectors.jl, line 202 (at commit 1f3b581).
However, after I simply added LKJ to this line I hit the following error: …

So it looks like there needs to be a new getlogp method in Bijectors for the LKJ. I see that these getlogp methods follow closely the logkernel methods in Distributions.jl, but I am unsure what needs to be adjusted to make sure it works nicely with Turing.