-
Notifications
You must be signed in to change notification settings - Fork 0
Simulated Annealing for high dimensional log concave sampling
This project is about sampling from high dimensional log-concave distributions with No-U-Turn Hamiltonian Monte Carlo (HMC) Sampler. In particular, the contributor will adjust for sampling and implement the simulated annealing [1,2] in the case of a log-concave distribution restricted to a convex body. The contributor will have to implement a practical version of this scheme to address the problem of efficient sampling in practice.
Only volesti
package provides an open-source implementation for sampling from
multivariate distributions truncated by a polytope or a spectrahedron with HMC and NUTS HMC algorithms.
It employs several integrators for the ODE that HMC step requires with the most efficient being the
leapfrog integrator. However, when the log-concave distribution is extremely skinny
the mixing rate of those random walks could be arbitrarily small. The current implementation
can not efficiently handle those cases. This project will contribute to volesti
with
a new implementation of a practical simulated annealing scheme that would address these hard instances.
The contributor (a) will implement the simulated annealing [1,2] in C++ for the case of sampling, (b) will evaluate and
tune the implementation, (c) will report experimental evidence on the efficiency of the implementation, and (d) will
write brief documentation. The proposed programming language is C++ because the
implemented code will have to be added to volesti
package and GeomScale
project
in general. Of course, the implementation will be based on the current software of
volesti
, so the basic geometrical concepts (polytopes, boundary reflections, ODE solvers)
are ready to be used. A Matlab prototype will be given to the student.
Difficulty: Hard
- Required: C++
- Preferred: Experience with statistical or other mathematical software is a plus
This is a high-priority project for GeomScale
org.
It is expected to be very helpful to GeomScale
sampling software as it is about
efficient sampling from log-concave or general distributions which appear in many applications.
Moreover, it will be an important stepping stone towards high dimensional MCMC sampling
with open source software, addressing some important applications in biology and bayesian inference.
[1] Simulated Annealing for Convex Optimization, A. Kalai, S. Vempala
[2] Fast Algorithms for Logconcave Functions: Sampling, Rounding, Integration and Optimization, L. Lovasz, S. Vempala
Contributors, please contact both mentors below after completing at least one of the tests below.
-
Vissarion Fisikopoulos <vissarion.fisikopoulos at gmail.com> is an expert in mathematical software, computational geometry and optimization, and has previous GSOC mentoring experience with Boost C++ libraries (2016-2020) and the R-project (2017-2020).
-
Marios Papachristou < papachristoumarios at gmail.com > is a PhD student in the Computer Science Department at Cornell University. His primary research interests lie within the field of Data Science. He has previous experience in GSoC 2018 and 2020 as a student under Org. FOSS and GeomScale. He was GSoC's mentor in GSoC 2019 and 2020.
-
Elias Tsigaridas <elias.tsigaridas at inria.fr> is an expert in computational nonlinear algebra and geometry with experience in mathematical software. He has contributed to the implementation, in C and C++, of several solving algorithms for various open-source computer algebra libraries and has previous GSOC mentoring experience with the R-project (2019) and Geomscale (2020).
Contributors, please do one or more of the following tests before contacting the mentors above.
-
Easy: Download, compile and run a simple sampling example with both C++ and R interfaces of volesti. For example, you can sample points from a 100-dimensional cube using the C++ HMC algorithm implemented in
volesti
for various distributions. -
Medium: Add counters in C++ code of
volesti
to count the average number of reflections per HMC step. For different (a) integration time, (b) step length in ODE solver, (c) dimension, compute the PSRF/run-time and Effective Sample Size (ESS)/run-time for a large number of samples and report. -
Hard: Implement in C++ an approximate and an exact oracle for the gradient of the Spherical Gaussian distribution with unit variance and repeat the experiments requested by the
Medium test
. Compare the performances of two cases and report.
STUDENT 1 AGILAN S, https://github.com/Agi7an, https://github.com/Agi7an/VolEsti/blob/main/setup.r
STUDENT 2 Ioannis Iakovidis, https://github.com/iakoviid/, https://github.com/iakoviid/volesti/tree/gsoc2022
STUDENT 3 HUSSAIN LOHAWALA, https://github.com/H9660/Volesti-/blob/main/setup.r
STUDENT 4 Huu Phuoc Le, https://github.com/huuphuocle, https://github.com/huuphuocle/sampling_correlation_matrices