Skip to content

Monte Carlo integration

ARNAV edited this page Apr 5, 2021 · 8 revisions

Background

Integration is a fundamental problem in mathematics and computer science with many applications that span the whole spectrum of sciences and engineering. It appears, for example, in problems in statistics, biology, and economics, to name a few concrete application areas.

Related work

Related software package exist starting from SI and cubature R packages to C++ package latte-integrale. The first can scale to high dimensions but are limited to cubical domains (i.e. high-dimensional cubes) while the later is for more general convex domains (i.e. polytopes). Since latte-integrale is exact it cannot scale to high-dimensions (typically more than 15 dimensions). Therefore there is a need for efficient software that scales to high-dimensions for general convex domains. See also [1], [2] for the theory behind sampling and integration, and [3] for a recent development.

[1]. Ming-Hui Chen and Bruce W. Schmeiser. 1996. General Hit-and-Run Monte Carlo sampling for evaluating multidimensional integrals. Oper. Res. Lett. 19, 4 (October 1996), 161–169. DOI:https://doi.org/10.1016/0167-6377(96)00030-2

[2] L. Lovasz, S. Vempala, Fast Algorithms for Logconcave Functions: Sampling, Rounding, Integration and Optimization, 2006.

[3] V. Baldoni, N. Berline, J. A. De Loera, M. Koppe, M. Vergne, How to integrate a polynomial over a simplex, 2012.

Details of your coding project

VolEsti is a C++ package with an R interface that performs efficient high dimensional sampling and volume computation. It supports a variety of convex polytope representations and scales to high (i.e., a few hundred) dimensions.

The main purpose of this project is to extent VolEsti’s functionality to handle multi-dimensional integrals. We propose the split the project in the following parts:

  • Simple MC integration
  • Implement a more advanced MC integration algorithm.
  • Check-compare or implement well known MC integration algorithms e.g. MISER, VEGAS
  • Examine possible integrations with stan a state-of-the-art platform for statistical modelling and high-performance statistical computation.
  • Applications to weighted model integration
  • Write tests and documentation

Expected impact

GeomScale will enrich their portfolio of packages by adding support to multi-dimensional integration. Scientific communities will benefit from the use of this package. For example researchers in artificial intelligence have used a home-brew simple versions of the above algorithms for weighted model integration.

Students

Students should have a solid background in C++, algorithms and linear algebra. Knowledge of computational geometry, machine learning or statistical computing will be a plus.

Mentors

Students, please contact mentors below after completing at least one of the tests below.

  • Vissarion Fisikopoulos <vissarion.fisikopoulos at gmail.com> is an international expert in mathematical software, computational geometry and optimization, and has previous GSOC mentoring experience with Boost C++ libraries (2016-2017) and the R-project (2017).

  • Pedro Zuidberg Dos Martires <pedro.zuidbergdosmartires at cs.kuleuven.be> is an expert in probabilistic logic programming, knowledge compilation, symbolic inference and deep Bayesian modeling and contributor to various open source projects (volesti included).

  • Samuel Kolb <first name [dot] second name [at] kuleuven [dot] be> is a postdoctoral researcher who specializes in constraint learning and mixed discrete-continuous (hybrid) probabilistic inference for constrained densities. He co-developed several open source software packages in these fields, including a framework for hybrid probabilistic inference solvers.

Tests

Students, please do one or more of the following tests before contacting the mentors above.

  • Easy: compile and run volesti. Read the CRAN package documentation, generate a random H-polytope and compute its volume.
  • Medium: Use the R package cubature to compute the integral of f(x) = exp^{-a||x||^2} over the cube [-1,1]^n, for various values of a and dimension n until it crashes.
  • Hard: Use volesti to approximate the same integrals as in previous test by simple Monte Carlo based on uniform sampling and by Importance Sampling using multivariate spherical Gaussian. Comment on the accuracy and run-time.

Bonus: Generate a 100-dimensional random H-polytope compute the largest inscribed ball (Chebychev ball) and let the center be the x0. Compute the integral of f(x) = exp^{-a||x-x0||^2} over the polytope for various values of a, 20 times each with both uniform and Gaussian sampling and take the average. Report the standard deviation for each experiment.

IMPORTANT: For tips ask the mentors!

Solutions of tests

Students, please post a link to your test results here.