Bayesian Optimization Library

A Python implementation of the Bayesian Optimization (BO) algorithm that works on decision spaces composed of real, integer, or categorical variables, or a mixture thereof.

Underpinned by surrogate models, BO iteratively proposes candidate solutions using the so-called acquisition function which balances exploration with exploitation, and updates the surrogate model with newly observed objective values. This algorithm is designed to optimize expensive black-box problems efficiently.

Installation

You can either install the stable version from PyPI:

pip install bayes-optim

Or take the latest version from GitHub:

git clone https://github.com/wangronin/Bayesian-Optimization.git
cd Bayesian-Optimization && python setup.py install --user

Example

For real-valued search variables, the simplest usage is via the fmin function:

from bayes_optim import fmin

def f(x):
  return sum(x ** 2)

minimum = fmin(f, [-5] * 2, [5] * 2, max_FEs=30, seed=42)

You can also exercise much finer control over most ingredients of BO, e.g., the surrogate model and the acquisition function. Please see the example below:

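import numpy as np
from bayes_optim import BO, ContinuousSpace
from bayes_optim.Surrogate import GaussianProcess

def fitness(x):                         # example objective (sphere function)
    return np.sum(np.asarray(x) ** 2)

dim = 5
lb, ub = -5, 5                          # bounds shared by all search variables
space = ContinuousSpace([lb, ub]) * dim  # create the search space

# hyperparameters of the GPR model
thetaL = 1e-10 * (ub - lb) * np.ones(dim)
thetaU = 10 * (ub - lb) * np.ones(dim)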
model = GaussianProcess(                # create the GPR model
  thetaL=thetaL, thetaU=thetaU
)

opt = BO(
    search_space=space,
    obj_fun=fitness,
    model=model,
    DoE_size=5,                         # number of initial sample points
    max_FEs=50,                         # maximal number of function evaluations
    verbose=True
)
opt.run()

For more detailed usage and examples, please check out our wiki page.

Features

This implementation differs from alternative packages/libraries in the following features:

  • Parallelization, also known as batch-sequential optimization, for which several different approaches are implemented here.
  • Moment-Generating Function of the Improvement (MGFI) [WvSEB17a] is a recently proposed acquisition function, which implicitly controls the exploration-exploitation trade-off.
  • Mixed-Integer Evolution Strategy for optimizing the acquisition function, which is enabled when the search space is a mixture of real, integer, and categorical variables.
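
To give a feel for how an evolution strategy can operate on a mixed search space, here is a minimal sketch of a mutation step that treats each variable type differently. It is a simplification under stated assumptions, not the library's implementation: the step sizes `sigma`, `p_int`, and `p_cat` are fixed (a full mixed-integer evolution strategy would self-adapt them), and the names `mutate` and `space` as well as the tuple-based space description are hypothetical.

```python
import math
import random


def _geometric(p, rng):
    """Number of failures before the first success (success probability p in (0, 1))."""
    return int(math.floor(math.log(1.0 - rng.random()) / math.log(1.0 - p)))


def mutate(x, space, rng, sigma=0.5, p_int=0.5, p_cat=0.2):
    """Return a mutated copy of the mixed-type point x.

    `space` holds one entry per variable (hypothetical format):
      ("real", lo, hi), ("int", lo, hi) or ("cat", [levels]).
    """
    child = []
    for value, spec in zip(x, space):
        kind = spec[0]
        if kind == "real":
            # Gaussian perturbation, clipped to the bounds
            lo, hi = spec[1], spec[2]
            value = min(max(value + sigma * rng.gauss(0.0, 1.0), lo), hi)
        elif kind == "int":
            # symmetric integer step: difference of two geometric variables
            lo, hi = spec[1], spec[2]
            step = _geometric(p_int, rng) - _geometric(p_int, rng)
            value = min(max(value + step, lo), hi)
        else:
            # categorical: occasionally resample uniformly from the levels
            if rng.random() < p_cat:
                value = rng.choice(spec[1])
        child.append(value)
    return child


space = [("real", -5.0, 5.0), ("int", 0, 10), ("cat", ["a", "b", "c"])]
rng = random.Random(42)
parent = [0.0, 3, "a"]
print(mutate(parent, space, rng))
```

Each call keeps the variable types intact, so the offspring can be passed directly to an acquisition function defined on the same mixed space.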

Project Structure

  • bayes_optim/SearchSpace.py: implementation of the search/decision space.
  • bayes_optim/base.py: the base class of Bayesian Optimization.
  • bayes_optim/AcquisitionFunction.py: the implementation of acquisition functions (see below for the list of implemented ones).
  • bayes_optim/Surrogate: we implemented the Gaussian Process Regression (GPR) and Random Forest (RF).
  • bayes_optim/BayesOpt.py contains several BO variants:
    • BO: noiseless + sequential
    • ParallelBO: noiseless + parallel (a.k.a. batch-sequential)
    • AnnealingBO: noiseless + parallel + annealing [WEB18]
    • SelfAdaptiveBO: noiseless + parallel + self-adaptive [WEB19]
    • NoisyBO: noisy + parallel
  • bayes_optim/Extension.py is meant to include the latest developments that are not extensively tested:
    • PCABO: noiseless + parallel + PCA-assisted dimensionality reduction [RaponiWBBD20] [Under Construction]
    • MultiAcquisitionBO: noiseless + parallelization with multiple different acquisition functions [Under Construction]

Acquisition Functions

The following infill-criteria are implemented in the library:

  • Expected Improvement (EI)
  • Probability of Improvement (PI)
  • Upper Confidence Bound (UCB)
  • Moment-Generating Function of Improvement (MGFI)
  • Generalized Expected Improvement (GEI) [Under Construction]

In sequential mode, Expected Improvement is used by default. In parallel mode, MGFI is used by default.
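
As a rough illustration of how such infill criteria are computed from a model's prediction, the sketch below evaluates EI, PI, and a confidence-bound criterion for a minimization problem. It assumes a Gaussian predictive distribution with mean `mean` and standard deviation `std`. The function names, the `kappa` parameter, and the sign conventions are assumptions of this example and do not mirror the library's acquisition function classes. For minimization, the confidence bound is written in its mirrored, lower-bound form.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, f_min):
    """EI for minimization: E[max(f_min - Y, 0)] with Y ~ N(mean, std^2)."""
    if std <= 0.0:
        return max(f_min - mean, 0.0)  # no uncertainty left at this point
    z = (f_min - mean) / std
    return (f_min - mean) * _cdf(z) + std * _pdf(z)


def probability_of_improvement(mean, std, f_min):
    """PI for minimization: Pr(Y < f_min) with Y ~ N(mean, std^2)."""
    if std <= 0.0:
        return 1.0 if mean < f_min else 0.0
    return _cdf((f_min - mean) / std)


def lower_confidence_bound(mean, std, kappa=2.0):
    """Optimistic bound for minimization; smaller values are more promising."""
    return mean - kappa * std


f_min = 1.0  # best objective value observed so far
for mean, std in [(0.8, 0.1), (1.2, 0.6)]:
    ei = expected_improvement(mean, std, f_min)
    pi = probability_of_improvement(mean, std, f_min)
    lcb = lower_confidence_bound(mean, std)
    print(f"mean={mean}, std={std}: EI={ei:.4f}, PI={pi:.4f}, LCB={lcb:.4f}")
```

The two candidates differ in how they earn their score: the first has a good predicted mean with little uncertainty, the second a worse mean but a large uncertainty. How strongly each criterion rewards the latter is what distinguishes exploration-heavy from exploitation-heavy choices.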

Surrogate Model

The surrogate (meta-)model is the statistical model that Bayesian optimization fits to the observed data. The basic requirement for such a model is that it provides an uncertainty quantification (either empirical or theoretical) for its predictions. To handle categorical data easily, a random forest model is used by default. The implementation here is based on the one in scikit-learn, with modifications to the uncertainty quantification.
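
To make the idea of an empirical uncertainty quantification concrete, the following sketch trains a small bagged ensemble and reports the spread of the members' predictions as the uncertainty estimate. It is a stand-in under stated assumptions, not the library's random forest: each member is a 1-nearest-neighbour predictor fitted on a bootstrap resample (instead of a regression tree), the input is one-dimensional, and the names `fit_ensemble` and `predict_with_uncertainty` are hypothetical.

```python
import random
import statistics


def fit_ensemble(X, y, n_members=50, seed=0):
    """Return bootstrap resamples of (X, y); each one acts as an ensemble member."""
    rng = random.Random(seed)
    n = len(X)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        members.append([(X[i], y[i]) for i in idx])
    return members


def _predict_member(member, x):
    """1-nearest-neighbour prediction (stand-in for a regression tree)."""
    _, value = min(member, key=lambda pair: abs(pair[0] - x))
    return value


def predict_with_uncertainty(members, x):
    """Mean prediction and empirical standard deviation across the members."""
    preds = [_predict_member(m, x) for m in members]
    return statistics.fmean(preds), statistics.pstdev(preds)


X = [-4.0, -2.0, 0.0, 1.0, 5.0]
y = [x ** 2 for x in X]
members = fit_ensemble(X, y)
mean, std = predict_with_uncertainty(members, 3.0)
print(f"prediction at x=3: {mean:.2f} +/- {std:.2f}")
```

Because the members disagree most where the training data are sparse, the standard deviation plays the role of the uncertainty that an acquisition function needs.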

A Brief Introduction to Bayesian Optimization

Bayesian Optimization (BO) [Moc74, JSW98] is a sequential optimization strategy originally proposed for single-objective black-box problems that are costly to evaluate. Here, we restrict our discussion to the single-objective case. BO typically starts by sampling an initial design of experiment (DoE) of size n, X = {x1, x2, ..., xn}, which is usually generated by simple random sampling, Latin Hypercube Sampling [SWN03], or a more sophisticated low-discrepancy sequence [Nie88] (e.g., Sobol sequences). Taking the initial DoE X and its corresponding objective values, Y = {f(x1), f(x2), ..., f(xn)} ⊆ ℝ, we proceed to construct a statistical model M describing the probability distribution of the objective function conditioned on the initial evidence, namely Pr(f | X, Y). In most application scenarios of BO, there is a lack of a priori knowledge about f, and therefore nonparametric models (e.g., Gaussian process regression or random forest) are commonly chosen for M. Such a model gives rise to a predictor f'(x) for every point x in the search space and an uncertainty quantification s'(x) that estimates, for instance, the mean squared error of the prediction, E[(f'(x) − f(x))²]. Based on f' and s', promising points can be identified via the so-called acquisition function, which balances exploitation with exploration in the optimization process.
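
The first step described above, generating the initial DoE, can be sketched with a plain Latin Hypercube Sampling routine. This is an illustrative standard-library version, not the sampler used by the library; the function name `latin_hypercube` and its arguments are assumptions of this example.

```python
import random


def latin_hypercube(n, lower, upper, seed=None):
    """Draw n points in the box [lower, upper] such that every dimension
    has exactly one point in each of its n equally wide strata."""
    rng = random.Random(seed)
    dim = len(lower)
    columns = []
    for d in range(dim):
        strata = list(range(n))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        width = (upper[d] - lower[d]) / n
        columns.append([lower[d] + (k + rng.random()) * width for k in strata])
    # transpose the per-dimension columns into a list of n points
    return [list(point) for point in zip(*columns)]


def f(x):
    """Example objective (sphere function)."""
    return sum(xi ** 2 for xi in x)


X = latin_hypercube(5, [-5.0, -5.0], [5.0, 5.0], seed=42)  # initial DoE
Y = [f(x) for x in X]                                      # objective values
best = min(range(len(X)), key=Y.__getitem__)
print("best initial point:", X[best], "value:", Y[best])
```

The pairs (X, Y) produced this way are the initial evidence on which the statistical model M is fitted before the first acquisition step.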

References

  • [Moc74] Jonas Mockus. "On bayesian methods for seeking the extremum". In Guri I. Marchuk, editor, Optimization Techniques, IFIP Technical Conference, Novosibirsk, USSR, July 1-7, 1974, volume 27 of Lecture Notes in Computer Science, pages 400–404. Springer, 1974.
  • [JSW98] Donald R. Jones, Matthias Schonlau, and William J. Welch. "Efficient global optimization of expensive black-box functions". J. Glob. Optim., 13(4):455–492, 1998.
  • [SWN03] Thomas J. Santner, Brian J. Williams, and William I. Notz. "The Design and Analysis of Computer Experiments". Springer series in statistics. Springer, 2003.
  • [Nie88] Harald Niederreiter. "Low-discrepancy and low-dispersion sequences". Journal of number theory, 30(1):51–70, 1988.
  • [WvSEB17a] Hao Wang, Bas van Stein, Michael Emmerich, and Thomas Bäck. "A New Acquisition Function for Bayesian Optimization Based on the Moment-Generating Function". In Systems, Man, and Cybernetics (SMC), 2017 IEEE International Conference on, pages 507–512. IEEE, 2017.
  • [WEB18] Hao Wang, Michael Emmerich, and Thomas Bäck. "Cooling Strategies for the Moment-Generating Function in Bayesian Global Optimization". In 2018 IEEE Congress on Evolutionary Computation, CEC 2018, Rio de Janeiro, Brazil, July 8-13, 2018, pages 1–8. IEEE, 2018.
  • [WEB19] Hao Wang, Michael Emmerich, and Thomas Bäck. "Towards self-adaptive efficient global optimization". In AIP Conference Proceedings, volume 2070, number 1, page 020056. AIP Publishing LLC, 2019.
  • [RaponiWBBD20] Elena Raponi, Hao Wang, Mariusz Bujny, Simonetta Boria, and Carola Doerr. "High Dimensional Bayesian Optimization Assisted by Principal Component Analysis". In International Conference on Parallel Problem Solving from Nature, pages 169–183. Springer, Cham, 2020.