libdogleg - A general purpose optimizer to solve data fitting problems
If you're considering using this library for new projects, please look at the ceres solver instead. libdogleg is not deprecated, and I will fix bugs as they're reported. However, ceres is more feature-rich and much more widely used. I consider ceres to be a superset of libdogleg, and if ceres had been available when I wrote libdogleg, I would not have written it.
This is a library for solving large-scale nonlinear optimization problems. By employing sparse linear algebra, it is tailored for problems that have weak coupling between the optimization variables. For appropriately sparse problems this results in massive performance gains.
For smaller problems with dense Jacobians a dense mode is also available. It runs the same optimization loop as the sparse code, but uses dense linear algebra.
The main task of this library is to find the vector p that minimizes

norm2( x )

where x = f(p) is a vector that has higher dimensionality than p. The user passes in a callback function (of type dogleg_callback_t) that takes in the vector p and returns the vector x and a matrix of derivatives J = df/dp. J has a row for each element of f and a column for each element of p. If J is a sparse matrix, this library can take advantage of that, which results in substantial gains in computational efficiency if most entries of J are 0. J is stored row-first by the callback routine. libdogleg uses a column-first data representation internally, so it references the transpose of J (called Jt). J stored row-first is identical to Jt stored column-first; this is purely a naming choice.
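As a concrete illustration of this storage convention, here's a hedged sketch (the helper name is hypothetical, not part of the libdogleg API):

#include <stddef.h>

/* Hypothetical helper: index of dx[i]/dp[j] in a dense J stored
   row-first. The same buffer, read column-first, is Jt: the two
   views are the same bytes, so no transposition ever happens. */
static size_t J_index(size_t i_meas, size_t j_state, size_t Nstate)
{
    return i_meas*Nstate + j_state;
}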
This library implements Powell's dog-leg algorithm to solve the problem. Like the more widely known Levenberg-Marquardt algorithm, Powell's dog-leg algorithm solves a nonlinear optimization problem by interpolating between a Gauss-Newton step and a gradient-descent step. Improvements over LM are:

- a more natural representation of the linearity of the operating point (a trust-region size instead of a vague lambda term)
- significant efficiency gains, since a matrix inversion isn't needed to retry a rejected step
The algorithm is described in many places, originally in
M. Powell. A Hybrid Method for Nonlinear Equations. In P. Rabinowitz, editor, Numerical Methods for Nonlinear Algebraic Equations, pages 87-144. Gordon and Breach Science, London, 1970.
Various enhancements to Powell's original method are described in the literature; at this time only the original algorithm is implemented here.
The sparse matrix algebra is handled by the CHOLMOD library, written by Tim Davis. Parts of CHOLMOD are licensed under the GPL and parts under the LGPL. Only the LGPL pieces are used here, allowing libdogleg to be licensed under the LGPL as well. Because of this I lose some convenience (all simple sparse matrix arithmetic in CHOLMOD is GPL-ed) and some performance (the fancier computational methods, such as supernodal analysis, are GPL-ed). For my current applications the performance losses are minor.
This is the main call to the library for sparse Jacobians. It's declared as
double dogleg_optimize2(double* p, unsigned int Nstate,
unsigned int Nmeas, unsigned int NJnnz,
dogleg_callback_t* f, void* cookie,
const dogleg_parameters2_t* parameters,
dogleg_solverContext_t** returnContext);
- p is the initial estimate of the state vector (and holds the final result)
- Nstate specifies the number of optimization variables (elements of p)
- Nmeas specifies the number of measurements (elements of x). Nmeas >= Nstate is a requirement
- NJnnz specifies the number of non-zero elements of the Jacobian matrix df/dp. In a dense matrix NJnnz = Nstate*Nmeas. We are dealing with sparse Jacobians, so NJnnz should be far less. If not, libdogleg is not an appropriate routine to solve this problem
- f specifies the callback function that the optimization routine calls to sample the problem being solved
- cookie is an arbitrary data pointer passed untouched to f
- parameters is a pointer to the set of parameters to use. Set to NULL to use the global parameters, or call dogleg_optimize instead. See the Parameters section below for more details
- returnContext, if not NULL, can be used to retrieve the full context structure from the solver. This can be useful since this structure contains the latest operating point values. It also has an active cholmod_common structure, which can be reused if more CHOLMOD routines need to be called externally. You usually want returnContext->beforeStep. If this data is requested, the user is required to free it with dogleg_freeContext when done
dogleg_optimize2 returns norm2( x ) at the minimum, or a negative value if an error occurred.
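A minimal calling sketch, assuming a hypothetical line-fit problem with 2 state variables, 20 measurements and 2 non-zero Jacobian entries per measurement (the callback fitLineCallback is sketched in the callback section below):

#include <dogleg.h>

extern dogleg_callback_t fitLineCallback; /* hypothetical; see below */

int main(void)
{
    double p[2] = {1.0, 0.0}; /* initial estimate; holds the result */
    double norm2x =
        dogleg_optimize2(p,
                         2,                /* Nstate                    */
                         20,               /* Nmeas                     */
                         2*20,             /* NJnnz                     */
                         &fitLineCallback,
                         NULL,             /* cookie                    */
                         NULL,             /* parameters: use globals   */
                         NULL);            /* returnContext: not needed */
    return norm2x < 0.0; /* negative return value indicates an error */
}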
This is a flavor of dogleg_optimize2 that implicitly uses the global parameters. It's declared as
double dogleg_optimize(double* p, unsigned int Nstate,
unsigned int Nmeas, unsigned int NJnnz,
dogleg_callback_t* f, void* cookie,
dogleg_solverContext_t** returnContext);
This is the main call to the library for dense Jacobians. It's declared as
double dogleg_optimize_dense2(double* p, unsigned int Nstate,
unsigned int Nmeas,
dogleg_callback_dense_t* f, void* cookie,
const dogleg_parameters2_t* parameters,
dogleg_solverContext_t** returnContext);
The arguments are almost identical to those in the dogleg_optimize2 call; only NJnnz is missing, since it isn't needed for dense Jacobians.
- p is the initial estimate of the state vector (and holds the final result)
- Nstate specifies the number of optimization variables (elements of p)
- Nmeas specifies the number of measurements (elements of x). Nmeas >= Nstate is a requirement
- f specifies the callback function that the optimization routine calls to sample the problem being solved. Note that this callback has a different type from the one in dogleg_optimize
- cookie is an arbitrary data pointer passed untouched to f
- parameters is a pointer to the set of parameters to use. Set to NULL to use the global parameters, or call dogleg_optimize_dense instead. See the Parameters section below for more details
- returnContext, if not NULL, can be used to retrieve the full context structure from the solver. This can be useful since this structure contains the latest operating point values. You usually want returnContext->beforeStep. If this data is requested, the user is required to free it with dogleg_freeContext when done

dogleg_optimize_dense2 returns norm2( x ) at the minimum, or a negative value if an error occurred.
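The dense flavor is invoked the same way, minus NJnnz. A sketch, using the hypothetical dense callback from the dense-callback section below:

#include <dogleg.h>

extern dogleg_callback_dense_t fitLineCallbackDense; /* hypothetical */

static double solveDense(void)
{
    double p[2] = {1.0, 0.0}; /* initial estimate; holds the result */
    return dogleg_optimize_dense2(p,
                                  2,     /* Nstate                    */
                                  20,    /* Nmeas                     */
                                  &fitLineCallbackDense,
                                  NULL,  /* cookie                    */
                                  NULL,  /* parameters: use globals   */
                                  NULL); /* returnContext: not needed */
}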
This is a flavor of dogleg_optimize_dense2 that implicitly uses the global parameters. It's declared as
double dogleg_optimize_dense(double* p, unsigned int Nstate,
unsigned int Nmeas,
dogleg_callback_dense_t* f, void* cookie,
dogleg_solverContext_t** returnContext);
Used to deallocate memory used for an optimization cycle. Defined as:
void dogleg_freeContext(dogleg_solverContext_t** ctx);
If a pointer to a context is not requested (by passing returnContext = NULL to dogleg_optimize), libdogleg calls this routine automatically. If the user did retrieve this pointer, though, it must be freed with dogleg_freeContext when the user is finished.
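The expected usage pattern, sketched with the same hypothetical callback as above:

#include <dogleg.h>

extern dogleg_callback_t fitLineCallback; /* hypothetical */

static void solveAndInspect(void)
{
    double p[2] = {1.0, 0.0};
    dogleg_solverContext_t* ctx;

    dogleg_optimize(p, 2, 20, 2*20, &fitLineCallback, NULL, &ctx);

    /* ... analyze ctx->beforeStep here ... */

    /* we asked for the context, so we must free it */
    dogleg_freeContext(&ctx);
}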
Computes the Cholesky decomposition of JtJ. This function is exposed only in case you need to touch libdogleg internals via returnContext: sometimes after computing an optimization you want to do stuff with the factorization of JtJ, and this call ensures that the factorization is available. Most people don't need this function. If the comment wasn't clear: you don't need this function.
This is declared as
void dogleg_computeJtJfactorization(dogleg_operatingPoint_t* point,
dogleg_solverContext_t* ctx);
The arguments are

- point is the operating point of the system. Generally this will be returnContext->beforeStep, where returnContext is from one of the dogleg_optimize_... functions
- ctx is the dogleg context. Generally this will be returnContext from one of the dogleg_optimize_... functions
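For the few who do need it, a sketch (ctx is assumed to have come back through returnContext):

#include <dogleg.h>

static void factorizeAtOptimum(dogleg_solverContext_t* ctx)
{
    /* make sure the factorization of JtJ at the optimum is available */
    dogleg_computeJtJfactorization(ctx->beforeStep, ctx);

    /* ctx->factorization (sparse) or ctx->factorization_dense (dense)
       can now be used; see the context structure below */
}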
libdogleg requires the user to compute the Jacobian matrix J. This is a performance optimization, since J could instead be computed from differences of x. The optimization is often worth the extra effort, but it creates the possibility that the computed J contains a mistake, so that J = df/dp does not actually hold. To find these types of issues, the user can call
void dogleg_testGradient(unsigned int var, const double* p0,
unsigned int Nstate, unsigned int Nmeas, unsigned int NJnnz,
dogleg_callback_t* f, void* cookie);
This function computes the Jacobian with central differences and compares the result with the Jacobian computed by the callback function. It is recommended to do this for every variable while developing the program that uses libdogleg; a sketch of such a loop appears after this list.
- var is the index of the variable being tested
- p0 is the state vector p where we're evaluating the Jacobian
- Nstate, Nmeas, NJnnz are the number of state variables, measurements and non-zero Jacobian elements, as before
- f is the callback function, as before
- cookie is the user data, as before
This function returns nothing, but prints out the test results.
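A sketch of the recommended development-time check, testing every variable in turn (fitLineCallback and the problem sizes are the hypothetical ones used throughout; the dense variant below works the same way):

#include <dogleg.h>

extern dogleg_callback_t fitLineCallback; /* hypothetical */

static void checkAllGradients(const double* p0)
{
    unsigned int Nstate = 2, Nmeas = 20, NJnnz = 2*20;
    for(unsigned int i = 0; i < Nstate; i++)
        dogleg_testGradient(i, p0, Nstate, Nmeas, NJnnz,
                            &fitLineCallback, NULL);
}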
Very similar to dogleg_testGradient, but for dense Jacobians.
void dogleg_testGradient_dense(unsigned int var, const double* p0,
unsigned int Nstate, unsigned int Nmeas,
dogleg_callback_dense_t* f, void* cookie);
This function computes the Jacobian with central differences and compares the result with the Jacobian computed by the callback function. As before, it is recommended to do this for every variable while developing the program that uses libdogleg.
- var is the index of the variable being tested
- p0 is the state vector p where we're evaluating the Jacobian
- Nstate, Nmeas are the number of state variables and measurements, as before
- f is the callback function, as before
- cookie is the user data, as before
This function returns nothing, but prints out the test results.
The main user callback that specifies the sparse optimization problem has type
typedef void (dogleg_callback_t)(const double* p,
double* x,
cholmod_sparse* Jt,
void* cookie);
- p is the current state vector
- x is the resulting f(p)
- Jt is the transpose of df/dp at p. As mentioned previously, Jt is stored column-first by CHOLMOD, which can be interpreted as storing J row-first by the user-defined callback routine
- cookie is the user-defined arbitrary data passed into dogleg_optimize
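For illustration, here's a sketch of a sparse callback for a hypothetical problem: fitting a line y = m*t + b to data, with state p = (m, b) and measurement i defined as x[i] = m*t[i] + b - y[i], so row i of J is ( t[i]  1 ). It assumes CHOLMOD's int-indexed matrices, and the data arrays are assumed to be filled elsewhere:

#include <dogleg.h>

#define NMEAS 20
static double t[NMEAS], y[NMEAS]; /* hypothetical data, filled elsewhere */

void fitLineCallback(const double* p, double* x,
                     cholmod_sparse* Jt, void* cookie)
{
    (void)cookie;

    /* Jt stored column-first is J stored row-first, so Jt->p holds the
       start of each measurement's row of J, and Jt->i holds the state
       variable index of each non-zero entry */
    int*    Jrowptr = (int*)   Jt->p;
    int*    Jcolidx = (int*)   Jt->i;
    double* Jval    = (double*)Jt->x;

    int iJ = 0;
    for(int i = 0; i < NMEAS; i++)
    {
        x[i] = p[0]*t[i] + p[1] - y[i];

        Jrowptr[i]  = iJ;
        Jcolidx[iJ] = 0; Jval[iJ] = t[i]; iJ++; /* dx[i]/dm */
        Jcolidx[iJ] = 1; Jval[iJ] = 1.0;  iJ++; /* dx[i]/db */
    }
    Jrowptr[NMEAS] = iJ; /* must end up equal to NJnnz */
}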
The main user callback that specifies the dense optimization problem has type
typedef void (dogleg_callback_dense_t)(const double* p,
double* x,
double* J,
void* cookie);
- p is the current state vector
- x is the resulting f(p)
- J is df/dp at p. J is stored row-first: all the derivatives for the first measurement, then all the derivatives for the second measurement, and so on
- cookie is the user-defined arbitrary data passed into dogleg_optimize_dense
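The same hypothetical line fit with a dense Jacobian looks like this; row i of J is ( t[i]  1 ), stored row-first:

#include <dogleg.h>

#define NMEAS 20
static double t[NMEAS], y[NMEAS]; /* hypothetical data, filled elsewhere */

void fitLineCallbackDense(const double* p, double* x,
                          double* J, void* cookie)
{
    (void)cookie;
    const int Nstate = 2;

    for(int i = 0; i < NMEAS; i++)
    {
        x[i] = p[0]*t[i] + p[1] - y[i];

        /* all derivatives for measurement i, stored contiguously */
        J[i*Nstate + 0] = t[i]; /* dx[i]/dm */
        J[i*Nstate + 1] = 1.0;  /* dx[i]/db */
    }
}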
This is the solver context that can be retrieved through the returnContext parameter of the dogleg_optimize call. This structure contains all the internal state used by the solver. If requested, the user is responsible for calling dogleg_freeContext when done. This structure is defined as:
typedef struct
{
cholmod_common common;
union
{
dogleg_callback_t* f;
dogleg_callback_dense_t* f_dense;
};
void* cookie;
// between steps, beforeStep contains the operating point of the last step.
// afterStep is ONLY used while making the step. Externally, use beforeStep
// unless you really know what you're doing
dogleg_operatingPoint_t* beforeStep;
dogleg_operatingPoint_t* afterStep;
// The result of the last JtJ factorization performed. Note that JtJ is not
// necessarily factorized at every step, so this is NOT guaranteed to contain
// the factorization of the most recent JtJ
union
{
cholmod_factor* factorization;
// This is a factorization of JtJ, stored as a packed symmetric matrix
// returned by dpptrf('L', ...)
double* factorization_dense;
};
// Have I ever seen a singular JtJ? If so, I add this constant to the diagonal
// from that point on. This is a simple and fast way to deal with
// singularities. This constant starts at 0, and is increased every time a
// singular JtJ is encountered. This is suboptimal but works for me for now.
double lambda;
// Are we using sparse math (cholmod)?
int is_sparse;
int Nstate, Nmeasurements;
} dogleg_solverContext_t;
Some of the members are copies of the data passed into dogleg_optimize; some others are internal state. Of potential interest are

- common is a cholmod_common structure used by all CHOLMOD calls. This can be used for any extra CHOLMOD work the user may want to do
- beforeStep contains the operating point of the optimum solution. The user can analyze this data without the need to re-call the callback routine
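For instance, a hedged sketch of reading the optimum out of a retrieved context (the function name is hypothetical):

#include <stdio.h>
#include <dogleg.h>

static void reportSolution(dogleg_solverContext_t* ctx)
{
    const dogleg_operatingPoint_t* optimum = ctx->beforeStep;

    printf("norm2(x) at the optimum: %g\n", optimum->norm2_x);
    for(int i = 0; i < ctx->Nstate; i++)
        printf("p[%d] = %g\n", i, optimum->p[i]);
}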
An operating point of the solver. This is a part of dogleg_solverContext_t. Various variables describing the operating point, such as p, J, x, norm2(x) and Jt x, are available. All of the just-mentioned variables are computed at every step and are thus always valid.
// an operating point of the solver
typedef struct
{
double* p;
double* x;
double norm2_x;
union
{
cholmod_sparse* Jt;
double* J_dense; // row-first: grad0, grad1, grad2, ...
};
double* Jt_x;
// the cached update vectors. It's useful to cache these so that when a step is rejected, we can
// reuse these when we retry
double* updateCauchy;
union
{
cholmod_dense* updateGN_cholmoddense;
double* updateGN_dense;
};
double updateCauchy_lensq, updateGN_lensq; // update vector lengths
// whether the current update vectors are correct or not
int updateCauchy_valid, updateGN_valid;
int didStepToEdgeOfTrustRegion;
} dogleg_operatingPoint_t;
The optimization is controlled by several parameters. These can be set globally, for all callers of libdogleg in a process, using the dogleg_set....() functions; those global values are used by dogleg_optimize and dogleg_optimize_dense. Alternatively, the parameters can be specified independently for each invocation by passing a parameters argument to dogleg_optimize2 or dogleg_optimize_dense2. The latter is recommended, since it keeps multiple users of libdogleg in a single application from conflicting with each other.
It is not required to set any of these, but it's highly recommended to set the initial trust-region size and the termination thresholds to match the problem being solved. Furthermore, it's highly recommended to scale the problem being solved so that every state variable affects the objective norm2( x ) equally. This makes the method's concept of a "trust region" much better defined and makes the termination criteria work correctly.
To set the maximum number of solver iterations, call
void dogleg_setMaxIterations(int n);
To turn on diagnostic output, call
void dogleg_setDebug(int debug);
with a non-zero value for debug. Two separate diagnostic streams are available: a verbose human-oriented stream, and a vnlog. By default, diagnostic output is disabled. The debug argument is a bit-mapped integer:
if( debug == 0 ): no diagnostic output
if( debug & DOGLEG_DEBUG_VNLOG ): output vnlog diagnostics to stdout
if( debug & ~DOGLEG_DEBUG_VNLOG ): output human-oriented diagnostics to stderr
DOGLEG_DEBUG_VNLOG has a very high value, so if human diagnostics are desired, the recommended call is:
dogleg_setDebug(1);
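A sketch of a typical setup (the iteration limit is an arbitrary example value, not a recommendation):

#include <dogleg.h>

static void configureDiagnostics(void)
{
    dogleg_setMaxIterations(100); /* example value */

    /* human-oriented diagnostics on stderr: */
    dogleg_setDebug(1);

    /* or vnlog diagnostics on stdout:
       dogleg_setDebug(DOGLEG_DEBUG_VNLOG); */
}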
The optimization method keeps track of a trust region size. Here, the trust region is a ball in R^Nstate. When the method takes a step p -> p + delta_p, it makes sure that
sqrt( norm2( delta_p ) ) < trust region size.
The initial value of the trust region size can be set with
void dogleg_setInitialTrustregion(double t);
The dogleg algorithm is efficient when recomputing a rejected step for a smaller trust region, so set the initial trust region size to a value larger than a reasonable estimate; the method will quickly shrink the trust region to the correct size.
The routine exits when the maximum number of iterations is exceeded, or a termination threshold is hit, whichever happens first. The termination thresholds are all designed to trigger when very slow progress is being made. If all went well, this slow progress is due to us finding the optimum. There are 3 termination thresholds:
The function being minimized is E = norm2( x ) where x = f(p).
dE/dp = 2*Jt*x where Jt is transpose(dx/dp).
if( for every i fabs(Jt_x[i]) < JT_X_THRESHOLD ) { we are done; }
The method takes discrete steps: p -> p + delta_p
if( for every i fabs(delta_p[i]) < UPDATE_THRESHOLD) { we are done; }
The method dynamically controls the trust region.
if(trustregion < TRUSTREGION_THRESHOLD) { we are done; }
To set these thresholds, call
void dogleg_setThresholds(double Jt_x, double update, double trustregion);
To leave a particular threshold alone, specify a negative value.
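Putting the last two settings together, a sketch (the values are placeholders, not recommendations):

#include <dogleg.h>

static void configureConvergence(void)
{
    /* generous initial trust region; the method shrinks it quickly */
    dogleg_setInitialTrustregion(1e3);

    dogleg_setThresholds(1e-8,  /* Jt_x threshold               */
                         1e-8,  /* update threshold             */
                         -1.0); /* trustregion: leave unchanged */
}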
This function sets the parameters that control when and how the trust region is updated. The default values should work well in most cases, and shouldn't need to be tweaked.
Declaration looks like
void dogleg_setTrustregionUpdateParameters(double downFactor, double downThreshold,
double upFactor, double upThreshold);
To see what the parameters do, look at evaluateStep_adjustTrustRegion in the source. Again, these should just work as is.
The current implementation doesn't handle a singular JtJ gracefully (JtJ = Jt * J). Analytically, JtJ is at worst positive semi-definite (has 0 eigenvalues). If a singular JtJ is ever encountered, from that point on, JtJ + lambda*I is inverted instead for some positive constant lambda. This makes the matrix strictly positive definite, but is sloppy. At least I should vary lambda. In my current applications, a singular JtJ only occurs if at a particular operating point the vector x has no dependence at all on some elements of p. In the general case other causes could exist, though.
There's an inefficiency in that the callback always returns x and J. When I evaluate and reject a step, I do not end up using J at all. Depending on the callback function, it may be better to ask for x alone and then, if the step is accepted, to ask for J.
Dima Kogan, <[email protected]>
Copyright 2011 Oblong Industries; 2017 Dima Kogan <[email protected]>
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
The full text of the license is available at http://www.gnu.org/licenses