Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix lbfgsb linesearch out of bound and findAlpha method #633

Merged
merged 3 commits into from
Apr 18, 2017

Conversation

WeichenXu123
Copy link
Contributor

@WeichenXu123 WeichenXu123 commented Mar 31, 2017

What to solve

After a deep check to the LBFGS-B in breeze, I found two serious bug in LBFGS-B:

  • line search is wrong in this implementation.
  • LBFGSB.findAlpha method is also wrong.

Fix Line search with bound

According to the LBFGS-B paper
http://users.iems.northwestern.edu/~nocedal/PDFfiles/limited.pdf
The line search in LBFGS-B should be restricted in the bound, but the implementation in breeze do not,
this will cause the optimizer run out of bound, it will cause wrong result or cause line search fail, in some case.
This is the root cause of issue #572, and a series of features is blocking because of this bug:

So that it should be fixed ASAP.
cc @yanboliang

Strong wolfe line search with bound restriction

As mentioned in paper, the best linesearch method for LBFGS-B is strong wolfe line search with bound restriction, this require some modification based on StrongWolfeLineSearch in breeze:

modified strong wolfe condition with bound

We know that strong wolfe condition is to satisfy sufficient decrease condition and curvature condition, BUT according to the paper, the strong wolfe line search in LBFGS-B, the condition should be modified as following:

  • satisfy sufficient decrease condition
  • satisfy curvature condition OR new point X(k + 1) hit the bound

Algorithm for strong wolfe condition with bound

Without bound, we already have the following algos (Nodecal & Wright Numerical Optimization p58)
lbfgsb-1
So that, with bound, we can modify this algo into following:
lbfgsb-2

modification in LBFGS-B

  • In method determineStepSize, first calculate the max step size which won't exceed bound box, then use it as the step bound, call StrongWolfeLineSearch.minimizeWithBound
  • override takeStep method, in this method check whether the newX exceed bound and correct it. (This is used to avoid numerical error that cause the newX run out of bound)

Fix findAlpha method

Unfortunately, the findAlpha method here is also wrong, the wrong implementation cause the subspaceMin point walk out of bound and so that in some case it will also cause the algos crash.
I trace to the code here through several failed testcases with Huber loss.
In summary, there are at least 2 mistakes in this method, let me explain what the method should do first:

find the maximum alpha, satisfiy 0.0 <= alpha <= 1.0, and, for each dimension i, should satisfy:

lowerBound_i - xCauchy_i <= alpha * du_i <= upperBound_i - xCauchy_i

xCauchy is the Cauchy point inside the bound box which will satisfy

lowerBound_i - xCauchy_i <= 0
upperBound_i - xCauchy_i >= 0

and du is the direction vector (details please refer to the paper), the key point is , for each i, the condition above should be satisfied, so that the subspace minimum point computed will be restricted in the bound box.

So that the algo to find the maximum alpha should be:

minimize i: 0 -> dimensionSize  {
  if (du_i < 0) (lowerBound_i - xCauchy_i) / du_i
  else if (du_i > 0) (upperBound_i - xCauchy_i) / du_i
  else Double.Infinity
}

Note that we should handle the case du_i == 0 carefully, otherwise it may generate NaN in computation and will cause the whole algo crash.

Then we can check the implementation in breeze, the logic in findAlpha is wrong. The place where it should use math.min it use math.max. And the code write (ub_i - xc_i) / du_i to be the wrong code ub_i - xc_i / du_i

The second mistake in findAlpha is that it do not handle the case that components of du is zero. This may cause computation run into NaN. We should skip the zero components of du.

Numerical error handling

In theory, the cauchy point, the subspaceMin point, the X point, in the computation, should all be restricted in the bound box. BUT because of floating point error, it may slightly exceed the bound, so I add a adjustWithinBound method to correct it. So that it can avoid the point step out of bound which may cause other bugs.

Test

The following typical algos have been tested:

  • huber loss regression
  • bounded LOR
  • bounded LIR

def minimize(f: DiffFunction[Double], init: Double = 1.0): Double = {
minimizeWithBound(f, init = 1.0, bound = Double.PositiveInfinity)
}

/**
* Performs a line search on the function f, returning a point satisfying
* the Strong Wolfe conditions. Based on the line search detailed in
* Nocedal & Wright Numerical Optimization p58.
*/
Copy link
Contributor

@yanboliang yanboliang Mar 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update the annotation? It looks like out of date.

@WeichenXu123 WeichenXu123 force-pushed the fix_lbfgsb_linesearch_out_of_bound branch from f8932dc to 14b54bf Compare April 11, 2017 12:01
@dlwh
Copy link
Member

dlwh commented Apr 12, 2017

looks great! thanks so much!

So sorry for the long delay. work and life have been extra busy

state.x + (dir *:* stepSize)
val newX = state.x + (dir :* stepSize)
var i = 0
while (i < newX.length) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i prefer cforRange these days

@WeichenXu123 WeichenXu123 changed the title Fix lbfgsb linesearch out of bound Fix lbfgsb linesearch out of bound and findAlpha method Apr 17, 2017
@WeichenXu123
Copy link
Contributor Author

cc @dlwh @yanboliang @dbtsai
Thanks!

@yanboliang
Copy link
Contributor

yanboliang commented Apr 17, 2017

I have run tests for bound constraint LiR/LoR and huber regression against this fix, all passed. This looks good to me. Thanks !

@dbtsai
Copy link
Contributor

dbtsai commented Apr 17, 2017

LGTM. Ping @dlwh to cut a new release for our usage in Spark if this looks okay. Thanks.

@dlwh
Copy link
Member

dlwh commented Apr 18, 2017

thanks!

@dlwh
Copy link
Member

dlwh commented Apr 18, 2017

i'll cut a release this week

@dlwh dlwh merged commit dc4e005 into scalanlp:master Apr 18, 2017
@WeichenXu123 WeichenXu123 deleted the fix_lbfgsb_linesearch_out_of_bound branch April 18, 2017 04:51
asfgit pushed a commit to apache/spark that referenced this pull request Apr 27, 2017
## What changes were proposed in this pull request?
MLlib ```LogisticRegression``` should support bound constrained optimization (only for L2 regularization). Users can add bound constraints to coefficients to make the solver produce solution in the specified range.

Under the hood, we call Breeze [```L-BFGS-B```](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGSB.scala) as the solver for bound constrained optimization. But in the current breeze implementation, there are some bugs in L-BFGS-B, and scalanlp/breeze#633 fixed them. We need to upgrade dependent breeze later, and currently we use the workaround L-BFGS-B in this PR temporary for reviewing.

## How was this patch tested?
Unit tests.

Author: Yanbo Liang <[email protected]>

Closes #17715 from yanboliang/spark-20047.

(cherry picked from commit 606432a)
Signed-off-by: DB Tsai <[email protected]>
ghost pushed a commit to dbtsai/spark that referenced this pull request Apr 27, 2017
## What changes were proposed in this pull request?
MLlib ```LogisticRegression``` should support bound constrained optimization (only for L2 regularization). Users can add bound constraints to coefficients to make the solver produce solution in the specified range.

Under the hood, we call Breeze [```L-BFGS-B```](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGSB.scala) as the solver for bound constrained optimization. But in the current breeze implementation, there are some bugs in L-BFGS-B, and scalanlp/breeze#633 fixed them. We need to upgrade dependent breeze later, and currently we use the workaround L-BFGS-B in this PR temporary for reviewing.

## How was this patch tested?
Unit tests.

Author: Yanbo Liang <[email protected]>

Closes apache#17715 from yanboliang/spark-20047.
@@ -58,12 +58,18 @@ class StrongWolfeLineSearch(maxZoomIter: Int, maxLineSearchIter: Int) extends Cu
val c1 = 1e-4
val c2 = 0.9

def minimize(f: DiffFunction[Double], init: Double = 1.0): Double = {
minimizeWithBound(f, init = 1.0, bound = Double.PositiveInfinity)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LBFGS is passing init value not always equals to 1 to this method. It that right to ignore it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In LBFGS-B, we can use 1 as the init value, according to paper, or do you have some better init value ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I don't have a better init value, but LBFGS has: https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGS.scala#L76 :)

Let me describe my problem: I've updated spark in my project to latest version (2.2) and some test on logistic regression start failing.
Spark 2.2 uses Breeze 0.13.1 and when I'm explicitly downgrading it to 0.13 tests stop failing.
It seems like in my case regression could not be successfully trained using LBFGS with init value equal to 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh! you're right. we should fix it. like:

def minimize(f: DiffFunction[Double], init: Double = 1.0): Double = {
  minimizeWithBound(f, init, bound = Double.PositiveInfinity)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for finding this bug!

@dlwh
Copy link
Member

dlwh commented Jul 20, 2017 via email

jzhuge pushed a commit to jzhuge/spark that referenced this pull request Aug 20, 2018
MLlib ```LogisticRegression``` should support bound constrained optimization (only for L2 regularization). Users can add bound constraints to coefficients to make the solver produce solution in the specified range.

Under the hood, we call Breeze [```L-BFGS-B```](https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/LBFGSB.scala) as the solver for bound constrained optimization. But in the current breeze implementation, there are some bugs in L-BFGS-B, and scalanlp/breeze#633 fixed them. We need to upgrade dependent breeze later, and currently we use the workaround L-BFGS-B in this PR temporary for reviewing.

Unit tests.

Author: Yanbo Liang <[email protected]>

Closes apache#17715 from yanboliang/spark-20047.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants