[SPARK-1157][MLlib] L-BFGS Optimizer based on Breeze's implementation. #353
Conversation
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13872/
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13873/
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
@dbtsai Did you compare L-BFGS with MLlib's implementation of GD on some real data sets?
val miniBatchSize = nexamples * miniBatchFraction
var i = 0

val costFun = new DiffFunction[BDV[Double]] {
Better create a private class for the cost function.
I tested the optimizer with several real datasets, for example, small ones from the UCI Machine Learning Repository and some big ones like mnist8m (although the properties and stability of the optimizer don't depend on the size of the dataset). L-BFGS gives the same or better results compared with GD. For some datasets, GD converges really slowly after 40~50 iterations.
For the cost function, I intended to do it this way because inside the cost function I want to access and modify variables defined outside it, for example "i" and "lossHistory"; if I create a private class for this, it will take extra effort to achieve that without changing Breeze's DiffFunction signature.
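As a hedged illustration of that closure-based approach (a toy quadratic loss, not the PR's actual code), an anonymous DiffFunction can mutate variables defined in the enclosing scope directly:

```scala
// Hedged sketch of the closure-based approach described above: the anonymous
// DiffFunction mutates "i" and "lossHistory", which live outside it, without
// changing Breeze's DiffFunction signature.
import breeze.linalg.{DenseVector => BDV}
import breeze.optimize.DiffFunction
import scala.collection.mutable.ArrayBuffer

object ClosureCostFunSketch {
  def main(args: Array[String]): Unit = {
    var i = 0
    val lossHistory = new ArrayBuffer[Double]()

    val costFun = new DiffFunction[BDV[Double]] {
      override def calculate(weights: BDV[Double]): (Double, BDV[Double]) = {
        val loss = 0.5 * (weights dot weights)   // stand-in for the data-dependent loss
        i += 1                                   // closure updates the outer iteration counter
        lossHistory += loss                      // and the outer loss history
        (loss, weights.copy)                     // gradient of 0.5 * ||w||^2 is w
      }
    }

    println(costFun.calculate(BDV(1.0, 2.0)))    // (2.5, DenseVector(1.0, 2.0))
    println(s"evaluations = $i, history = $lossHistory")
  }
}
```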
@mengxr As you suggested, I moved the costFun to a private CostFun class.
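For reference, here is a simplified sketch (not the exact merged code) of what such a private CostFun class can look like: the loss history is passed in explicitly instead of being captured from an enclosing scope, and the in-memory data, least-squares gradient, and L2 term are stand-ins for MLlib's RDD, Gradient, and Updater.

```scala
// Simplified sketch of a private cost-function class for Breeze's L-BFGS.
import breeze.linalg.{DenseVector => BDV}
import breeze.optimize.DiffFunction
import scala.collection.mutable.ArrayBuffer

private class CostFun(
    data: Seq[(Double, BDV[Double])],   // (label, features)
    regParam: Double,
    lossHistory: ArrayBuffer[Double]) extends DiffFunction[BDV[Double]] {

  override def calculate(weights: BDV[Double]): (Double, BDV[Double]) = {
    var loss = 0.0
    val gradient = BDV.zeros[Double](weights.length)
    data.foreach { case (label, features) =>
      val diff = (features dot weights) - label
      loss += 0.5 * diff * diff
      gradient += features * diff
    }
    val n = data.size.toDouble
    val regVal = 0.5 * regParam * (weights dot weights)
    val totalLoss = loss / n + regVal
    lossHistory += totalLoss
    (totalLoss, gradient / n + weights * regParam)
  }
}
```

With a class like this, Breeze's optimizer, e.g. new LBFGS[BDV[Double]](maxIter, m, tolerance).minimize(costFun, initialWeights), can drive the optimization without any change to the DiffFunction signature.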
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
package org.apache.spark.mllib.optimization

import scala.Array
Scala imports Array by default.
Jenkins, retest this please.
Timeout for the latest Jenkins run. It seems that CI is not stable now.
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14126/
Jenkins, retest this please.
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
Thanks - merged this!
This PR uses Breeze's L-BFGS implementation; the Breeze dependency has already been introduced by Xiangrui's sparse input format work in SPARK-1212. Nice work, @mengxr !
When used with a regularized updater, we need to compute regVal and regGradient (the gradient of the regularization term in the cost function), and with the current updater design we can compute those two values in the following way.
Let's review how the updater works when returning newWeights given the input parameters.
w' = w - thisIterStepSize * (gradient + regGradient(w))
Note that regGradient is a function of w.
If we set gradient = 0 and thisIterStepSize = 1, then w' = w - regGradient(w), so
regGradient(w) = w - w'
As a result, regVal can be computed by
val regVal = updater.compute(
  weights,
  new DoubleMatrix(initialWeights.length, 1), 0, 1, regParam)._2
and regGradient can be obtained by
val regGradient = weights.sub(
  updater.compute(weights, new DoubleMatrix(initialWeights.length, 1), 1, 1, regParam)._1)
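Below is a minimal, self-contained Scala sketch of this trick using a hypothetical L2 updater (plain arrays, not Spark's actual Updater or its DoubleMatrix-based signature); it only illustrates that calling compute with a zero gradient recovers regVal (stepSize = 0) and regGradient = w - w' (stepSize = 1).

```scala
// Sketch of extracting regVal and regGradient from a hypothetical updater.
object RegExtractionSketch {
  // Hypothetical L2 updater: w' = w - stepSize * (gradient + regParam * w),
  // and the second return value is regVal = 0.5 * regParam * ||w'||^2.
  def compute(w: Array[Double], gradient: Array[Double], stepSize: Double,
              iter: Int, regParam: Double): (Array[Double], Double) = {
    val newW = w.indices.map(i => w(i) - stepSize * (gradient(i) + regParam * w(i))).toArray
    val regVal = 0.5 * regParam * newW.map(x => x * x).sum
    (newW, regVal)
  }

  def main(args: Array[String]): Unit = {
    val w = Array(1.0, -2.0, 3.0)
    val regParam = 0.1
    val zeroGradient = Array.fill(w.length)(0.0)

    // regVal of the current weights: zero gradient and stepSize = 0 keep w unchanged.
    val regVal = compute(w, zeroGradient, 0.0, 1, regParam)._2

    // regGradient(w) = w - w': zero gradient and stepSize = 1.
    val wPrime = compute(w, zeroGradient, 1.0, 1, regParam)._1
    val regGradient = w.indices.map(i => w(i) - wPrime(i))

    println(s"regVal = $regVal")                              // 0.5 * 0.1 * (1 + 4 + 9) ≈ 0.7
    println(s"regGradient = ${regGradient.mkString(", ")}")   // regParam * w ≈ (0.1, -0.2, 0.3)
  }
}
```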
The PR includes tests that compare the results with SGD, with and without regularization.
We did a comparison between L-BFGS and SGD, and we often saw 10x fewer
steps with L-BFGS while the cost per step is the same (just computing
the gradient).
The following paper by Prof. Ng's group at Stanford compares different
optimizers, including L-BFGS and SGD. They use them in the context of
deep learning, but it is worth referencing.
http://cs.stanford.edu/~jngiam/papers/LeNgiamCoatesLahiriProchnowNg2011.pdf