Working smf #151 (Open)
DanielTakeshi wants to merge 18 commits into master from workingSMF
Conversation
I managed to figure out a way to get an RMSE of 0.845380, with the following settings:
The performance of the algorithm seems to be very sensitive to these settings, so one should set them carefully.
Now the next step is to figure out how to use ADAGrad. A couple of points: (1) this will assume that sigma^2 can be computed in one minibatch, (2) this will assume IID components in the matrix, which is clearly violated with, say, one user's column, (3) the results are odd: I don't know why RMSE is roughly 0.7 on the training set when there's barely any learning signal, (4) RMSE on the test set oddly increases, from 0.91 to 1, as you increase the minibatch size. I'm still lost on this. =(
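For reference, BIDMach's ADAGrad updater has its own class and options; here is only a rough, standalone sketch of the per-coordinate update rule being relied on above (the names are mine, not BIDMach's API):

```scala
// Minimal per-coordinate ADAGrad sketch (illustrative; not BIDMach's ADAGrad class).
// Each coordinate keeps a running sum of squared gradients, and the step size is
// scaled by 1/sqrt(sum), so frequently-updated coordinates take smaller steps.
object AdagradSketch {
  def step(w: Array[Double], g: Array[Double], sumSq: Array[Double],
           lr: Double, eps: Double = 1e-8): Unit = {
    var i = 0
    while (i < w.length) {
      sumSq(i) += g(i) * g(i)                          // accumulate squared gradient
      w(i) -= lr * g(i) / (math.sqrt(sumSq(i)) + eps)  // scaled gradient step
      i += 1
    }
  }
}
```

The per-coordinate scaling is one reason results can be so sensitive to the initial learning rate: early large gradients permanently shrink the effective step size for those coordinates.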
My confusion earlier was about how the model matrices kept updating even though I wasn't accepting anything in the updater. It turns out that the SMF code updates them in the mupdate method. Ugh ...
I resolved my earlier questions. Now, let's TRY to get MHTest to work on this ... gulp.
Two issues: (1) CPU allocation, and (2) we almost always seem to accept; I've only seen one rejection. Does that make sense? Note that I had to apply a cutoff of -15 on the log probability for SMF, which should be OK.
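To make the accept/reject behavior concrete, here is a sketch of a Metropolis-style acceptance step with the log-probability floor mentioned above (illustrative only; the actual MHTest code is more involved):

```scala
import scala.util.Random

// Sketch of a Metropolis accept step with a floor on log probabilities
// (illustrative; not the actual MHTest implementation). Clamping very negative
// log probs (here at -15, as in the SMF runs above) keeps the exponentiated
// ratio numerically sane.
object MHSketch {
  val logProbFloor = -15.0

  def accept(logpCurrent: Double, logpProposed: Double, rng: Random): Boolean = {
    val cur  = math.max(logpCurrent, logProbFloor)
    val prop = math.max(logpProposed, logProbFloor)
    val logRatio = prop - cur
    // Accept outright if the proposal is at least as probable; otherwise
    // accept with probability exp(logRatio).
    logRatio >= 0.0 || math.log(rng.nextDouble()) < logRatio
  }
}
```

Near-universal acceptance typically means the proposals barely change the (clamped) log probability, so the log ratio sits near zero; that could be consistent with the behavior above.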
Update: don't merge this into master yet. I am doing some more work offline.
… can track acceptance rates
Now let's wait to see what John thinks of the proposal function issue. Then I can benchmark with different tests but the same hyperparameters for withMHTest and noMHTest, to check performance at different settings, etc.
Now let me switch focus to MALA ...
@jcanny Here's what I did for this pull request:
The goal here is to be able to test the SMF.scala code on data (such as Netflix, which I use here) using ADAGrad for stochastic gradient updates. This is different from SFA.scala, which internally uses a more complicated conjugate gradient updater that I don't know about.
Stuff added:

- testsmf_v2.ssc, as a proof of concept.

Stuff modified:

- SFA.scala, to remove learner methods which explicitly take an updater as input. The SFA code already calls ADAGrad updaters internally (and uses conjugate gradient as well), so there is no need to add more gradient updates to it.
- SMF.scala, to remove learner methods which do not explicitly take an updater as input. The reason is the reverse of the SFA case: SMF needs an updater to be formed. I added a learner which forms an ADAGrad updater, though the default parameters may be bad. I also added a predictor method which explicitly forms an empty user matrix like the SFA predictor, and a second evalfun in SMF which uses that matrix.
- Finally, I modified the default evalfun for training so that it assigns omats, just like in SFA. This is the only change that might affect existing code (other than code that was calling the outdated learners).
- Minor documentation.
Test results:
On the Netflix data, I can get an RMSE of 0.90 with 1 pass using SMF.scala and the settings in testsmf_v2.ssc. Unfortunately, RMSE rises to 0.95 and 0.97 with 2 and 3 passes, respectively, and soon it becomes no better than running SMF with a second factor matrix of all zeros (this is the matrix of size (opts.dim, 480k)). The results are disconcerting, but they may be because I'm not using good settings for ADAGrad. We should ideally be getting an RMSE of roughly 0.85. Perhaps there is something else we need in this SMF code?