
Working smf #151 (Open)

DanielTakeshi wants to merge 18 commits into master.
Conversation

DanielTakeshi (Contributor) commented:
@jcanny Here's what I did for this pull request:

The goal here is to be able to test the SMF.scala code on data (such as Netflix, which I use here) using ADAGrad for stochastic gradient updates. This is different from SFA.scala, which internally uses a more complicated conjugate gradient updater that I'm not familiar with.

Stuff added:

  • A test script testsmf_v2.ssc as proof of concept.

Stuff modified:

  • SFA.scala, to remove the learner methods that explicitly take an updater as input. The SFA code already creates ADAGrad updaters internally (and uses conjugate gradient as well), so there is no need to add more gradient updaters to it.

  • SMF.scala, to remove the learner methods that do *not* explicitly take an updater as input. The reasoning for SMF is the reverse of SFA: SMF needs its updater to be formed externally. I added a learner that forms an ADAGrad updater, though the default parameters may be poor (sketched below).
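
For concreteness, here is roughly the shape of such a learner. This is a minimal sketch assuming BIDMach's usual Learner constructor order (datasource, model, mixins, updater, datasink, opts) and class names; it is not the exact code in this PR:

```scala
// Illustrative sketch only; the real SMF learner may differ in detail.
import BIDMat.{Mat, SMat}
import BIDMach.Learner
import BIDMach.datasources.MatSource
import BIDMach.models.SMF
import BIDMach.updaters.ADAGrad

def learnerWithADAGrad(data: SMat, dim: Int) = {
  class xopts extends Learner.Options with SMF.Opts with MatSource.Opts with ADAGrad.Opts
  val opts = new xopts
  opts.dim = dim
  val nn = new Learner(
    new MatSource(Array(data: Mat), opts),  // minibatch source over the ratings matrix
    new SMF(opts),                          // the SMF model
    null,                                   // no mixins
    new ADAGrad(opts),                      // explicitly formed ADAGrad updater
    null,                                   // no datasink
    opts)
  (nn, opts)
}
```

The point is just that the ADAGrad updater is constructed explicitly and handed to the Learner, rather than being created inside the model the way SFA does it.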

I also added a predictor method that explicitly forms an empty user matrix, like the SFA predictor, and I needed a second evalfun in SMF that uses that matrix.
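
Roughly, the predictor looks like this. Again a sketch; class names, option fields, and constructor order are my assumptions about BIDMach's usual layout, not the exact predictor code:

```scala
// Illustrative sketch only. Like SFA's predictor, the idea is that user
// factors are re-inferred from scratch (an empty, zeroed user matrix) at
// prediction time, and the second evalfun scores against the preds matrix.
import BIDMat.{Mat, SMat}
import BIDMach.Learner
import BIDMach.datasources.MatSource
import BIDMach.models.SMF

def predictorSketch(model: SMF, data: SMat, preds: SMat) = {
  class xopts extends Learner.Options with SMF.Opts with MatSource.Opts
  val opts = new xopts
  opts.putBack = 1                                     // write predictions back out
  val nn = new Learner(
    new MatSource(Array(data: Mat, preds: Mat), opts), // training data + prediction targets
    model,                                             // reuse the trained model
    null, null, null, opts)                            // no updater: no model updates
  (nn, opts)
}
```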

Finally, I modified the default evalfun for training so that it assigns omats, just as in SFA. This is the only change that might affect existing code (aside from code that was calling the now-removed learners).
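
For reference, here is roughly what "assigns omats" means. This is a sketch with assumed names (ogmats for the model's output matrices, DDS for predictions at the nonzeros of the data), not the exact code in SMF.scala:

```scala
// Illustrative sketch of an evalfun that assigns output matrices before
// scoring, mirroring SFA. Names and shapes are my assumptions.
import BIDMat.{FMat, Mat, SMat}
import BIDMat.MatFunctions._

def evalSketch(mm: FMat, user: FMat, sdata: SMat, ogmats: Array[Mat]): FMat = {
  val preds = DDS(mm, user, sdata)             // predictions at sdata's nonzero positions
  if (ogmats != null) {
    ogmats(0) = user                           // expose the inferred user factors
    if (ogmats.length > 1) ogmats(1) = preds   // expose the predictions for the predictor
  }
  val diff = sdata.contents - preds.contents
  row(-math.sqrt((diff ddot diff) / diff.length).toFloat)  // negative RMSE as the score
}
```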

I added minor documentation.

Test results:

On the Netflix data, I get an RMSE of 0.90 after 1 pass using SMF.scala and the settings in testsmf_v2.ssc. Unfortunately, RMSE rises to 0.95 and 0.97 with 2 and 3 passes, respectively, and it soon becomes no better than running SMF with a second factor matrix of all zeros (the matrix of size (opts.dim, 480k)).

The results are disconcerting, but they may be due to poor ADAGrad settings on my part. We should ideally be getting an RMSE of roughly 0.85. Perhaps there is something else we need in this SMF code?

DanielTakeshi (Contributor, Author) commented:
I managed to figure out a way to get an RMSE of 0.845380, with the following settings:

```scala
import BIDMach.models.SMF

// Load the Netflix training and test matrices.
val dir = "/data/netflix/"
val a = loadSMat(dir+"newtrain.smat.lz4")
val ta = loadSMat(dir+"newtest.smat.lz4")

// Build an SMF learner with a 256-dimensional factorization.
val d = 256
val (nn,opts) = SMF.learner1(a, d)

opts.batchSize = 2000
opts.uiter = 5                       // user-factor update iterations per minibatch
opts.urate = 0.05f                   // user-factor update rate
opts.lrate = 0.05                    // model learning rate
opts.npasses = 2
val lambda = 4f
opts.lambdau = lambda;               // user-side regularization
opts.regumean = lambda;
opts.lambdam = lambda / 500000 * 20; // model-side regularization, scaled down
opts.regmmean = opts.lambdam
opts.evalStep = 31
opts.doUsers = false
opts.lsgd = 0.010f
opts.what                            // print the option settings
nn.train

// Predict: reuse the trained model on the training data, targeting the
// nonzero positions of the test matrix.
val model = nn.model.asInstanceOf[SMF]
val xa = (ta != 0)
val (mm, mopts) = SMF.predictor1(model, a, xa);
mopts.batchSize = 10000
mopts.uiter = 5
mopts.urate = opts.urate
mopts.lsgd = 0.0f                    // no SGD step during prediction
mm.predict

// Clamp predictions to the valid rating range [1, 5], then compute test RMSE.
val pa = SMat(mm.preds(1));
min(pa.contents,5,pa.contents)
max(pa.contents,1,pa.contents)
val diff = ta.contents - pa.contents
val rmse = sqrt((diff ^* diff) / diff.length)
println("rmse = %f" format rmse.v);
```

The performance of the algorithm seems to be very sensitive to these settings, so they should be chosen carefully.

Now the next step is to figure out how to use ADAGrad. A few points: (1) this will assume that sigma^2 can be computed within one minibatch (sketched below); (2) this will assume IID components in the matrix, which is clearly violated within, say, a single user's column; (3) the results are odd: I don't know why RMSE is roughly 0.7 on the training set when there's barely any learning signal; (4) RMSE on the test set oddly increases, from 0.91 to 1, as the minibatch size grows. I'm still lost on this. =(
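
On point (1), the estimate I mean is just the sample variance of the per-example score differences within one minibatch. A sketch in plain Scala (hypothetical helper, not BIDMach code):

```scala
// Hypothetical helper: estimate sigma^2 from one minibatch of per-example
// log-likelihood differences (new model vs. old model). Not part of BIDMach.
def minibatchSigma2(scoreDiffs: Array[Float]): Double = {
  val n = scoreDiffs.length
  val mean = scoreDiffs.map(_.toDouble).sum / n
  scoreDiffs.map(d => (d - mean) * (d - mean)).sum / (n - 1)  // unbiased sample variance
}
```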
My earlier confusion was about how the model matrices were still updating even though I wasn't accepting anything in the updater. It turns out that the SMF code updates them in its mupdate method. Ugh ...
I resolved my earlier questions. Now, let's TRY to get MHTest to work on this ... gulp.
Two issues: (1) CPU allocation, and (2) we always seem to be accepting; I've only seen one instance where it rejected. Does that make sense? Note that I had to apply a cutoff of -15 on the log prob for SMF, which should be OK. A sketch of the accept step I have in mind is below.
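
(This is ordinary Metropolis-Hastings acceptance on the minibatch log probability, with the -15 floor applied before taking the ratio. The names below are mine, not the actual MHTest updater code.)

```scala
// Sketch of an MH-style accept step with a log-prob floor.
// Plain Scala with hypothetical names; not the actual MHTest updater.
import scala.util.Random

def accept(logProbNew: Double, logProbOld: Double, rng: Random,
           floor: Double = -15.0): Boolean = {
  // Clamp very negative log probs (the -15 cutoff for SMF) to avoid underflow.
  val logRatio = math.max(logProbNew, floor) - math.max(logProbOld, floor)
  math.log(rng.nextDouble()) < logRatio  // accept with prob min(1, exp(logRatio))
}
```

If the floored log ratio is nearly always nonnegative, this step accepts nearly every time, which would match the behavior I'm seeing.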
DanielTakeshi (Contributor, Author) commented:
Update: don't merge this into master yet. I am doing some more work offline.
