Factorization machines #172
base: user_item_features
Conversation
dumping is now done with pickle 'highest protocol'
added asym_rmse and asym_mae
…aset
Revert "Revert "Features dataset""
Lasso prediction algorithm
[GSF] Syncing Fork
I have added three new factorization machine algos. There are many more possible, but most of them can be accomplished by using the features. Additional ones could also be conceived once the library supports context (user-item pair features such as timestamp, location, etc.). I would like these algos to be modular, such that you can turn on/off implicit information, features, etc. I guess the best way would be to create the sparse lists in … By the way, the special value for …
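As a rough illustration only (not the code in this PR), here is a minimal sketch of what such an on/off interface could look like as a Surprise `AlgoBase` subclass. The class name, the `use_features`/`use_implicit` flags, and the plain SVD-style training loop are all hypothetical placeholders; the flags are shown only to illustrate the intended modularity and are not wired up in this sketch.

```python
import numpy as np

from surprise import AlgoBase, PredictionImpossible


class ModularFM(AlgoBase):
    """Hypothetical sketch: an FM-style algorithm whose extra terms
    (features, implicit information) could be switched on or off.
    Only the plain user/item part (biases + latent factors) is implemented."""

    def __init__(self, n_factors=20, n_epochs=20, lr=0.005, reg=0.02,
                 use_features=False, use_implicit=False):
        AlgoBase.__init__(self)
        self.n_factors = n_factors
        self.n_epochs = n_epochs
        self.lr = lr
        self.reg = reg
        # Placeholders illustrating the on/off switches discussed above;
        # they are not used in this simplified sketch.
        self.use_features = use_features
        self.use_implicit = use_implicit

    def fit(self, trainset):
        AlgoBase.fit(self, trainset)
        rng = np.random.RandomState(0)
        self.bu = np.zeros(trainset.n_users)
        self.bi = np.zeros(trainset.n_items)
        self.pu = rng.normal(0, 0.1, (trainset.n_users, self.n_factors))
        self.qi = rng.normal(0, 0.1, (trainset.n_items, self.n_factors))
        # Plain SGD over the observed ratings.
        for _ in range(self.n_epochs):
            for u, i, r in trainset.all_ratings():
                err = r - self.estimate(u, i)
                self.bu[u] += self.lr * (err - self.reg * self.bu[u])
                self.bi[i] += self.lr * (err - self.reg * self.bi[i])
                pu_old = self.pu[u].copy()
                self.pu[u] += self.lr * (err * self.qi[i] - self.reg * self.pu[u])
                self.qi[i] += self.lr * (err * pu_old - self.reg * self.qi[i])
        return self

    def estimate(self, u, i):
        if not (self.trainset.knows_user(u) and self.trainset.knows_item(i)):
            raise PredictionImpossible('Unknown user and/or item.')
        return (self.trainset.global_mean + self.bu[u] + self.bi[i]
                + self.pu[u].dot(self.qi[i]))
```

An instance could then be evaluated like any other Surprise algorithm, e.g. with `cross_validate(ModularFM(), data)`.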
Thanks a lot. Once again I really appreciate the efforts with the docs and the tests. I'm definitely interested in adding FM into surprise! This is a lot of code for me to digest though ^^ and I don't have tons of free time ATM (should be easier in the following months), so I just wanted to make sure you know that the review process may take a while.
I personally like it when there's a single uniform interface to deal with, but it should still be easy to use. Like, if there are lots of incompatible parameters in a single class, maybe it's best to separate them into different classes. I'll leave it to your judgment to decide what's best here.

Are you actually using the FM algos you implemented? If so, with what dataset? I'd like to play around with them to get a feel for how to use them; that would make understanding all the code (especially the feature part) a lot easier for me. Thanks!
Here is a basic factorization machine algorithm that takes into account only the user and item ids. It is equivalent to `SVD` when using `degree=2`. I have implemented this algorithm with the `tffm` library as well as the `polylearn` library for testing purposes. I found that `tffm` is the preferable one given the different options it allows. To be used with `GridSearchCV` and `RandomizedSearchCV`, it however requires a special value for the `session_config` argument (see doc).

It's still unclear to me what good default values for the algorithm would be in most settings. Currently, both implementations appear to be slow, while I would have thought that using `tensorflow` would be fast...

This PR also contains tests for the feature option of `Dataset`, `Trainset`, etc.

I am planning to construct more elaborate factorization machine algorithms. The tests for the factorization machine algorithms will follow.
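For context (this is the standard second-order factorization machine model from Rendle's 2010 paper, not code taken from this PR), the degree-2 FM prediction for a feature vector $\mathbf{x}$ is:

```latex
\hat{y}(\mathbf{x}) = w_0
  + \sum_{j=1}^{n} w_j x_j
  + \sum_{j=1}^{n} \sum_{j'=j+1}^{n} \langle \mathbf{v}_j, \mathbf{v}_{j'} \rangle \, x_j x_{j'}
```

When $\mathbf{x}$ contains only the one-hot indicators of user $u$ and item $i$, the surviving terms are $w_0 + w_u + w_i + \langle \mathbf{v}_u, \mathbf{v}_i \rangle$, which has the same form as the biased `SVD` prediction $\mu + b_u + b_i + q_i^\top p_u$; hence the equivalence mentioned above for `degree=2`.

As for tuning, here is a minimal, hypothetical sketch of how such an algorithm would be plugged into Surprise's `GridSearchCV`. Since the FM class name and its exact parameters are not shown in this excerpt, the existing `SVD` class is used as a stand-in, and the tffm-specific `session_config` detail is not reproduced.

```python
from surprise import Dataset, SVD
from surprise.model_selection import GridSearchCV

# SVD is only a stand-in: the FM class from this PR would be plugged in the
# same way (its exact name and parameter names are not assumed here).
data = Dataset.load_builtin('ml-100k')

param_grid = {'n_factors': [20, 50], 'n_epochs': [10, 20]}
gs = GridSearchCV(SVD, param_grid, measures=['rmse', 'mae'], cv=3)
gs.fit(data)

print(gs.best_score['rmse'])
print(gs.best_params['rmse'])
```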