A Java implementation of libFM: Factorization Machine Library
Factorization machines (FM) are a generic approach that makes it possible to mimic most factorization models through feature engineering. In this way, factorization machines combine the generality of feature engineering with the superiority of factorization models in estimating interactions between categorical variables of large domain. libFM is a software implementation of factorization machines that features stochastic gradient descent (SGD) and alternating least squares (ALS) optimization as well as Bayesian inference using Markov Chain Monte Carlo (MCMC). JLibFM is the Java version of libFM.
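To make the model concrete, here is a minimal sketch of how a second-order factorization machine scores one sparse input. It uses the standard formulation (global bias, linear weights, and pairwise interactions through factor vectors, computed in time linear in the number of non-zero features). The class name `FmPredictSketch` and its method signature are hypothetical and are not taken from JLibFM's source.

```java
/** Illustrative sketch of second-order FM prediction; not JLibFM's actual code. */
final class FmPredictSketch {

    /**
     * Computes y(x) = w0 + sum_i w[i]*x_i + sum_{i<j} <v[i], v[j]> * x_i * x_j
     * for a sparse input given as parallel arrays of feature indices and values.
     * v[i] is the k-dimensional factor vector of feature i.
     */
    static double predict(double w0, double[] w, double[][] v, int[] idx, double[] val) {
        // Bias plus linear (1-way) terms.
        double y = w0;
        for (int p = 0; p < idx.length; p++) {
            y += w[idx[p]] * val[p];
        }
        // Pairwise (2-way) terms via the identity
        // sum_{i<j} a_i * a_j = 0.5 * ((sum_i a_i)^2 - sum_i a_i^2), applied per factor.
        int k = v.length == 0 ? 0 : v[0].length;
        for (int f = 0; f < k; f++) {
            double sum = 0.0;
            double sumSq = 0.0;
            for (int p = 0; p < idx.length; p++) {
                double t = v[idx[p]][f] * val[p];
                sum += t;
                sumSq += t * t;
            }
            y += 0.5 * (sum * sum - sumSq);
        }
        return y;
    }

    public static void main(String[] args) {
        double[] w = {0.1, 0.2, 0.3};
        double[][] v = {{1.0, 0.0}, {0.5, 0.5}, {0.0, 2.0}};
        // Features 0 and 1 are active with value 1.
        System.out.println(predict(0.5, w, v, new int[]{0, 1}, new double[]{1.0, 1.0}));
    }
}
```

The per-factor identity avoids the explicit double loop over feature pairs, which is why FM prediction stays cheap even with very large, sparse feature spaces.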
The code is written in Java, and the project is built with Maven. It has no third-party dependencies.
In the project folder, run the command "mvn clean package -Dmaven.test.skip=true". This produces two archive files in the "target" subfolder, one of which is "JLibFM-0.1-SNAPSHOT-jar-with-dependencies.jar".

Next, prepare the dataset. The current version supports only the LibSVM format. The Java class com.github.gaterslebenchen.libfm.examples.MovieLens1MFormater in this project shows how to convert the MovieLens 1M dataset to LibSVM format.
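As a rough illustration of what such a conversion involves, the sketch below turns one MovieLens 1M rating line (`UserID::MovieID::Rating::Timestamp`) into a LibSVM-style line with one indicator feature for the user and one for the movie. This is not the code of MovieLens1MFormater. The class `RatingsToLibSvmSketch`, the 0-based indexing, and the "assign an index on first sight" scheme are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative converter; the feature-indexing scheme here is an assumption. */
final class RatingsToLibSvmSketch {

    // Maps "u<userId>" and "m<movieId>" keys to consecutive feature indices.
    private final Map<String, Integer> featureIndex = new HashMap<>();

    private int indexOf(String key) {
        Integer i = featureIndex.get(key);
        if (i == null) {
            i = featureIndex.size();
            featureIndex.put(key, i);
        }
        return i;
    }

    /** Converts "UserID::MovieID::Rating::Timestamp" to "rating idx:1 idx:1". */
    String convert(String line) {
        String[] parts = line.split("::");
        if (parts.length < 3) {
            throw new IllegalArgumentException("Unexpected line: " + line);
        }
        int user = indexOf("u" + parts[0]);
        int movie = indexOf("m" + parts[1]);
        // Emit feature indices in ascending order, as LibSVM-style readers usually expect.
        int lo = Math.min(user, movie);
        int hi = Math.max(user, movie);
        return parts[2] + " " + lo + ":1 " + hi + ":1";
    }

    public static void main(String[] args) {
        RatingsToLibSvmSketch converter = new RatingsToLibSvmSketch();
        System.out.println(converter.convert("1::1193::5::978300760"));
        System.out.println(converter.convert("2::1193::4::978298413"));
    }
}
```

The same converter instance should be used for the training, validation, and test files so that a given user or movie always maps to the same feature index.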
(1) An example of the Stochastic Gradient Descent (SGD) method:
* java -Xms1024M -Xmx2048M -jar JLibFM-0.1-SNAPSHOT-jar-with-dependencies.jar -task r -train ratings_train.libfm -test ratings_test.libfm -dim 1,1,8 -iter 30 -method sgd -learn_rate 0.01 -regular 0,0,0.1 -init_stdev 0.1 -rlog log.txt -verbosity 1
(2) An example of the Alternating Least Squares (ALS) method:
* java -Xms1024M -Xmx2048M -jar JLibFM-0.1-SNAPSHOT-jar-with-dependencies.jar -task r -train ratings_train.libfm -test ratings_test.libfm -dim 1,1,8 -iter 20 -method als -regular 0,0,10 -init_stdev 0.1 -rlog log.txt -verbosity 1
(3) An example of the Markov Chain Monte Carlo (MCMC) method:
* java -Xms1024M -Xmx2048M -jar JLibFM-0.1-SNAPSHOT-jar-with-dependencies.jar -task r -train ratings_train.libfm -test ratings_test.libfm -dim 1,1,8 -iter 20 -method mcmc -init_stdev 0.1 -rlog log.txt -verbosity 1
(4) An example of the Adaptive SGD (SGDA) method:
* java -Xms1024M -Xmx2048M -jar JLibFM-0.1-SNAPSHOT-jar-with-dependencies.jar -task r -train ratings_train.libfm -test ratings_test.libfm -validation ratings_valid.libfm -dim 1,1,8 -iter 20 -method sgda -init_stdev 0.1 -learn_rate 0.01 -rlog log.txt -verbosity 1
I built the dataset from the MovieLens 1M dataset. Here are some evaluation results:
Method | RMSE | Dim | Iter |
---|---|---|---|
MCMC | 0.8606 | 1,1,8 | 20 |
ALS | 0.8511 | 1,1,8 | 20 |
SGD | 0.8838 | 1,1,8 | 30 |
SGDA | 0.9046 | 1,1,8 | 20 |
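For reference, RMSE (root mean squared error) is the square root of the mean squared difference between predicted and actual ratings, so lower is better. The sketch below shows the standard computation. The class `RmseSketch` is hypothetical and is not part of JLibFM.

```java
/** Illustrative RMSE computation; not JLibFM's evaluation code. */
final class RmseSketch {

    /** Returns sqrt(mean((predicted[i] - actual[i])^2)). */
    static double rmse(double[] predicted, double[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException("Arrays must be non-empty and of equal length");
        }
        double sumSq = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double err = predicted[i] - actual[i];
            sumSq += err * err;
        }
        return Math.sqrt(sumSq / predicted.length);
    }

    public static void main(String[] args) {
        double[] predicted = {3.5, 4.2, 2.9};
        double[] actual = {4.0, 4.0, 3.0};
        System.out.println(rmse(predicted, actual));
    }
}
```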
The JUnit test case com.github.gaterslebenchen.libfm.TestSaveandLoadModel shows how to save and load a model for the SGD method.