Commit 0258cb6

Readme
Arthur Mensch committed Jan 17, 2017
1 parent 4d50df2 commit 0258cb6
Showing 2 changed files with 41 additions and 44 deletions.
82 changes: 39 additions & 43 deletions README.md
[![Travis](https://travis-ci.org/arthurmensch/modl.svg?branch=master)](https://travis-ci.org/arthurmensch/modl)
[![Coveralls](https://coveralls.io/repos/github/arthurmensch/modl/badge.svg?branch=master)](https://coveralls.io/github/arthurmensch/modl?branch=master)

This python package ([webpage](https://github.com/arthurmensch/modl)) implements our two papers from 2016:

>Arthur Mensch, Julien Mairal, Bertrand Thirion, Gaël Varoquaux.
[Stochastic Subsampling for Factorizing Huge Matrices](https://hal.archives-ouvertes.fr/hal-01431618v1). <hal-01431618> 2017.

>Arthur Mensch, Julien Mairal, Bertrand Thirion, Gaël Varoquaux.
[Dictionary Learning for Massive Matrix Factorization](https://hal.archives-ouvertes.fr/hal-01308934v2). International Conference
on Machine Learning, Jun 2016, New York, United States. 2016

It allows one to perform sparse / dense matrix factorization on fully-observed or missing data very efficiently, by leveraging random sampling with online learning.
It is able to factorize matrices of terabyte scale with hundreds of components in the latent space in a few hours.

This package allows one to reproduce the
experiments and figures from the papers.

More importantly, it provides [scikit-learn](https://github.com/scikit-learn/scikit-learn)-compatible
estimators that fully implement the proposed algorithms.

## Installing from source with pip

Installation from source is simple. In a command prompt:

```
git clone https://github.com/arthurmensch/modl.git
cd modl
pip install .
cd $HOME
py.test --pyargs modl
```

## Core code

The package essentially provides four estimators (a usage sketch follows the list):

- `DictFact`, which computes a matrix factorization from Numpy arrays
- `fMRIDictFact`, which computes sparse spatial maps from fMRI images
- `ImageDictFact`, which computes a patch dictionary from an image
- `RecsysDictFact`, which predicts scores with a collaborative filtering approach
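
As a quick taste of the API, here is a minimal sketch of `DictFact` on a plain Numpy array. It assumes the usual scikit-learn `fit`/`transform` conventions; the parameter names (`n_components`, `batch_size`, `random_state`) are illustrative assumptions, not a verbatim excerpt from the documentation.

```
import numpy as np

from modl import DictFact

# Toy dense data: 1000 samples with 300 features (illustrative only)
X = np.random.randn(1000, 300)

# Assumed scikit-learn-style constructor parameters
dict_fact = DictFact(n_components=20, batch_size=50, random_state=0)
dict_fact.fit(X)

D = dict_fact.components_      # learned dictionary, shape (20, 300)
code = dict_fact.transform(X)  # sample loadings, shape (1000, 20)
X_hat = code.dot(D)            # low-rank reconstruction of X
```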

## Examples

### fMRI decomposition

A fast-running example that decomposes a small resting-state fMRI dataset into a 70-component map is provided:

```
python examples/decompose_fmri.py
```

It can be adapted to run on the 2TB HCP dataset by changing the source parameter to 'hcp' (you will need to download the data first).
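
For orientation, a minimal sketch of such a decomposition on a small nilearn-fetched dataset might look as follows; the parameters and the `components_img_` attribute are assumptions borrowed from nilearn's decomposition conventions, not a verbatim excerpt from the example script.

```
from nilearn import datasets

from modl.fmri import fMRIDictFact

# Small resting-state fMRI dataset, fetched through nilearn
adhd = datasets.fetch_adhd(n_subjects=1)

# Assumed parameters, following nilearn decomposition conventions
dict_fact = fMRIDictFact(n_components=70, smoothing_fwhm=6, verbose=1)
dict_fact.fit(adhd.func)

# Learned sparse spatial maps as a 4D Nifti image (attribute name assumed)
components_img = dict_fact.components_img_
```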

### Hyperspectral images

A fast-running example that extracts the patches of an HD image can be run with:

```
python examples/decompose_image.py
```

It can be adapted to run on AVIRIS data by changing the image source to 'aviris' in the file.
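
Schematically, patch-dictionary learning with `ImageDictFact` could look like the sketch below; the constructor arguments and the 2D-array input are illustrative assumptions rather than the exact API.

```
import numpy as np

from modl import ImageDictFact

# Stand-in for a real HD image: a random grayscale array
image = np.random.rand(512, 512)

# Assumed parameters: 100 atoms learned from 8x8 patches
dict_fact = ImageDictFact(n_components=100, patch_size=8)
dict_fact.fit(image)

# Each row is one flattened patch atom (attribute per scikit-learn style)
patch_dictionary = dict_fact.components_
```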

### Recommender systems

Our core algorithm can be run to perform collaborative filtering very efficiently:

```
python examples/recsys_compare.py
```

You will need to download datasets beforehand:

```
make download-movielens1m
make download-movielens10m
```
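
To give a feel for the interface, here is a hedged sketch of collaborative filtering on a toy sparse ratings matrix; the `alpha` parameter and the `predict` method are assumed by analogy with scikit-learn-style estimators and may differ from the actual API.

```
import numpy as np
import scipy.sparse as sp

from modl import RecsysDictFact

# Toy user x item ratings; zeros stand for unobserved entries
ratings = sp.csr_matrix(np.array([[5., 0., 3., 0.],
                                  [4., 2., 0., 1.],
                                  [0., 1., 4., 5.]]))

# Assumed constructor parameters (regularization name is illustrative)
estimator = RecsysDictFact(n_components=2, alpha=0.1, random_state=0)
estimator.fit(ratings)

# Predict scores for all entries, including the missing ones
# (method name assumed)
predicted = estimator.predict(ratings)
```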

## Future work

- Remove the `sacred` dependency
- Release a fetcher for HCP data from the S3 bucket
- Release examples with larger datasets and benchmarks

## Contributions

3 changes: 2 additions & 1 deletion modl/__init__.py
from .dict_fact import DictFact
from .image import ImageDictFact
from .fmri import fMRIDictFact
from .recsys import RecsysDictFact
