creg2: use sparse updates #7

Open
nschneid opened this issue Mar 13, 2014 · 3 comments

Comments

@nschneid (Collaborator):

Perhaps with scipy data structures: http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.dok_matrix.html#scipy.sparse.dok_matrix
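A minimal sketch of the dok_matrix pattern — incremental construction, then conversion to CSR for fast arithmetic. (Names here are illustrative, not from the creg2 code.)

```python
# Minimal dok_matrix sketch: build incrementally, convert to CSR for math.
from scipy.sparse import dok_matrix

X = dok_matrix((3, 5))   # 3 instances x 5 features, all zeros initially
X[0, 2] = 1.0            # set individual nonzero entries as they appear
X[1, 4] = 2.5
X_csr = X.tocsr()        # CSR supports fast row slicing and matrix products
```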

@brendano (Collaborator):

scipy sparse matrices are the only game in town in the numpy world. The documentation is a little disappointing, though.

@nschneid (Collaborator, Author):

Via the scikit-learn docs, I think dropping the .toarray() from

    X = X_dict.fit_transform(X).toarray()

will leave it as a sparse matrix. Does that work? @as1986, it might be worth trying, to see if it saves memory and speeds things up.
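For context: DictVectorizer returns a scipy.sparse matrix by default, and .toarray() is what densifies it. A hedged sketch (the variable names and toy data are assumptions, not from the creg2 code):

```python
# Sketch: DictVectorizer output is sparse by default; .toarray() densifies it.
from sklearn.feature_extraction import DictVectorizer

X_dict = DictVectorizer()                       # sparse=True is the default
X_raw = [{'word=the': 1.0}, {'word=cat': 1.0}]  # toy feature dicts
X = X_dict.fit_transform(X_raw)                 # a scipy.sparse CSR matrix
print(type(X), X.nnz)                           # sparse type, 2 stored values
```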

@nschneid (Collaborator, Author):

I have something that is hopefully a step toward a working sparse-matrix implementation: https://gist.github.com/nschneid/9748235 (includes the non-sparse code for comparison).

The sparse code is much slower than the original code on the iris dataset, which, after all, has dense instances. I have not tested it on a text dataset, and there are probably inefficiencies (e.g., I am building a sparse matrix for Hsqrt just to get the division to work out, even though it is really dense; also, there is less reuse of matrix instances due to limitations in the APIs, though there may be better workarounds).
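One possible way around the Hsqrt workaround: divide only at the gradient's stored nonzeros by operating on the CSR internals directly. This is a sketch under my assumptions about the shapes (sparse gradient G, dense ndarray Hsqrt of the same shape); it is not code from the gist, and the names are illustrative.

```python
# Sketch: elementwise G / Hsqrt without densifying G or wrapping Hsqrt
# in a sparse matrix. Only G's stored nonzeros are touched.
import numpy as np
from scipy.sparse import csr_matrix

def divide_at_nonzeros(G, Hsqrt):
    G = G.tocsr().copy()
    G.eliminate_zeros()                      # keep .data aligned with true nonzeros
    rows = np.repeat(np.arange(G.shape[0]),  # row index of each stored entry
                     np.diff(G.indptr))
    G.data /= Hsqrt[rows, G.indices]         # divide in place, nonzeros only
    return G

G = csr_matrix(np.array([[0.0, 2.0], [4.0, 0.0]]))
Hsqrt = np.array([[1.0, 2.0], [2.0, 1.0]])
print(divide_at_nonzeros(G, Hsqrt).toarray())  # [[0. 1.] [2. 0.]]
```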

For testing purposes, I have modified the code to only look at the first 10 training examples rather than sampling a different minibatch for each iteration. On the iris dataset, the loss values printed by the sparse code are close to, but do not exactly match, those of the original code. It is unclear whether this is a bug or a numerical-precision issue.

This code does not implement regularization. For sparse L2 regularization there is a trick from Alex Smola's blog, which I explain in my features document; a sketch of the idea follows.
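My understanding of that lazy-L2 trick, sketched below: rather than shrinking every weight at every step, record when each feature was last touched and apply the accumulated decay just before its next gradient update, raising the per-step factor to the number of skipped steps. (All names here are illustrative assumptions, not the creg2 code.)

```python
# Sketch of lazy L2 for sparse SGD: decay each weight only when its
# feature next appears in a gradient, catching up on skipped steps.
import numpy as np

class LazyL2SGD:
    def __init__(self, dim, eta=0.1, lam=1e-4):
        self.w = np.zeros(dim)
        self.last = np.zeros(dim, dtype=int)  # step of each feature's last update
        self.eta, self.lam, self.t = eta, lam, 0

    def update(self, idxs, grads):
        """One sparse step: idxs = active feature indices, grads = their gradients."""
        self.t += 1
        skipped = self.t - self.last[idxs]
        self.w[idxs] *= (1.0 - self.eta * self.lam) ** skipped  # catch-up decay
        self.w[idxs] -= self.eta * grads                        # gradient step
        self.last[idxs] = self.t
```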
