Speedup pysparnn with CUDA #10
Comments
Hi @vmarkovtsev! I just had a look at both projects and they are extremely impressive. The hashing approaches used in minhashcuda and datasketch are an alternative to what pysparnn does. Since you are already familiar with minhashing, I'll try to describe the process pysparnn uses:
The above can actually be done recursively to optimize for matrix multiply size (instead of 2 levels you can do N levels). Sometimes the data is not distributed evenly across the clusters (parts of the space are much denser than others), so this recursion can be important. The top level is represented as a matrix. Search is done by iteratively finding the closest cluster and then returning the closest root-level items. Cosine similarity is implemented as a matrix-vector multiply. Hopefully you can see how pysparnn takes a different approach from sketching/min-hash.

Compared to the LSH module in sk-learn, pysparnn tends to perform better in terms of speed and recall on a single CPU (I think the LSH code is much more easily scalable in theory). See the examples/ directory for the datasets I used. I have not compared to datasketch, so I don't know that benchmark =)

Having said all that, I think pysparnn could benefit from GPUs, as the search comes down to iterative matrix-matrix multiplies. The lib was designed to be easily extendable: if you are interested, all you would have to do is provide a custom subclass of MatrixMetricSearch - https://github.com/facebookresearch/pysparnn/blob/master/pysparnn/matrix_distance.py#L133. I would gladly take a PR to include CUDA code, though we should chat about where the contribution would go in the repository.

In any case, I am a fan of your work and your article - https://blog.sourced.tech/post/minhashcuda/ Very nice!
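As a minimal sketch of the 2-level cluster-pruning search described above, with cosine similarity expressed as sparse matrix multiplies: the code below uses scipy/sklearn directly, and `build_index`/`search` are illustrative names rather than pysparnn's actual API (see `pysparnn/matrix_distance.py` for the real `MatrixMetricSearch` classes).

```python
# Illustrative 2-level cluster-pruning search. Cosine similarity becomes a
# plain dot product once rows are L2-normalized, so every step is a sparse
# matrix multiply. Not pysparnn's API -- just the idea behind it.
import numpy as np
import scipy.sparse as sp
from sklearn.preprocessing import normalize


def build_index(features, n_clusters):
    """Pick random records as cluster leaders and assign every record
    to its closest leader."""
    features = normalize(features)                      # unit-length rows
    leader_ids = np.random.choice(features.shape[0], n_clusters, replace=False)
    leaders = features[leader_ids]
    sims = features.dot(leaders.T).toarray()            # matrix-matrix multiply
    assignment = sims.argmax(axis=1)
    members = [np.where(assignment == c)[0] for c in range(n_clusters)]
    return features, leaders, members


def search(query, features, leaders, members, k=5, n_probe=2):
    """Find the closest leaders for the query, then the closest records
    inside those clusters -- again just matrix multiplies."""
    query = normalize(query)
    leader_sims = np.asarray(query.dot(leaders.T).todense()).ravel()
    top_clusters = leader_sims.argsort()[::-1][:n_probe]
    candidates = np.concatenate([members[c] for c in top_clusters])
    sims = np.asarray(query.dot(features[candidates].T).todense()).ravel()
    return candidates[sims.argsort()[::-1][:k]]


data = sp.random(10000, 2 ** 16, density=1e-3, format='csr', dtype=np.float64)
index = build_index(data, n_clusters=100)
print(search(data[0], *index))
```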
@spencebeecher Great! Thank you very much for this warm feedback! You made my day. I would contribute CUDA code with pleasure, provided your colleagues from Faiss are not up to it :-) If everything boils down to matrix multiplication, I will be able to use cuBLAS GEMM and get peak performance. I think that if the matrices become small (say, 16), it is faster to perform the multiplication with AVX instead, so it is going to be fun. Renaming this issue to reflect the intent.
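As a rough illustration of that plan (nothing settled in this thread), this is what the dense GEMM path could look like with CuPy as one convenient cuBLAS wrapper, while NumPy's own BLAS covers the AVX case on the CPU:

```python
# Rough sketch: the same GEMM on CPU (NumPy, backed by an AVX-enabled BLAS)
# and on GPU (CuPy, which dispatches dense matmul to cuBLAS). CuPy is just
# one possible cuBLAS wrapper here, not a decision made in the thread.
import numpy as np
import cupy as cp

a = np.random.rand(4096, 4096).astype(np.float32)
b = np.random.rand(4096, 4096).astype(np.float32)

c_cpu = a @ b                       # CPU BLAS (AVX/FMA kernels)

a_gpu, b_gpu = cp.asarray(a), cp.asarray(b)
c_gpu = a_gpu @ b_gpu               # cuBLAS SGEMM under the hood
cp.cuda.Stream.null.synchronize()   # wait for the kernel to finish

assert np.allclose(c_cpu, cp.asnumpy(c_gpu), rtol=1e-3)

# For tiny matrices (e.g. 16 x 16) the host<->device transfer and kernel
# launch overhead dominates, so staying on the CPU is usually faster --
# the AVX point made above.
```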
Actually, you bring up a great point - I missed the open-source release of Faiss! The Faiss team already has a GPU implementation, and they implement a number of methods, including one similar to what pysparnn has. I will still take the PR, but I would note that Faiss already has that functionality and is generally more performant since they have a C++ implementation. If you have to pick one, I would go with Faiss. That said, pysparnn is still targeted towards sparse vectors, so if you have a sparse GPU implementation in mind, that would be interesting. I think Faiss would be interested too, but they are more targeted towards dense vectors (you would have to confirm future direction with them).
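For reference, a short sketch of the dense-vector functionality Faiss already ships, based on its public Python API; the GPU part assumes the faiss-gpu build is installed:

```python
# Exact inner-product search over dense vectors with Faiss, first on CPU,
# then moved to the GPU. Sparse input would still need a different code
# path, which is where pysparnn fits.
import numpy as np
import faiss

d = 128                                   # vector dimensionality
xb = np.random.rand(100000, d).astype('float32')
xq = np.random.rand(10, d).astype('float32')

index = faiss.IndexFlatIP(d)              # exact inner-product index
index.add(xb)

res = faiss.StandardGpuResources()        # requires the faiss-gpu build
gpu_index = faiss.index_cpu_to_gpu(res, 0, index)

distances, ids = gpu_index.search(xq, 5)  # top-5 neighbours per query
```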
Any updates? If I don't hear back in a week I am going to close this one out!
Sorry, I missed this. Indeed, Faiss works with dense vectors, and sparse input support is very valuable. We've got cuSPARSE, which is able to accelerate the ops, and we've got wrappers for it. Honestly, I don't think I will have time to play with it before April - but I would love to. You may close this and I will reopen when ready.
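A sketch of what the cuSPARSE route could look like through CuPy's scipy-compatible sparse wrappers; this is just one possible wrapper choice and an assumption, not a design anyone committed to in the thread:

```python
# Sketch: pushing the sparse similarity multiply through cuSPARSE via
# CuPy's scipy-compatible sparse module. One possible route, not a
# decided design. Assumes CSR inputs.
import scipy.sparse as sp
import cupyx.scipy.sparse as cusp

features = sp.random(50000, 2 ** 18, density=1e-4, format='csr', dtype='float32')
queries = sp.random(64, 2 ** 18, density=1e-4, format='csr', dtype='float32')

features_t_gpu = cusp.csr_matrix(features.T.tocsr())  # host CSR -> device CSR
queries_gpu = cusp.csr_matrix(queries)

# Sparse x sparse similarity block, executed on the GPU.
sims_gpu = queries_gpu.dot(features_t_gpu)

sims = sims_gpu.get()   # back to scipy.sparse on the host
```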
Great - thanks for introducing me to such cool projects, @vmarkovtsev!
Hi!
I think this project is similar to the toolchain we are currently using in production - https://github.com/ekzhu/datasketch + https://github.com/src-d/minhashcuda. Do you think it is possible to leverage minhashcuda in pysparnn?
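For context, a rough sketch of that toolchain: weighted MinHash signatures computed on the GPU by minhashcuda's libMHCUDA module, then indexed and queried with datasketch's LSH. The function names follow the two projects' READMEs, so treat the exact signatures as assumptions rather than verified API.

```python
# Rough sketch of the datasketch + minhashcuda combination. The libMHCUDA
# call signatures follow its README and should be double-checked.
import numpy as np
import libMHCUDA
from datasketch import MinHashLSH, WeightedMinHash
from scipy.sparse import random as sparse_random

# Toy weighted bag-of-features matrix: one row per document.
bags = sparse_random(1000, 4096, density=0.01, format='csr', dtype=np.float32)

gen = libMHCUDA.minhash_cuda_init(bags.shape[-1], 128, seed=1)
hashes = libMHCUDA.minhash_cuda_calc(gen, bags)   # one signature per row
libMHCUDA.minhash_cuda_fini(gen)

lsh = MinHashLSH(threshold=0.8, num_perm=128)
for i, h in enumerate(hashes):
    lsh.insert("doc_%d" % i, WeightedMinHash(1, h))

candidates = lsh.query(WeightedMinHash(1, hashes[0]))
```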