missing API against TH #70
Hey @soumith, so running …
Yes, they are yet to be implemented in cutorch. I'll get to that next week. They can be done with a thrust::scan.
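For context, a minimal sketch of what "done with a thrust::scan" could look like for a contiguous 1-D case. This is illustrative only and not the eventual cutorch kernel: it assumes raw device float pointers and a contiguous layout rather than the THCudaTensor API.

```cpp
#include <thrust/device_ptr.h>
#include <thrust/scan.h>

// Sketch: cumulative sum over n contiguous floats already on the GPU.
// d_in and d_out are raw device pointers; a real cutorch version would take a
// THCudaTensor and handle strides and non-contiguous layouts.
void cumsumContiguous(float* d_in, float* d_out, long n)
{
  thrust::device_ptr<float> in(d_in);
  thrust::device_ptr<float> out(d_out);
  // An inclusive scan with operator+ is cumsum; passing
  // thrust::multiplies<float>() as the binary op would give cumprod.
  thrust::inclusive_scan(in, in + n, out);
}
```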
Ok cool.
Does anyone have an implementation of std and var yet? Otherwise I'll have a go next week.
No, we do not have it. This has been long overdue, but we should coordinate these; let me email you guys.
Any progress on cumsum and cumprod?
Have not started them yet!
Okay, I might have a go soon then, because we'd like to use it. Just waiting for the THCState PR to be merged.
I will merge the THCState PR on Friday. All the patches have been prepared except for fbcunn, working on that as well.
@soumith For reductions along a single dimension, we still have the restriction that the tensor must not have more than 4 dimensions. Did you say you're working on a fix for that? If so, what's the progress on it?
@wickedfoo already has a PR for that internally. It's all implemented. I'm on vacation until next week.
His PR is for apply (and apply2) along an arbitrary dimension, not reductions. It generalizes the copy kernels and changes all the tensor math to use these apply kernels where appropriate, instead of newContiguous + Thrust.
That's the status on that. He has not worked on arbitrary reductions yet. If you want to tackle that, go for it.
Thanks for the update. I'll have a go at the reductions kernel then.
I'm on vacation too until next week, but I don't think the generic apply …
If I remember correctly, you guys said you'd look into maskedFill etc., right? Any progress on that?
For maskedFill etc., do you want the mask to be a float vector (because that's the only thing we have in cutorch at present), or do you want it to be 4 bytes packed into a float?
I guess float vectors make the most sense, because that's what logical functions (gt, ge, etc.) return.
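With a float mask, the fill kernel itself is tiny. A minimal sketch, assuming contiguous storage and raw device pointers rather than the real THCudaTensor plumbing: any non-zero mask entry is treated as "true" and overwritten with the fill value.

```cpp
// Sketch of maskedFill with a float-valued mask: data and mask are contiguous
// device arrays of length n; elements whose mask entry is non-zero get `value`.
__global__ void maskedFillKernel(float* data, const float* mask, float value, long n)
{
  long i = blockIdx.x * (long)blockDim.x + threadIdx.x;
  if (i < n && mask[i] != 0.0f) {
    data[i] = value;
  }
}

// Example launch (hypothetical sizes):
//   maskedFillKernel<<<(n + 255) / 256, 256>>>(d_data, d_mask, value, n);
```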
I have maskedFill/Copy/Select done, and for sort() I have power-of-2 sizes working at present (but on input with an arbitrary number of dimensions), so I'm still working on that. maskedFill, maskedCopy and sort avoid newContiguous on the input, but for maskedSelect I chickened out and just used two passes and temporary space with a Thrust prefix scan.

Re: "For reductions along a single dimension, we still have the restriction that the tensor must not have more than 4 dimensions. Did you say you're working on a fix for that? If so, what's the progress on it?" I have this fixed as well: I took the copy kernel code and made a reduction kernel out of it, so no calls to newContiguous/copies etc. are needed. It's not a global reduction kernel (like a norm that reduces down to one point), but it reduces along a dimension. sort() exploits similar code. I want to do the same shared-memory optimization (so I can use coalesced reads) that you did when the reduction dimension is innermost/most contiguous, though.
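For readers unfamiliar with the two-pass approach mentioned above, a rough sketch follows, again with raw contiguous device pointers standing in for the real tensor API: an exclusive prefix scan over the mask gives each selected element its output offset, and a second kernel scatters the selected values into a pre-sized output buffer. The float-valued offset buffer is only an assumption for illustration; a real version would scan into an integer buffer.

```cpp
#include <thrust/device_ptr.h>
#include <thrust/scan.h>

// Pass 1 (host side): exclusive scan of the 0/1 float mask yields, for each
// position, the output slot of that element if it is selected.  Offsets stay
// exact as floats only up to 2^24 elements, hence the "sketch" caveat.
void maskedSelectPrefix(float* d_mask, float* d_prefix, long n)
{
  thrust::device_ptr<float> mask(d_mask);
  thrust::device_ptr<float> prefix(d_prefix);
  thrust::exclusive_scan(mask, mask + n, prefix);
  // number of selected elements = prefix[n-1] + mask[n-1]; use it to size `out`.
}

// Pass 2: every thread whose mask entry is set writes its input element to the
// slot computed by the prefix scan.
__global__ void maskedSelectScatter(const float* in, const float* mask,
                                    const float* prefix, float* out, long n)
{
  long i = blockIdx.x * (long)blockDim.x + threadIdx.x;
  if (i < n && mask[i] != 0.0f) {
    out[(long)prefix[i]] = in[i];
  }
}
```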
Cool, looking forward to that. Yes, using the shared-memory approach for reductions along contiguous dimensions is vital.
When do you think you'll have a PR for maskedFill etc.?
It's in review, and Jeff is still working on revamping our code base for the state-argument-based change. The cutorch TensorMath changes that remove most of the sync points in non-contiguous cases will also land at the same time.
Any progress on the masked* functions?
They're implemented. We are working on refactoring our code and syncing with master; we will try to merge them this week.
An update: it will either be EOD today or, most likely, Monday/Tuesday.
Looks like what's left is the small fish.
There are a number of linear algebra functions missing: symeig, eig, inverse, etc. In Torch they seem to be implemented by wrapping LAPACK. Could we do something similar for cutorch? There's MAGMA and CULA; does anyone have experience with these libraries?
MAGMA looks best; we built MAGMA internally and it looks reasonably good.
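To make the MAGMA route a little more concrete, a wrapper would look roughly like the sketch below. This is an assumption about the shape of such a binding, not anything from this thread: it uses MAGMA's LAPACK-style GPU-interface LU factorization as an example building block for inverse, and the exact header name and signatures should be verified against the MAGMA release being built against. A real cutorch binding would go through THCudaTensor rather than raw pointers.

```cpp
#include <magma.h>  // assumption: umbrella header; some releases use magma_v2.h

// Sketch: LU-factorize an n x n single-precision matrix that already lives on
// the GPU, as the first step of computing an inverse.  Verify signatures
// against your MAGMA version before relying on this.
int luFactorizeOnGpu(float* d_A, magma_int_t n, magma_int_t* ipiv /* host, length n */)
{
  magma_init();                       // a real binding would do this once per process
  magma_int_t info = 0;
  magma_sgetrf_gpu(n, n, d_A, n, ipiv, &info);
  magma_finalize();
  return (int)info;                   // 0 on success, as in LAPACK
}
```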
Also, on the cuDNN note, we can configure a header (like THGeneral.h.in) if we find cuDNN. Caffe already has the CMake macros needed for finding cuDNN: https://github.com/BVLC/caffe/blob/master/cmake/Cuda.cmake
Hi guys. Noticed this thread when dealing with a script that needs diag, svd and eig on CudaTensors. I implemented diag myself in Lua using storage() and set(), but svd and eig are beyond my ken. What's the plan for that?
One of my colleagues @SamGross is working on it by interfacing the MAGMA CUDA library. It'll happen over the next month or so when he finishes it up and sends a PR.
This is not on this list, but I'm in the process of implementing THCudaTensor_multinomial as well.
@wickedfoo Awesome. Would love to see multinomial in CUDA.
Just to confirm, scatter/gather aren't implemented in cutorch, right?
That's right. I meant to do it, but haven't had the time yet, sorry.
For …
Does that sound about right? Anything else I should bear in mind if I write a naive implementation? (Edit: what do you think is the most similar existing class/kernel to base this off? And/or thoughts on where to put this, i.e. filename(s)?)
MAGMA looks cool. It has an OpenCL version too, it seems :-)
Here is a …
Shoe-horned the lua wrapper into TensorMath.lua: hughperkins/cltorch@0e469f4 |
Did scatter too, since it seems like more of the same? (Edit: and scatterFill, in the same files.)
Is there any update on adding these functions? I'm in need of the cross product on CudaTensors. I can code it up if someone can point me towards the things that need to be done. All the layers I've written so far are on the Lua side and I'm not sure how to make the connection between CUDA and Lua. Thanks!
@abyravan I could get to cross next week.
That would be great. Thanks a lot! Is there any sort of tutorial or how-to on adding new functionality for tensors? Would be useful to have :)
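Not an official answer to the cross-product question above, but to show how small the kernel itself is: a sketch of a batched 3-component cross product over contiguous data, where each thread handles one pair of 3-vectors laid out consecutively. The layout and naming are assumptions, since torch.cross also supports other strides and dimensions.

```cpp
// Sketch: a and b each hold `count` consecutive 3-vectors (contiguous,
// innermost dimension of size 3); out[k] = a[k] x b[k].
__global__ void cross3Kernel(float* out, const float* a, const float* b, long count)
{
  long k = blockIdx.x * (long)blockDim.x + threadIdx.x;
  if (k < count) {
    const float* x = a + 3 * k;
    const float* y = b + 3 * k;
    float* z = out + 3 * k;
    z[0] = x[1] * y[2] - x[2] * y[1];
    z[1] = x[2] * y[0] - x[0] * y[2];
    z[2] = x[0] * y[1] - x[1] * y[0];
  }
}
```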
nonzero is being implemented by FB, should be out in a few days.
From the cutorch README (** NOTE on API changes and versioning **): Cutorch provides a CUDA backend for torch7. Cutorch provides the following:
- a new tensor type, torch.CudaTensor, that acts like torch.FloatTensor but with all its operations on the GPU. Most of the tensor operations are supported by cutorch. There are a few missing ones, which are being implemented; the missing list can be found here: torch/cutorch#70
- several other GPU tensor types, with limited functionality. Currently limited to copying/conversion, and several indexing and shaping operations.
- cutorch.* functions to set/get the GPU, get device properties, memory usage, set/get low-level streams, set/get the random number generator's seed, synchronization, etc. They are described in more detail below.
Hi guys, any hope of implementing conv2 (or xcorr2, for that matter)?
Just adding a note that …
Thanks for supporting many of the math functions in …
The following math functions are missing in THC but present in TH: …
When these are implemented, cwrap entries can be added that would make cutorch completely API-compatible with torch.