ND convolution with im2col #2049
Conversation
Force-pushed from 5b75694 to 718802e
Is this using cuDNN v2 as a backend for the ND conv? If that's the case, I think NVIDIA's ND conv (only 3D for now) is not as tuned as their 2D conv. From the release notes: "As a BETA preview in this release, the convolution forward, convolution [...]"
No, this doesn't touch cuDNN; it only generalizes the im2col convolution implementation (which predates cuDNN).
Ah, sorry! My bad.

Youssef Barhomi
Could you provide a demo showing how to use it? Otherwise there is a steep learning curve to testing and using your work. Thanks!
@jeffdonahue Does this also support 1D conv?
@avalada yes, any N >= 0 should theoretically be supported. In practice, 0D convolution -- scalar multiplication -- probably doesn't work, but should and would make a great unit test. I expect 1-10D convolution to work out of the box with this; >10 won't work on GPU -- you'd have to add your case to the switch statements in 718802e. Also, 1D convolution is supported by the current implementation as well; just set either the width or height to 1. Theoretically, doing 1D convolution using an ND implementation could/should be more efficient than using a 2D implementation with a singleton dim, but with the apparently large overhead in the 2D case, I would be surprised if that's the case here -- you're probably better off sticking with the existing 2D implementation. (But I'd be very interested to know the comparison if you decide to benchmark.)
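The "set width or height to 1" trick above can be sketched with a toy numpy cross-correlation (illustrative only -- this is not Caffe's code, just a demonstration that a 2D "valid" convolution with a singleton height dim reproduces the 1D result):

```python
import numpy as np

def conv2d_valid(img, k):
    """Naive 'valid' 2D cross-correlation (no padding, stride 1)."""
    H, W = img.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # dot the kernel with the receptive field at (i, j)
            out[i, j] = (img[i:i + kh, j:j + kw] * k).sum()
    return out

signal = np.array([1., 2., 3., 4., 5.])
kernel = np.array([1., 0., -1.])

# 1D convolution expressed as 2D with a singleton height dim:
out = conv2d_valid(signal[None, :], kernel[None, :])
print(out)  # a single row: the 1D 'valid' result
```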
Hey Jeff, is there any chance you could link to an example prototxt making use of this pull request? It would be nice to have that to get started.
I don't think any changes to the prototxt format are needed to use this PR. Just set your dims using repeated values in the prototxt, e.g.:
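For illustration, a hypothetical 3D convolution layer might look something like this (a sketch based on the repeated-field semantics described in this PR; the layer names and values are made up, not a tested prototxt):

```protobuf
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"   # e.g. a 5-axis blob: N x 3 x D x H x W
  top: "conv1"
  convolution_param {
    num_output: 16
    # one repeated kernel_size / stride / pad entry per spatial axis,
    # instead of the legacy kernel_h / kernel_w pair
    kernel_size: 3
    kernel_size: 3
    kernel_size: 3
    stride: 1
    pad: 1
  }
}
```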
The channel axis defaults to 1 (in this net there are 3 channels). If you want ND kernels, just repeat kernel_size for each dim instead of using kernel_h/kernel_w. The notes in caffe.proto describe it pretty well.
Thanks @jmerkow -- there's a slight correction as [...]
Or a full version with DummyData that you should be able to run (didn't test but it should work, possibly needing minor typo fixing):
@jeffdonahue thanks for the reference. Here is a debugged version of Jeff's prototxt, if anyone else is interested (the layers needed names, and the SoftmaxWithLoss layer doesn't like >4D blobs):
@jeffdonahue, if you change line 150 in filler.hpp to remove legacy calls (i.e. use [...]
Thanks for sharing the prototxt @Russell91. I'm trying to use this with ND data (N>=3).
@tomdeschamps I would try an HDF5 data layer. I believe those can be used to load ND images.
Thanks @jmerkow. Yes, I'm trying to load it as ND using the HDF5 data layer, but I get an error in LegacyShape() in blob.hpp:141: "Cannot use legacy accessors on Blobs with > 4 axes".
Sorry for the trouble -- there are indeed a lot of places in the code that still use the legacy blob dim accessors. The legacy accessors should be removed from most places, definitely in [...]
Yes, that seems to work with the check removed. I used @jmerkow's nd-pooling branch. However, HDF5Data seems to be handled very differently (no scaling, not sure whether whole batches are written in each .h5 file, etc.). Is there documentation on how Caffe deals with this format?
@jeffdonahue @tomdeschamps Tested: with the check removed, it compiles successfully. I'm wondering whether this nd convolution and @jmerkow's nd pooling can be used together now?
Hi, is it possible to implement a cuDNN 3D convolution? Flatten the 3D volume into 2D, or something like that.
@jeffdonahue @Russell91 This is great work! Thanks. Today is my first day brewing Caffe! I'm trying to use this nd-conv with a "Flatten" layer. The "Flatten" layer does not work on blobs with > 4 axes, so it fails when there are more than 2 spatial dimensions. Is there a fix/solution for this? Would appreciate any advice/help.
I disabled the check on line 141 of blob.hpp (the CHECK_LE(num_axes(), 4) guard that raises the "legacy accessors" error) and it ran. Will that cause any problems? Actually, looking back at the thread, this seems to be the consensus of others, and it makes it work.
@tomdeschamps, @jmerkow, you guys mentioned loading ND images with HDF5 earlier. I am trying to load 3D volume images (width, height, and depth) too, and I want to make sure of a couple of things. I followed your conversation above. I do not think I can convert my 3D image files into lmdb or leveldb -- am I right? It looks like the HDF5 data layer is the only way to load my 3D images into an ND blob. Have you successfully loaded 3D images? If so, please give me some advice on how to load them.
I think the nd-pooling branch is based on the nd-convolution branch.
@Tgaaly, I assume that you stacked a series of 2D images to make a 3D image dataset in HDF5 and listed the HDF5 files in train.txt and test.txt for each class. Is that so? Sorry, I have been working on something else; it's been a while since I was in touch with you.
Hi all, I converted 3D images (2 CT-scanned human brain images) into HDF5 datasets with shape (number of images, width, length, channel [not specified for grayscale]) in Python. I created an HDF5 dataset file for each 3D volume image and listed these HDF5 files in train.txt and test.txt.

Then I defined a net with the logreg(hdf5, batch_size) code from caffe/examples/02-brewing-logreg.ipynb, which writes examples/hdf5_classification/logreg_auto_train.prototxt and logreg_auto_test.prototxt (those are the original codes from the example; I modified them for my dataset and net). After that, I ran the test code from the same example (caffe.set_mode_gpu(), accumulate accuracy, print "Accuracy: {:.3f}"). I got results like the one below and it looks alright. Can anyone confirm whether my way of building a Caffe 3D (depth, width, height, channel [channel ignored for grayscale images]) model is correct or not?

Results: I1120 12:43:45.839637 28983 solver.cpp:734] Snapshotting solver state to binary proto file/hdf_FT_iter_10000.solverstate
The 'channels' axis should be right after the batch axis, so the shape should be [...]
@jeffdonahue, thanks for answering my question; you stated it earlier in this pull request -- sorry, I missed it. I set my dimensions as follows: depth (stack of the 3D volume), channel (RGB), width, length. Caffe ran okay and the result was better; I was probably training on width and channel instead of height in my last run. Just one more question: can I use the same method for multi-channel or multi-spectrum images in HDF5?
You can load whatever you want with HDF5 as long as it sticks to N x C x S x S x ... (where S is a spatial dim). You can have multiple channels in each image, or multiple batches in each file. I typically stick to a batch of 1 and increase/decrease with the batch_size param, but I don't think you need to -- for example, if you want images grouped into pre-determined batches.
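As a concrete sketch of the layout advice above, here is a minimal Python/h5py script that writes a 5-axis (N x C x D x H x W) dataset plus the list file the HDF5Data layer reads. The file names, shapes, and dataset contents are made up for illustration:

```python
import numpy as np
import h5py

# Two single-channel 3D volumes, axis order N x C x D x H x W.
volumes = np.random.rand(2, 1, 8, 32, 32).astype(np.float32)
labels = np.array([0., 1.], dtype=np.float32)

with h5py.File("train_0.h5", "w") as f:
    f.create_dataset("data", data=volumes)   # names must match the layer's tops
    f.create_dataset("label", data=labels)

# The HDF5Data layer's `source` is a text file with one .h5 path per line.
with open("train.txt", "w") as f:
    f.write("train_0.h5\n")
```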
@jeffdonahue @jmerkow @Tgaaly, I previously forgot to set convolution_param, so my last train was not 3D -- it was 2D, because I did not add kernel_size. I referred to Tgaaly's model layer above, but I got the error below.

[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 32:16: Non-repeated field "kernel_size" is specified multiple times.

I think my HDF5 dataset's dimensions are wrong. My dataset is (number of files, channel, width, length); I think I am supposed to set my dataset dimensions to (batch_size, channel, width, length, depth) and then add 3 kernel_size entries in convolution_param. Do you have any suggestions?
@ToruHironaka Branch: https://github.com/naibaf7/caffe (includes OpenCL support and 3D max pooling support). It's still work in progress, but if you can provide me your network and how the data is formatted, I might be able to prepare a working Python script for you.
@naibaf7
@ToruHironaka
Does anybody have an example of this? PS: I don't know anything about Python.
You have to use the HDF5 data format for 3D convolution in this branch of Caffe. I wrote a Python script to convert my CT image files (width, height, depth) into HDF5 files, and then I could train my 3D HDF5 datasets with this branch. It worked, but I have not gotten good results yet: my accuracy was very low, around 0.4-0.6, and loss was always high, around 1.5 or 1.6. I am now troubleshooting my image-to-HDF5 Python script. I tested that script by creating a 2D dataset and training in the official Caffe: I got an accuracy of about 0.87 and a loss of about 0.62. Then I used another person's image-to-HDF5 MATLAB script to create HDF5 datasets from the same images and trained exactly the same way; it got an accuracy of about 0.88 and a loss of about 0.2. I also created lmdb datasets from the same images using the Caffe conversion command (the one used for 2D images) and got an accuracy of about 0.93 and a loss of 0.35. So my image-to-HDF5 Python conversion script is clearly the worst, and I am tracking down my data conversion problem now. This branch of Caffe accepted 3D datasets in HDF5 and it worked, and many people have confirmed it -- you should try it out. If you need my help, let me know, because it helps with my data conversion problem. If your HDF5 datasets work, you have my answer.
@ToruHironaka
Has anyone successfully gotten ND pooling to work? ND convolution works without issues (from the master branch of BVLC Caffe).
@pietromaximoff Code is here: |
@jeffdonahue Hi, I am new to Caffe. I used the "nd-convolution" branch. It gives me the [...] error. How can I resolve it?
@naibaf7 I get the following error when I try to use the OpenCL branch of Caffe (with Python):

pycaffe.py:13: RuntimeWarning: to-Python converter for std::vector<int, std::allocator<int> > already registered; second conversion method ignored.

I have no idea whatsoever what this error is or how to resolve it. Do you know how I can fix it?
@ToruHironaka Do you have an example of how to train such a data format of CT images in Caffe with 3D convolution?
@paulcx I wrote a Python script for converting image files into HDF5 format, following this PR's thread above. I could train models, but I did not get good results, so I did something wrong.
@ToruHironaka Is HDF5 the only format that works with this PR? Did you try N-D max pooling together with N-D convolution?
@xjtuljy yes, HDF5 is the only format for this PR. I tried to train my 3D CNN with ND pooling using PRs #2442 and #2824. They ran and seemed to be working, but my results were bad, so I think I am doing something wrong with my training.
This PR extends convolution to N spatial axes, where Caffe's current convolution supports only 2D convolution (with 2 spatial axes: height and width). For 2D convolution, this implementation doesn't compare favorably with the existing one -- I haven't done much benchmarking, but I believe it's 25-75% slower on both CPU and GPU. So before this could be merged, I'd need to restore the existing implementation and use it as the default "engine" for 2D convolutions (but this more destructive version makes it easier to tell what I was thinking from looking at the diff). If anyone has any suggestions on improving the performance or thoughts on why it might be so much slower, I'd love to hear them.
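To make the im2col generalization concrete, here is a naive numpy sketch of N-D im2col (stride 1 only, no padding; illustrative, not the PR's optimized C++/CUDA code). Each output location's receptive field is flattened into one column, so convolution over any number of spatial axes reduces to a single matrix multiply against the flattened filters:

```python
import numpy as np
from itertools import product

def im2col_nd(data, kernel_shape):
    """Naive N-D im2col: data has shape (C, *spatial), stride 1, no padding."""
    C, *spatial = data.shape
    out_shape = [s - k + 1 for s, k in zip(spatial, kernel_shape)]
    K = int(np.prod(kernel_shape))
    cols = np.empty((C * K, int(np.prod(out_shape))), dtype=data.dtype)
    # one column per output location, rows ordered channel-major then offsets
    for col, start in enumerate(product(*[range(o) for o in out_shape])):
        window = tuple(slice(s, s + k) for s, k in zip(start, kernel_shape))
        cols[:, col] = data[(slice(None),) + window].reshape(-1)
    return cols

# 3D example: 2 channels, a 4x4x4 volume, 2x2x2 kernel, 5 output filters.
data = np.random.rand(2, 4, 4, 4)
cols = im2col_nd(data, (2, 2, 2))           # shape (2*8, 3*3*3) = (16, 27)
weights = np.random.rand(5, 16)             # filters flattened the same way
out = (weights @ cols).reshape(5, 3, 3, 3)  # the N-D convolution result
```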
Edit: benchmarking this on alexnet, it's about 33% slower:
@ master:
@ nd-convolution: