
padding aware im2col and col2im functions #128

Merged: 8 commits merged into BVLC:dev on Feb 25, 2014

Conversation

mavenlin
Contributor

This PR is to replace #99

@mavenlin
Contributor Author

BTW, I wonder how net_speed_benchmark gets each forward/backward time under GPU mode. Aren't the kernel calls asynchronous? I added cuda_timer under util to measure kernel execution time.

@shelhamer
Member

Thanks for the rebased PR. I'm fine with keeping both the original and padding-aware versions for the reasons you gave:

1. the original im2col is faster when pad=0;
2. backward compatibility: if a model definition file uses a padding layer, it won't lose performance;

at least for now, but I defer to @Yangqing's judgement.
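
For context, a minimal sketch of what a padding-aware CPU im2col can look like (illustrative only, not necessarily the exact code merged here). The per-element bounds check is the overhead that makes the original version faster when pad=0; with pad=0 the check never fails, so the result matches the unpadded behavior:

```cpp
// Sketch of a padding-aware im2col: lays out ksize x ksize patches of a
// channels x height x width image into columns, treating out-of-bounds
// (padded) locations as zeros. Names and argument order are illustrative.
template <typename Dtype>
void im2col_cpu(const Dtype* data_im, const int channels,
    const int height, const int width, const int ksize, const int pad,
    const int stride, Dtype* data_col) {
  const int height_col = (height + 2 * pad - ksize) / stride + 1;
  const int width_col = (width + 2 * pad - ksize) / stride + 1;
  const int channels_col = channels * ksize * ksize;
  for (int c = 0; c < channels_col; ++c) {
    const int w_offset = c % ksize;
    const int h_offset = (c / ksize) % ksize;
    const int c_im = c / ksize / ksize;
    for (int h = 0; h < height_col; ++h) {
      for (int w = 0; w < width_col; ++w) {
        const int h_im = h * stride - pad + h_offset;
        const int w_im = w * stride - pad + w_offset;
        // This bounds check is what the original (pad-free) version avoids;
        // with pad = 0 it never fails but still costs a branch per element.
        data_col[(c * height_col + h) * width_col + w] =
            (h_im >= 0 && h_im < height && w_im >= 0 && w_im < width)
            ? data_im[(c_im * height + h_im) * width + w_im]
            : Dtype(0);
      }
    }
  }
}
```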

@sguada
Contributor

sguada commented Feb 20, 2014

@mavenlin net_speed_benchmark computes the GPU time by doing several forward/backward calls to each layer in order. It will be clearer if you look at the code: https://github.com/BVLC/caffe/blob/master/examples/net_speed_benchmark.cpp#L72

@mavenlin
Contributor Author

@sguada But I think all the kernel calls will be issued at once; the CPU code will move on without waiting for them to return.

@mavenlin
Contributor Author

@kloudkl I tried the cudaDeviceSynchronize approach, but it doesn't seem to work; the results are weird. That's why I implemented the GPU timer approach and packed it into the CudaTimer class under util. But I didn't see cudaDeviceSynchronize in the forward/backward functions, which is why I suspect the benchmark code won't work on the GPU. Please tell me if I'm wrong.
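
For readers following the timing discussion: kernel launches are asynchronous, so a plain CPU timer around Forward/Backward measures mostly launch time. A CUDA-event-based timer in the spirit of the CudaTimer mentioned above might look roughly like this (class and method names are illustrative, not the PR's actual API):

```cpp
#include <cuda_runtime.h>

// Minimal sketch of an event-based GPU timer. cudaEventSynchronize() blocks
// until the stop event has been reached on the device, so asynchronously
// launched kernels are fully counted.
class GpuTimer {
 public:
  GpuTimer() {
    cudaEventCreate(&start_);
    cudaEventCreate(&stop_);
  }
  ~GpuTimer() {
    cudaEventDestroy(start_);
    cudaEventDestroy(stop_);
  }
  void Start() { cudaEventRecord(start_, 0); }
  void Stop() { cudaEventRecord(stop_, 0); }
  float MilliSeconds() {
    cudaEventSynchronize(stop_);
    float ms = 0;
    cudaEventElapsedTime(&ms, start_, stop_);
    return ms;
  }
 private:
  cudaEvent_t start_, stop_;
};

// Usage sketch:
//   GpuTimer timer;
//   timer.Start();
//   /* launch kernels, e.g. layer->Forward(...) */
//   timer.Stop();
//   float elapsed_ms = timer.MilliSeconds();
```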

@sguada
Contributor

sguada commented Feb 21, 2014

@shelhamer
Member

@sguada I'm actually not that into the article. Much of their story is fixing the problems they gave themselves by running in virtualization. To me, Caffe scores a victory by achieving excellent performance without mucking about in drivers and kernel code.

The model search / hyperparameter optimization work by Bergstra and Adams, using Bayesian optimization and random search, is interesting in its own right though [1, 2, 3]. It could be a better way than humans turning knobs, but "graduate student descent" for conditioning has been the best performer so far.

[1] J. Snoek, H. Larochelle, and R. P. Adams. Practical Bayesian Optimization of Machine Learning Algorithms. In NIPS, 2012.
[2] J. Bergstra, D. Yamins, and D. D. Cox. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. In Proc. 30th International Conference on Machine Learning (ICML), 2013.
[3] Hyperopt, by J. Bergstra.

@shelhamer
Member

@mavenlin we have discussed this and decided padding-aware is always the right thing to do. Please refactor to use the padding-aware version everywhere, remove the non-padding-aware version, and drop the obsolete padding layer; then we will happily merge! Thanks again.

@sergeyk
Contributor

sergeyk commented Feb 25, 2014

@forresti should benefit from this change as well; please chime in if you have any concerns!

@forresti
Contributor

Very nice -- sounds useful. :)

@mavenlin
Contributor Author

@shelhamer a few questions:

  1. Do we remove the padding layer from the code, or just leave it there but mark it as obsolete?
  2. Are we keeping the im2col function as well as the padded_im2col function, or just replacing im2col?

I think I'll keep the name im2col, add a pad parameter to that function, and remove padded_im2col, as the name padded_xxx is quite ugly.
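
Under that plan, the unified declarations would presumably end up looking roughly like this (argument names and order are an assumption on my part; pad = 0 recovers the old unpadded behavior):

```cpp
// Hypothetical unified interface: padded_im2col/padded_col2im go away and
// the original names gain an explicit pad argument.
template <typename Dtype>
void im2col_cpu(const Dtype* data_im, const int channels,
    const int height, const int width, const int ksize, const int pad,
    const int stride, Dtype* data_col);

template <typename Dtype>
void col2im_cpu(const Dtype* data_col, const int channels,
    const int height, const int width, const int ksize, const int pad,
    const int stride, Dtype* data_im);
```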

@shelhamer
Member

  1. Remove the padding layer. It's completely unneeded now that we're padding aware, and keeping it around could only be confusing.
  2. We're replacing im2col (but keeping the name).

Note that the example prototxts and the reference model will need to be updated. This can be prepared and double-checked when this migrates from dev to master.

Thanks again for your work.

@mavenlin
Contributor Author

@shelhamer It is resolved and rebased on dev.
@kloudkl I removed the cuda_timer, which is no longer used; you may start a PR for the timer you wrote, which wraps both the CPU and GPU timers. I think it is necessary to have a convenient timer.

@shelhamer shelhamer self-assigned this Feb 25, 2014
shelhamer added a commit that referenced this pull request Feb 25, 2014
im2col and col2im learn to pad and padding layer is obsolete
@shelhamer shelhamer merged commit ae56141 into BVLC:dev Feb 25, 2014
@forresti
Contributor

Great work, all.

One question: How much faster is the original im2col() when pad=0?

@mavenlin
Contributor Author

@forresti You can refer to #99, the second table.

@forresti
Contributor

Ah, perfect. Looks good! Thanks for creating this, @mavenlin!

@kloudkl
Contributor

kloudkl commented Feb 25, 2014

@mavenlin was really insightful to spot the opportunities (this one and #119) to significantly optimize both memory usage and computation time!

shelhamer added a commit to shelhamer/caffe that referenced this pull request Feb 26, 2014
im2col and col2im learn to pad and padding layer is obsolete
shelhamer added a commit that referenced this pull request Feb 26, 2014
im2col and col2im learn to pad and padding layer is obsolete
shelhamer added a commit that referenced this pull request Feb 26, 2014
im2col and col2im learn to pad and padding layer is obsolete
@sguada
Contributor

sguada commented Feb 26, 2014

Sorry @mavenlin @shelhamer, I think we should leave the padding layer; otherwise it will break all the models I already have. Maybe we can think about removing it when we do the change in the protobuf and write the conversion script.

@shelhamer
Member

@sguada I don't think this should wait on that whole endeavour. This is a simple change to networks; perhaps we could write the first conversion script for this change and get a little practice for the great shift. However, we all have a lot happening right now, so I am going to keep this contribution in dev for the moment.

@mavenlin, your contribution will make it to master once models are updated. Thanks again!

@sguada
Contributor

sguada commented Feb 26, 2014

@shelhamer This is the error I get now with all my previous prototxts (~100) and pretrained models:
F0226 13:42:26.569927 1998293344 layer_factory.cpp:61] Unknown layer name: padding
*** Check failure stack trace: ***

In the future I think we should try to avoid removing old stuff and breaking Caffe with each PR.

We could mark it as DEPRECATED to indicate it will be removed in the future, and then remove it when we change the version and have a conversion tool.

Otherwise this interferes too much with development.

@shelhamer
Member

@sguada Yes, this does break current models. That is why this change is in dev and not master.

We have dev so we can work freely, experiment, and get the job done. Once changes survive and are proven, they will go to master. The whole point of this scheme is to keep master from breaking, so we agree on that.

You raise the good point that dev, while experimental, should try to be as unbroken as possible to not impede development. This means that deprecation is important not only in master but in dev too.

Luckily, this break is a relatively simple one: remove padding layers and add a padding field. We can fix this within dev, and even write a script to do it that will lay the groundwork for #169. That's why this will be done first rather than waiting on the much greater amount of work needed for the complete schema change.

Deprecation will be done from the beginning for changes relating to #169, so that a healthy level of order will be preserved in dev.

Thank you for bringing this up so we can improve the contributing workflow.
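
For concreteness, the break and its fix look roughly like this in a model definition: the standalone padding layer is removed and its pad amount moves onto the convolution layer that consumed it. (Layer and field names below follow the old v0 prototxt schema from memory and should be treated as approximate.)

```
# Before: an explicit padding layer feeding the convolution
layers {
  layer { name: "pad2" type: "padding" pad: 2 }
  bottom: "norm1"
  top: "pad2"
}
layers {
  layer { name: "conv2" type: "conv" num_output: 256 kernelsize: 5 }
  bottom: "pad2"
  top: "conv2"
}

# After: the padding layer is gone and conv2 pads for itself
layers {
  layer { name: "conv2" type: "conv" num_output: 256 kernelsize: 5 pad: 2 }
  bottom: "norm1"
  top: "conv2"
}
```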

@sguada
Contributor

sguada commented Feb 26, 2014

@shelhamer I agree with you, dev is there for that purpose, but still, once a PR is merged into dev, subsequent PRs depend on it. Any advice on how to handle that?

#169 is a great plan, and I'm totally for it, but maybe marking things as deprecated in dev would allow an easier transition.

Just opened a new issue for removing padding layers and adding a padding field: #170

@shelhamer
Member

@sguada Agreed: deprecation should be done from now on for merge into dev since further development depends on it. I should have recommended it for this merge.

#170 is a priority to smooth out dev.

shelhamer added a commit that referenced this pull request Feb 26, 2014
im2col and col2im learn to pad and padding layer is obsolete
@sguada
Contributor

sguada commented Feb 27, 2014

Congrats @mavenlin, this merge brings the memory requirements (#119) for training ImageNet down from 4167MB to 3475MB (updated), and from 3631MB to 3001MB (updated) when not using a test-net.

#119 (comment)

@shelhamer Well worth the change.

@Yangqing
Member

Kudos!


@shelhamer shelhamer mentioned this pull request Mar 18, 2014
mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
im2col and col2im learn to pad and padding layer is obsolete
coder-james pushed a commit to coder-james/caffe that referenced this pull request Nov 28, 2016
Fix windows build in appveyor by disabling parallel build