
Splitting Tensors/Grouped Convolutions #3

Closed
pGit1 opened this issue Jul 26, 2017 · 3 comments
pGit1 commented Jul 26, 2017

@titu1994 ,

Awesome work!

In figure 3 of the paper:

[Figure 3 from the paper: the three forms (a), (b), (c) of the aggregated transform block]

Can you shed light on how these three different forms of the aggregated transform are equivalent? From looking at your code, it appears you chose to implement method (b). Is that accurate? I also saw another implementation that uses Lambda layers to do something closer to item (c). That is, if the previous layer's channel dimension is 64-d, for instance, and C = 32 (cardinality groups), then each of the 32 different convolutions receives 64 / 32 = 2 feature maps as input. These feature maps do not overlap, and summed across the cardinality groups they always equal 64-d in our example.

How is this the same as having 32 different convolutions all with 64-d channels as input? Your thoughts would be much appreciated!

EDIT: Other implementation - https://gist.github.com/mjdietzx/0cb95922aac14d446a6530f87b3a04ce
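For what it's worth, the non-overlapping split described above can be sketched in NumPy. This is a toy illustration of method (c)'s channel slicing, not code from either repository; the 64-d width and C = 32 come from the example in the question:

```python
import numpy as np

# Toy input: batch of 1, 8x8 spatial, 64 channels (channels-last, as in Keras)
x = np.random.randn(1, 8, 8, 64)
cardinality = 32                          # C = 32 cardinality groups
group_size = x.shape[-1] // cardinality   # 64 / 32 = 2 feature maps per group

# Method (c): each group's convolution sees only its own non-overlapping slice
groups = [x[..., c * group_size:(c + 1) * group_size] for c in range(cardinality)]

assert all(g.shape[-1] == group_size for g in groups)     # 2 channels per group
assert sum(g.shape[-1] for g in groups) == x.shape[-1]    # slices sum to 64-d
# Concatenating the slices reconstructs the input exactly (no overlap, no gap)
assert np.array_equal(np.concatenate(groups, axis=-1), x)
```

In the Lambda-layer implementation each slice is then fed to its own small convolution, and the 32 outputs are merged back together.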

titu1994 (Owner) commented

I believe they are equivalent because the model built using (b) has exactly the same number of parameters as the one built with (c). (c) is simply a more succinct implementation, but achieves the same thing.

However, if one tries to port weights from the Torch code to Keras, they would find that the Lambda-layer version fits, whereas this one would not. I will look into it and update my code to match the Lambda-layer version when possible, as weight translation would be easier.
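The parameter-count claim can be checked by hand. A small sketch, assuming a 3x3 stage with 4 channels per group (i.e. a 128-d width split into C = 32 groups; these are illustrative numbers, not taken from this thread) and ignoring biases:

```python
# Parameter count of one 3x3 stage, ignoring biases, for:
#   (b) C separate 3x3 convs, each mapping its own d-channel slice to d channels
#   (c) one grouped 3x3 conv with C groups over the same total width
k = 3   # kernel size
C = 32  # cardinality
d = 4   # channels per group (128-d total width / 32 groups)

params_b = C * (d * k * k * d)             # sum over the C separate paths
params_c = C * ((C * d // C) * k * k * d)  # grouped conv: each filter sees width/C channels

assert params_b == params_c == 4608

# A single dense 3x3 conv over the full 128-d width would cost C times more:
params_dense = (C * d) * k * k * (C * d)
assert params_dense == C * params_b
```

So (b) and (c) describe the same set of weights, just arranged differently, which is also why porting weights between the two layouts is a reshaping problem rather than a retraining problem.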


pGit1 commented Jul 27, 2017

Interesting, thank you so much! Just making sure I understand all the code that is going on. It seems the Lambda layer is indeed doing what item (c) in Fig. 3 shows. Awesome to be on the right track! Thanks again!

@pGit1 pGit1 closed this as completed Jul 27, 2017
gabriel19852005 commented

Just found this page: microsoft/MMdnn#58. It may help with the weight conversion.
