How to select the subset of input channels in each head? #9

prstrive · 2020-11-28T08:23:16Z

Thanks for your great work! In the paper, the input channels are first re-weighted by the SE block, and then the top-k subset is selected and passed to the normal convolution to produce the output of each head. But your code seems to pass all of the re-weighted input channels to the normal convolution, whose weight has shape (C_out // num_heads, C_in, k, k). If so, neither the amount of computation nor the number of parameters would decrease. I can't find the selection step in the code. Could you please explain?

hellozhuo · 2020-12-02T18:57:52Z

Yes, all the weights are passed to the normal convolution, but some of them are set to 0. Please see https://github.com/zhuogege1943/dgc/blob/ba074863dc289f5875202288aa286ca22b94e15b/layers.py#L123
The zeroed weights do not contribute to the output. This is done only because it is convenient for training.
At test time, the zero-weight filters can be pruned without affecting the results.
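To illustrate why zeroing and pruning give the same result, here is a minimal sketch in plain Python. It is not the code from the linked `layers.py`. It assumes the zeroing acts as a per-channel mask on the re-weighted input of one head, and it evaluates the convolution at a single spatial position. The sizes, the `saliency` scores, and the helper `conv_at_patch` are all hypothetical.

```python
import random

random.seed(0)

# Illustrative sizes only (not taken from the repository).
C_IN, C_OUT_PER_HEAD, K, TOP_K = 8, 4, 3, 3

# One K x K input patch per channel, and one head's filters with shape
# (C_OUT_PER_HEAD, C_IN, K, K), stored as nested lists.
patch = [[[random.gauss(0, 1) for _ in range(K)] for _ in range(K)]
         for _ in range(C_IN)]
weight = [[[[random.gauss(0, 1) for _ in range(K)] for _ in range(K)]
           for _ in range(C_IN)]
          for _ in range(C_OUT_PER_HEAD)]

# Hypothetical per-channel scores standing in for the SE-style output of a head.
saliency = [random.random() for _ in range(C_IN)]

# Indices of the TOP_K highest-scoring channels, in ascending channel order.
ranked = sorted(range(C_IN), key=lambda c: saliency[c], reverse=True)
kept = sorted(ranked[:TOP_K])


def conv_at_patch(x, w):
    """Response of every filter in w at one spatial position of input x."""
    return [
        sum(w[o][c][i][j] * x[c][i][j]
            for c in range(len(x)) for i in range(K) for j in range(K))
        for o in range(len(w))
    ]


# Training-style: re-weight every channel, zero the unselected ones,
# then run the dense convolution over all C_IN channels.
mask = [saliency[c] if c in kept else 0.0 for c in range(C_IN)]
masked_x = [[[mask[c] * v for v in row] for row in patch[c]]
            for c in range(C_IN)]
dense_out = conv_at_patch(masked_x, weight)

# Inference-style: keep only the selected channels and slice the filters
# along the input-channel axis to match.
pruned_x = [[[saliency[c] * v for v in row] for row in patch[c]] for c in kept]
pruned_w = [[weight[o][c] for c in kept] for o in range(C_OUT_PER_HEAD)]
pruned_out = conv_at_patch(pruned_x, pruned_w)

# Multiply-accumulate counts per output position for the two variants.
dense_macs = C_OUT_PER_HEAD * C_IN * K * K
pruned_macs = C_OUT_PER_HEAD * TOP_K * K * K

print(all(abs(a - b) < 1e-9 for a, b in zip(dense_out, pruned_out)))
print(dense_macs, pruned_macs)
```

Under these assumptions the two outputs match, because a zeroed channel contributes nothing to the sum. The dense variant still spends multiply-accumulates on the zeroed channels, which is why the saving only shows up once those channels are actually removed.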
