Hey Guys,
It looks like there is a mistake in the design of the Inception-C block in the Inception-v4 network. The attached picture shows it clearly: a 3x1 convolutional layer was swapped with a 1x3 convolutional layer.
This should not matter much for performance or functionality, but it can cause bugs if you build your network from scratch and use the pretrained weights linked on this site.
Tips for other developers:
If you use this .prototxt file and these pretrained weights, you will be fine and your CNN will work as expected. However, if you build your Inception-v4 network from scratch according to the paper and then load these pretrained weights, Caffe will not be able to match the weights and will throw the following error:
Cannot copy param 0 weights from layer 'inception_c1_3x1_2'; shape mismatch. Source param shape is 448 384 3 1 (516096); target param shape is 512 448 3 1 (688128). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
To fix this problem and still be able to use the pretrained weights, build the Inception-C part of your Inception-v4 network according to the picture in this post.
I hope this saves someone a few hours of debugging.
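To see where the two shapes in the error come from, here is a rough Python sketch that derives the Caffe weight shape (num_output, channels, kernel_h, kernel_w) of the 3x1 layer under both layer orderings. The 384-channel branch input and the 448/512 output widths are assumptions read off the shapes in the error message, not taken from the .prototxt, and the helper names are made up for illustration.

```python
# Sketch only: channel counts and layer orderings below are assumptions
# inferred from the shapes in the Caffe error message.

def weight_shapes(in_channels, layers):
    """Return {kernel_name: (num_output, channels, kernel_h, kernel_w)}
    for a chain of conv layers applied one after another."""
    shapes = {}
    channels = in_channels
    for name, out_channels, kernel_h, kernel_w in layers:
        shapes[name] = (out_channels, channels, kernel_h, kernel_w)
        channels = out_channels  # the next layer consumes this layer's output
    return shapes


def num_weights(shape):
    """Total number of weights in a blob of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


BRANCH_INPUT = 384  # assumed output width of the preceding 1x1 convolution

# Ordering when the block is built from the paper: 1x3 first, then 3x1.
paper_order = [("1x3", 448, 1, 3), ("3x1", 512, 3, 1)]
# Ordering the pretrained weights appear to use: 3x1 first, then 1x3.
pretrained_order = [("3x1", 448, 3, 1), ("1x3", 512, 1, 3)]

paper = weight_shapes(BRANCH_INPUT, paper_order)
pretrained = weight_shapes(BRANCH_INPUT, pretrained_order)

# Target shape (your network) vs. source shape (the saved weights).
print("paper 3x1:     ", paper["3x1"], num_weights(paper["3x1"]))
print("pretrained 3x1:", pretrained["3x1"], num_weights(pretrained["3x1"]))
```

Under these assumptions the paper ordering gives 512 448 3 1 (688128) for the 3x1 layer and the swapped ordering gives 448 384 3 1 (516096), which are exactly the target and source shapes Caffe reports.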
Cheers,
d3lt4-papa
Code of one Inception-C block in Inception-v4 according to original paper
Code of one Inception-C block, if you want to be able to use these pretrained weights