Conversation
Thanks for the contribution! Would you put the mobilenet v2 class in the gluon model zoo? python/mxnet/gluon/model_zoo/vision/mobilenet.py
I have put the MobileNetV2 class in the gluon vision model zoo. I don't have pretrained models, so this should be modified:
BTW, pylint is so strict that I have to disable some rules.
# under the License.

# -*- coding:utf-8 -*-
# pylint: disable= arguments-differ,line-too-long,invalid-name
Please provide reason for disabling pylint.
Please delete duplicate code in example, and follow exactly the convention in existing interfaces here:
https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/model_zoo/vision/mobilenet.py#L78-L160
Don't worry about the doc and pretrained model as long as it's not added to the api doc yet.
    return x


def get_mobilenetv2(w, pretrained=False, ctx=cpu(),
add a version argument to get_mobilenet.
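A hypothetical sketch of what a version-aware factory could look like. MobileNet and MobileNetV2 are stand-in classes here, and the real signature in python/mxnet/gluon/model_zoo/vision/mobilenet.py may differ:

```python
# Stand-ins for the actual Gluon HybridBlocks; only the dispatch
# mechanism is being illustrated, not the network definitions.
class MobileNet:
    def __init__(self, multiplier, **kwargs):
        self.multiplier = multiplier

class MobileNetV2:
    def __init__(self, multiplier, **kwargs):
        self.multiplier = multiplier

_versions = {1: MobileNet, 2: MobileNetV2}

def get_mobilenet(version, multiplier, pretrained=False, **kwargs):
    """Select the architecture by version instead of a separate v2 entry point."""
    if version not in _versions:
        raise ValueError('Unsupported MobileNet version %d' % version)
    net = _versions[version](multiplier, **kwargs)
    if pretrained:
        # No pretrained weights are available for v2 in this PR.
        raise ValueError('No pretrained model is available yet.')
    return net
```

This keeps a single public entry point, which is the pattern the reviewer is pointing at.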
    return net


def mobilenetv2_1_0(**kwargs):
provide the same variants as the v1 mobilenet
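The v1 MobileNet in the Gluon model zoo exposes width multipliers 1.0, 0.75, 0.5 and 0.25, so the matching v2 variants would look roughly like this (get_mobilenetv2 is a minimal stand-in here, not the real factory):

```python
# Stand-in for the real factory; it only records which variant was asked for.
def get_mobilenetv2(multiplier, **kwargs):
    return ('mobilenetv2', multiplier)

def mobilenetv2_1_0(**kwargs):
    return get_mobilenetv2(1.0, **kwargs)

def mobilenetv2_0_75(**kwargs):
    return get_mobilenetv2(0.75, **kwargs)

def mobilenetv2_0_5(**kwargs):
    return get_mobilenetv2(0.5, **kwargs)

def mobilenetv2_0_25(**kwargs):
    return get_mobilenetv2(0.25, **kwargs)
```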
# under the License.

# -*- coding:utf-8 -*-
# pylint: disable= arguments-differ,line-too-long,invalid-name
Don't use file-level lint disable. Please use inline lint disables to identify the lines that violate the lint rules and we can try to decide whether to fix or disable.
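For illustration, an inline disable pins the exception to a single line instead of the whole file (hybrid_forward is the usual Gluon method whose signature legitimately differs from the base class; shown here as a plain function for brevity):

```python
# Inline disable: only this line is exempt from the rule, so the rest
# of the file is still checked against the shared pylintrc.
def hybrid_forward(self, F, x):  # pylint: disable=arguments-differ
    return x
```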
# -*- coding:utf-8 -*-
# pylint: disable= arguments-differ,line-too-long,invalid-name
'''
MobileNetV2, implemented in Gluon.
delete duplicate and load from gluon model zoo instead.
from ....context import cpu
from ...block import HybridBlock
from ... import nn
isort made this change.
# Helpers
# pylint: disable= too-many-arguments
This is mobilenet v1 code, add this to pass the style check. Code not changed.
please remove and use our pylintrc as standard.
ping
strides = [1, 2] * 3 + [1] * 5 + [2, 1]
# pylint: disable= invalid-name
This is mobilenet v1 code, add this to pass the style check.
autopep8 made those changes.
Please use make pylint as the standard for checking code style. We have a 100-char-per-line limit.
Requested changes have been made. I am trying to make my commit history clearer by merging commits.
c_bgn = int(160 * multiplier)
c_end = int(320 * multiplier)
self.features.add(BottleNeck(c_in=c_bgn, c_out=c_end, t=6, s=1))
the construction looks like it could be refactored. can you use similar implementation from mobilenet v1?
Parameters
----------
ver : int, default 1
version. same below
strides = [1, 2] * 2 + [1] * 2 + [2] + [1] * 3 + [1] * 3 + [2] + [1] * 3

for c_bgn, c_end, t, s in zip(c_bgns, c_ends, ts, strides):
    self.features.add(BottleNeck(c_in=c_bgn, c_out=c_end, t=t, s=s))
The code here looks ugly, and it is very hard to follow the data flow.
Is this kind of code really necessary?
20 lines of self.features.add(BottleNeck...
is unnecessarily verbose. I personally prefer the more compact code.
Could you use the same variable naming as in V1?
I prefer compact code too, but shouldn't we keep our code simple and clear?
Some people use the word 'scary' to describe the code inside Caffe and Torch. Maybe it is helpful to keep the code simple and stupid.
This kind of code takes twice the time to write and more time to understand. As a programmer, I am a bit lazy, so personally I prefer the previous version.
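To make the trade-off concrete, here is a pure-Python sketch (no mxnet, width multiplier omitted) of the compact construction under discussion, using the config lists quoted earlier in this thread:

```python
# Build all 17 bottleneck configs once, then they can be added in a
# single loop instead of 17 explicit self.features.add(...) calls.
c_bgns = [32] + [16] + [24] * 2 + [32] * 3 + [64] * 4 + [96] * 3 + [160] * 3
c_ends = [16] + [24] * 2 + [32] * 3 + [64] * 4 + [96] * 3 + [160] * 3 + [320]
ts = [1] + [6] * 16
strides = [1, 2] * 2 + [1] * 2 + [2] + [1] * 3 + [1] * 3 + [2] + [1] * 3

# In the real model each dict would become
# self.features.add(BottleNeck(**cfg)) inside one loop.
configs = [dict(c_in=i, c_out=o, t=t, s=s)
           for i, o, t, s in zip(c_bgns, c_ends, ts, strides)]
```

A useful sanity check on this style is that consecutive blocks chain: each block's c_in equals the previous block's c_out, which is easy to assert but hard to see in 17 hand-written lines.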
self.features.add(nn.GlobalAvgPool2D())

self.output = nn.Conv2D(classes, 1, use_bias=False, prefix='pred_')
self.flatten = nn.Flatten(prefix='flat_')
don't introduce new field. flatten should be included in output.
t : int
    Layer expansion ratio.
s : int
    strides
s -> stride
While I don't feel comfortable having unchecked code, this is the way we apparently go for examples. Approved from my side.
self.features.add(nn.GlobalAvgPool2D())

self.output = nn.HybridSequential(prefix='output_')
self.output.add(
With self.output.name_scope(): the manual prefix can be skipped.
A name_scope is indeed necessary.
channels_group = [int(x * multiplier) for x in [16] + [24] * 2 + [32] * 3
                 + [64] * 4 + [96] * 3 + [160] * 3 + [320]]
ts = [1] + [6] * 16
strides = [1, 2] * 2 + [1] * 2 + [2] + [1] * 3 + [1] * 3 + [2] + [1] * 3
BTW, I am comparing http://ethereon.github.io/netscope/#/gist/d01b5b8783b4582a42fe07bd46243986
with
plot.gv.pdf
The strides are correct in the paper. We are applying stride 2 to the last block of the previous group of blocks.
BTW, we are using the output node as the top-most node in the model graph, while the output node corresponds to the last few lines in our code, so the graph looks upside down. Would it be more convenient if we reversed it?
@dwSun from 28^2x32 to 28^2x64, it should be stride 1. I am observing a 14^2x64 feature map, which corresponds to stride 2.
The input sizes and strides in this table are conflicting, but I am not sure which one is correct unless we compute the mult-adds and compare with the claims in the paper.
My mistake. Those two lines indeed don't make sense. Will correct it in the next commit.
For reference, this is the struct I am using:
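As a cross-check against the paper itself, the expanded per-block lists can be derived from the (t, c, n, s) table in the MobileNetV2 paper, where n is the number of repeats and only the first block of each group uses stride s; a sketch:

```python
# (expansion t, output channels c, repeats n, first-block stride s),
# as listed in the MobileNetV2 paper (arXiv:1801.04381), Table 2.
table = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]

ts, channels, strides = [], [], []
for t, c, n, s in table:
    for i in range(n):
        ts.append(t)
        channels.append(c)
        strides.append(s if i == 0 else 1)  # stride s only on the first repeat
```

Expanding this table should reproduce the flat lists used in the PR, which makes it easy to verify the stride discussion above.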
LGTM. Would you mind adding the variants to
@zhreshold please approve once you confirm that it's implemented correctly.
Unit tests finished.
self.features = nn.HybridSequential(prefix='features_')
with self.features.name_scope():
    _add_conv(self.features, int(32 * multiplier), kernel=3,
              stride=2, pad=1, active=False)
No relu for the first conv?
The MobileNetV2 paper does not mention a relu after the first conv2d.
Also, the first conv2d has 32 output channels while the first bottleneck has 16, which doesn't make sense.
I have tested 3 different model architectures:
- with relu in first conv2d
- without relu in first conv2d
- without relu and use 16 instead of 32 as output channels in first conv2d
The result is a bit confusing, since performance among those architectures looks similar.
Here is the result:
The dataset contains 100 classes and about 190k images, all gray-scale. This dataset may be too small, but I don't have enough resources to test on COCO or ImageNet.
Added relu in the first conv2d as requested.
fix pylint style check error. add docstring, disable some check. delete duplicate code in example.
Any plan to release the pretrained model? I tried to train the model on my own and it doesn't look so good... For more details, please look at
@Jing-Luo It's currently WIP.
* MobileNetV2
* add mobilenetv2 to gluon vision model zoo.
* code reformat use autopep8 and isort. fix pylint style check error. add docstring, disable some check. delete duplicate code in example.
* change ver to version.
* use exist helper.
* model code refactor.
* fix line too long.
* remove invalid name option.
* merge output operations.
* change variables name
* fix line too long.
* s -> stride
* add output name_scope
* remove relu in first conv2d.
* change block name from BottleNeck to LinearBottleneck.
* resolve conflict
* fix parameter name.
* correct strides
* add mobilenetv2 to unittest.
* use autopep8 to reformat code.
* add mobilenetv2 symbols to gluon.vision
* add relu in 1st conv2d.
* code refactor by using helpers.
* split mobilenet v1 and v2 apis.
Description
MobileNetV2 model from the "Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation" paper (https://arxiv.org/abs/1801.04381).

Checklist
Essentials
Passed code style checking (pylint)

Comments