Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

MobileNetV2 #9614

Merged
merged 24 commits into from
Feb 21, 2018
Merged

MobileNetV2 #9614

merged 24 commits into from
Feb 21, 2018

Conversation

dwSun
Copy link
Contributor

@dwSun dwSun commented Jan 29, 2018

Description

MobileNetV2 model from the
"Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation" <https://arxiv.org/abs/1801.04381>_ paper.

Checklist

Essentials

  • Passed code style checking (pylint)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Code is well-documented:
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Comments

  • Not sure this is the correct implement, but this model did works.

@dwSun dwSun requested a review from szha as a code owner January 29, 2018 14:56
Copy link
Member

@szha szha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Would you put the mobilenet v2 class in the gluon model zoo? python/mxnet/gluon/model_zoo/vision/mobilenet.py

@szha szha self-assigned this Jan 29, 2018
@dwSun
Copy link
Contributor Author

dwSun commented Jan 30, 2018

I have put the MobileNetV2 class in the gluon vision model zoo.

I don't have pretrained models, so this should be modified:

We provide pre-trained models for all the listed models.
These models can constructed by passing pretrained=True:

BTW, pylint is so strict that I have to disable some rules.

# under the License.

# -*- coding:utf-8 -*-
# pylint: disable= arguments-differ,line-too-long,invalid-name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide reason for disabling pylint.

Copy link
Member

@szha szha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please delete duplicate code in example, and follow exactly the convention in existing interfaces here:
https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/model_zoo/vision/mobilenet.py#L78-L160

Don't worry about the doc and pretrained model as long as it's not added to the api doc yet.

return x


def get_mobilenetv2(w, pretrained=False, ctx=cpu(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a version argument to get_mobilenet.

return net


def mobilenetv2_1_0(**kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provide the same variants as the v1 mobilenet

# under the License.

# -*- coding:utf-8 -*-
# pylint: disable= arguments-differ,line-too-long,invalid-name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use file-level lint disable. Please use inline lint disables to identify the lines that violate the lint rules and we can try to decide whether to fix or disable.

# -*- coding:utf-8 -*-
# pylint: disable= arguments-differ,line-too-long,invalid-name
'''
MobileNetV2, implemented in Gluon.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete duplicate and load from gluon model zoo instead.

from ....context import cpu
from ...block import HybridBlock
from ... import nn

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iosrt made this change.


# Helpers
# pylint: disable= too-many-arguments
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mobilenet v1 code, add this to pass the style check. Code not changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove and use our pylintrc as standard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

strides = [1, 2] * 3 + [1] * 5 + [2, 1]
# pylint: disable= invalid-name
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mobilenet v1 code, add this to pass the style check.
autopep8 made those changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use make pylint as the standard for checking code style. we have a 100 char-per-line limit.

@dwSun
Copy link
Contributor Author

dwSun commented Jan 31, 2018

Requested changes have been made. I am trying to make my code more clear by merging commit logs.
Seems the situation not going as I wish.


c_bgn = int(160 * multiplier)
c_end = int(320 * multiplier)
self.features.add(BottleNeck(c_in=c_bgn, c_out=c_end, t=6, s=1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the construction looks like it could be refactored. can you use similar implementation from mobilenet v1?

Parameters
----------
ver : int, default 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version. same below

strides = [1, 2] * 2 + [1] * 2 + [2] + [1] * 3 + [1] * 3 + [2] + [1] * 3

for c_bgn, c_end, t, s in zip(c_bgns, c_ends, ts, strides):
self.features.add(BottleNeck(c_in=c_bgn, c_out=c_end, t=t, s=s))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code here looks ugly and very hard to follow the data flow.
Is this kind of code really necessary?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

20 lines of self.features.add(BottleNeck... is unnecessarily verbose. I personally prefer the more compact code.

Could you use the same variable naming as in V1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer compact code too. But shouldn't we keep our code simply and clear.
Some people use the word 'scary' to describe code inside caffe and torch. Maybe it will be helpful to keep the code simple and stupid.
This kind of code take twice the time to write and more time to understand. As a programmer, I am a bit lazy. So, personally, I prefer previous version.

self.features.add(nn.GlobalAvgPool2D())

self.output = nn.Conv2D(classes, 1, use_bias=False, prefix='pred_')
self.flatten = nn.Flatten(prefix='flat_')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't introduce new field. flatten should be included in output.

t : int
Layer expansion ratio.
s : int
strides
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@szha szha Feb 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s -> stride

Copy link
Contributor

@marcoabreu marcoabreu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I don't feel comfortable having unchecked code, this is the way we apparently go for examples. Approved from my side.

self.features.add(nn.GlobalAvgPool2D())

self.output = nn.HybridSequential(prefix='output_')
self.output.add(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with self.output.name_scope():
and manual prefix can be skipped.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A name_scope is indeed necessary.

channels_group = [int(x * multiplier) for x in [16] + [24] * 2 + [32] * 3
+ [64] * 4 + [96] * 3 + [160] * 3 + [320]]
ts = [1] + [6] * 16
strides = [1, 2] * 2 + [1] * 2 + [2] + [1] * 3 + [1] * 3 + [2] + [1] * 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a mistake in paper:
screen shot 2018-02-13 at 5 54 45 pm
Therefore the strides here should be:

strides = [1, 2] * 2 + [1] * 3 + [2] + [1] * 2 + [1] * 3 + [2] + [1] * 3

More inputs and eyes are welcome if you find this not correct @dwSun @szha

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@dwSun dwSun Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strides are correct in paper. We are applying stride 2 to last block of previous group of blocks.

BTW, we are using output node as top-most node in model graph. While the output node should be the last few lines in our code. So the graph is looking like upside down. Should it be more convenient if we revert it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dwSun from 28^2x32 to 28^2x64, it should be stride 1. I am observing 14^2x64 feature map, which corresponds to stride 2.
The input sizes and strides in this table is conflicting, but I am not sure which one is correct unless we compute the Mult-add and compare with the claims in paper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake. Those 2 lines indeed can't make sense. Will correct it in next commit.

@dwSun
Copy link
Contributor Author

dwSun commented Feb 15, 2018

for reference, this is the struct I am using:

input shape     stride  output shape
# conv2d
224x224x3       2       112x112x32

# bottleneck
112x112x32      1       112x112x16

# bottleneck
112x112x16      2       56x56x24
56x56x24        1       56x56x24

# bottleneck
56x56x24        2       28x28x32
28x28x32        1       28x28x32
28x28x32        1       28x28x32

# bottleneck
28x28x32        1       28x28x64
28x28x64        1       28x28x64
28x28x64        1       28x28x64
28x28x64        1       28x28x64

# bottleneck
28x28x64        2       14x14x96
14x14x96        1       14x14x96
14x14x96        1       14x14x96

# bottleneck
14x14x96        2       7x7x160
7x7x160         1       7x7x160
7x7x160         1       7x7x160

# bottleneck
7x7x160         1       7x7x320

# conv2d
7x7x320         1       7x7x1280

# avgpool
7x7x1280        _       1x1x1280

# the output
1x1x1280        _       1x1xk

@szha
Copy link
Member

szha commented Feb 15, 2018

LGTM. Would you mind adding the variants to tests/python/unittest/test_gluon_model_zoo.py? Click the unchecked checkbox at the top of this PR once you're done.

@szha
Copy link
Member

szha commented Feb 15, 2018

@zhreshold please approve once you confirm that it's implemented correctly.

@dwSun
Copy link
Contributor Author

dwSun commented Feb 15, 2018

unittest finished.

self.features = nn.HybridSequential(prefix='features_')
with self.features.name_scope():
_add_conv(self.features, int(32 * multiplier), kernel=3,
stride=2, pad=1, active=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no relu for the first conv?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MobileNetV2 paper not mentioned relu after first conv2d.
Also the channels of first conv2d is 32, while the channels of first bottleneck is 16, this doesn't sense.
I have tested 3 different model architectures:

  • with relu in first conv2d
  • without relu in first conv2d
  • without relu and use 16 instead of 32 as output channels in first conv2d

The result is a bit confusing, since performance among those architectures looks similar.
Here is the result:
test

The dataset contains 100 classes, 19W pics, all gray-scale image. This dataset maybe too small. But I don't have enough resource to test on coco or imagenet.

@dwSun
Copy link
Contributor Author

dwSun commented Feb 18, 2018

Add relu in first conv2d as requested.

dwSun added 4 commits February 21, 2018 20:58
fix pylint style check error.
add docstring, disable some check.
delete duplicate code in example.
@szha szha merged commit 5ca6efe into apache:master Feb 21, 2018
@Jing-Luo
Copy link

Any plan to release the pretrained model? I try to train the model by my own and it looks not so well...

For more details, please look at

https://discuss.gluon.ai/t/topic/5295

@szha
Copy link
Member

szha commented Mar 20, 2018

@Jing-Luo It's currently WIP.

rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
* MobileNetV2

* add mobilenetv2 to gluon vision model zoo.

* code reformat use autopep8 and isort.
fix pylint style check error.
add docstring, disable some check.
delete duplicate code in example.

* change ver to version.

* use exist helper.

* model code refactor.

* fix line too long.

* remove invalid name option.

* merge output operations.

* change variables name

* fix line too long.

* s -> stride

* add output name_scope

* remove relu in first conv2d.

* change block name from BottleNeck to LinearBottleneck.

* resolve conflict

* fix parameter name.

* correct strides

* add mobilenetv2 to unittest.

* use autopep8 to reformat code.

* add mobilenetv2 symbols to gluon.vision

* add relu in 1st conv2d.

* code refactor by using helpers.

* split mobilenet v1 and v2 apis.
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
* MobileNetV2

* add mobilenetv2 to gluon vision model zoo.

* code reformat use autopep8 and isort.
fix pylint style check error.
add docstring, disable some check.
delete duplicate code in example.

* change ver to version.

* use exist helper.

* model code refactor.

* fix line too long.

* remove invalid name option.

* merge output operations.

* change variables name

* fix line too long.

* s -> stride

* add output name_scope

* remove relu in first conv2d.

* change block name from BottleNeck to LinearBottleneck.

* resolve conflict

* fix parameter name.

* correct strides

* add mobilenetv2 to unittest.

* use autopep8 to reformat code.

* add mobilenetv2 symbols to gluon.vision

* add relu in 1st conv2d.

* code refactor by using helpers.

* split mobilenet v1 and v2 apis.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants