First attempt at adding a mobilenet backbone #242
Conversation
Just received an email about this thread. I put an Apache License 2.0 in my repo. Feel free to use my repo and contact me if you run into any problems :). |
Hello there, thank you for your work, guys! Since the review is taking some time, I tried your implementation. I succeeded in running your code, but my 'loss_classifier' is always stuck at 0.5~0.6 and never decreases... Does it behave like that for you too? Any idea on a possible reason? Maybe if we debug it we can make things progress! |
I think the net architecture isn't good yet (e.g. which backbone layers are used as feature planes). Unless someone provides a better starting point, one could take a look at published mobilenet R-CNNs, but I haven't gotten around to doing so. |
@wat3rBro do you have any tips or advice for a better mobile architecture? |
I was able to get the net to train properly by using the last layers with 24, 32, 64 and 1280 channels as feature planes. |
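As a rough sketch of what selecting those feature planes could look like, assuming torchvision's `mobilenet_v2` (the thread itself uses https://github.com/tonylins/pytorch-mobilenet-v2) and assuming the stage indices below are where the 24-, 32-, 64- and 1280-channel blocks end:

```python
import torch
from torch import nn
from torchvision.models import mobilenet_v2


class MobileNetV2FPNBody(nn.Module):
    """Return the feature maps ending with 24, 32, 64 and 1280 channels,
    so they can be fed to an FPN the same way the ResNet stages are."""

    # Indices into mobilenet_v2().features after which each channel count is
    # reached (assumed: 3 -> 24ch, 6 -> 32ch, 10 -> 64ch, 18 -> 1280ch).
    RETURN_AFTER = (3, 6, 10, 18)

    def __init__(self):
        super().__init__()
        # torchvision's MobileNetV2 trunk (pretrained flag per the API of that era)
        self.features = mobilenet_v2(pretrained=True).features

    def forward(self, x):
        outputs = []
        for i, block in enumerate(self.features):
            x = block(x)
            if i in self.RETURN_AFTER:
                outputs.append(x)
        return outputs  # channels (24, 32, 64, 1280) for the FPN lateral convs


# quick shape check
if __name__ == "__main__":
    feats = MobileNetV2FPNBody()(torch.randn(1, 3, 224, 224))
    print([f.shape for f in feats])
```

The four maps come out at strides 4, 8, 16 and 32, which matches what the FPN expects from the ResNet stages.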
@kimnik6 awesome! What accuracy did you obtain with such a configuration? |
It's still training, right now it's at a box AP of 23.8. However, on a GTX1070 it still takes 0.095s/image for inference (compared to 0.132s/image for the Res50_FPN). |
Maybe the benefits will be more noticeable when running it on the CPU? But anyway, very interesting findings! |
Finished training, box AP is at 26.5. I'll try it without the FPN to see if it gets any faster. The RPN seems to be dominating the runtime of the whole net right now. |
I think dropping the FPN and adapting which backbone layers are used as feature layers would be good next steps. |
@t-vi: do you already have an idea of how you would do it? For a first attempt based on the ResNetC4 structure, I took the last of the MobileNet layers with 96 channels as my output. I have not completely understood the differences/reasons for choosing the ROI_BOX_HEAD, so I just continued using the FPN2MLPFeatureExtractor. My config file looks like this:
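A minimal, hypothetical sketch of the overrides being described, expressed through the yacs config that maskrcnn-benchmark uses; `MNV2-C4` is an invented name that would still need to be registered as a `CONV_BODY`, so this is only an illustration of the setup, not the actual file from the comment:

```python
# Hypothetical override list mirroring the described setup: a single-feature-map
# MobileNetV2 body (last 96-channel block, stride 16) with the 2-MLP box head.
# "MNV2-C4" is an invented backbone name; the stride-16 values are assumptions.
from maskrcnn_benchmark.config import cfg

cfg.merge_from_list([
    "MODEL.BACKBONE.CONV_BODY", "MNV2-C4",
    "MODEL.BACKBONE.OUT_CHANNELS", 96,
    "MODEL.RPN.ANCHOR_STRIDE", (16,),
    "MODEL.ROI_BOX_HEAD.FEATURE_EXTRACTOR", "FPN2MLPFeatureExtractor",
    "MODEL.ROI_BOX_HEAD.POOLER_SCALES", (1.0 / 16,),
])
```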
It's still training, currently at a box mAP of 19.9 at 0.049s/image. The classifier and box_reg losses look good; however, the objectness and rpn_box_reg losses are rather high. If anyone has any ideas for a better structure, let me know. Also, I retrained the MobileNet with FPN with a larger batch size and a slightly different schedule, and got it up to a box mAP of 29.8 (still 0.095s/image). |
@kimnik6 I believe that you might be using only the first layer of the backbone. I'd recommend taking the config from R-50-C4 and slightly modifying the backbone so that it does not return the 4 feature levels, only the last feature map. |
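A minimal sketch of that suggestion, assuming torchvision's `mobilenet_v2` and assuming the last 96-channel block ends at index 13 of `model.features`; the body returns a single stride-16 feature map as a one-element list, the way the other backbones in the repo hand their feature maps to the RPN:

```python
import torch
from torch import nn
from torchvision.models import mobilenet_v2


class MobileNetV2C4Body(nn.Module):
    """C4-style body: run MobileNetV2 only up to its last 96-channel block
    (assumed to be features[13]) and return that single stride-16 map."""

    def __init__(self):
        super().__init__()
        self.body = mobilenet_v2(pretrained=True).features[:14]
        self.out_channels = 96  # channel count of the returned map

    def forward(self, x):
        return [self.body(x)]


# quick shape check: for a 224x224 input the single map should be 96 x 14 x 14
print(MobileNetV2C4Body()(torch.randn(1, 3, 224, 224))[0].shape)
```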
I advise strictly preferring @fmassa's advice over anything I say, as he has a lot more insight into this. With that in mind, I looked a bit at TF's mobile detection model: it extends the MobileNet (v1 in their case) with four extra blocks that each halve the number of channels and the resolution. So there, you start with 768 channels, and the first of these blocks takes 768 input channels, produces 384 output channels through a 192-channel bottleneck, and halves the input resolution. |
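A minimal sketch of one such extra block in plain PyTorch; the channel numbers follow the description above, and the choice of a plain (rather than depthwise) 3x3 convolution is an assumption:

```python
from torch import nn


def extra_block(in_ch, out_ch):
    """One extra downsampling block: a 1x1 bottleneck to out_ch // 2 channels,
    then a stride-2 3x3 conv up to out_ch, halving the spatial resolution."""
    mid = out_ch // 2
    return nn.Sequential(
        nn.Conv2d(in_ch, mid, kernel_size=1, bias=False),
        nn.BatchNorm2d(mid),
        nn.ReLU6(inplace=True),
        nn.Conv2d(mid, out_ch, kernel_size=3, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU6(inplace=True),
    )


# e.g. 768 -> 384 channels via a 192-channel bottleneck, resolution halved
block = extra_block(768, 384)
```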
@fmassa That was my initial guess as well, but now I think it simply is higher than for the other models, as I checked the model multiple times and the results aren't actually too bad (right now at 26.8 mAP with 0.049s/image, waiting for the learning rate to be dropped one last time). @t-vi Could you post the link to the model/script you're referring to? At what point are the extra blocks inserted/added? |
I looked at the |
Alright, training finished at 27.8 mAP (without FPN). I will leave it at that for now. I'll try to upload the weights soon. |
Awesome work @kimnik6! Do you have a github branch I can update this PR with or should I copy the files? |
You can just copy the code. Thanks for the suggestion @t-vi, I uploaded them under https://github.com/kimnik6/maskrcnn-benchmark-mobilenet/releases. |
hi @kimnik6 |
Without the FPN, multi-GPU didn't work, so I only trained it on one GPU. The initial weights I used are from https://github.com/tonylins/pytorch-mobilenet-v2, you can find my trained models at https://github.com/kimnik6/maskrcnn-benchmark-mobilenet/releases. |
hi @kimnik6, I trained the mobile-net with RetinaNet, and also trained it from scratch using GN and FPN, but the speed is not fast. |
Concerning the speed, the rest of the network really slows down the process. The fastest I got the MobileNet to work was ~20FPS without the FPN (without changing the image dimensions) |
hi @fmassa |
@zimenglan-sysu-512 did you change anything in the training config? |
hi @fmassa |
Ok. I'll let @newstzpz comment on the expected numbers for those models |
Hi @zimenglan-sysu-512, thanks for experimenting with our model. The number you got is expected, as the model was targeted at mobile and is not GPU friendly. We have models that, although still not very GPU friendly, run at 0.188s with an mAP of 33.5, measured in Caffe2 on a V100. Please see here for more details. Our model supports various efficient building blocks and architectures, including the one in your PR. In addition, we also support using efficient building blocks in the RPN and roi heads (bbox, mask etc.). This is important for fast speed on mobile. Here is an example of how to define the architecture:
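The snippet referenced above is not reproduced in this thread; purely as a hypothetical illustration of the idea of describing the backbone, RPN and roi heads as lists of efficient building blocks (every field name and value below is invented for this sketch, not the actual schema being referenced):

```python
# Invented, illustrative schema only: each entry is
# (block type, output channels, number of repeats, stride),
# grouped by the part of the detector that consumes it.
ARCH = {
    "backbone": [
        ("inverted_residual_k3", 24, 2, 2),
        ("inverted_residual_k3", 32, 3, 2),
        ("inverted_residual_k5", 64, 4, 2),
        ("inverted_residual_k5", 96, 3, 1),
    ],
    "rpn": [
        ("inverted_residual_k3", 96, 1, 1),
    ],
    "bbox": [
        ("inverted_residual_k3", 160, 2, 2),
    ],
}
```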
We provide a few baseline architectures, please feel free to check them out here and all the supported building blocks here. |
This is obsolete now. |
As discussed, this is a very rough attempt at putting in a mobilenet backbone. The backbone takes its feature maps from `model.features` (which I made into a ModuleList instead of a Sequential; you could probably leave it as a Sequential and it would work too, but I don't intend to call it directly, so this seems cleaner), indexing into the `model.features` list. I must say that I absolutely loved how the training works with the stock MSCOCO setup out of the box and is set up to be extensible!