-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SqueezeNet implementation for Faster RCNN #345
Comments
Please update if you get this working. Thanks |
@huvers ,I tried many possible methods of net architecture,none of them is effective, do you figure it out? |
@mengzhangjian - Not yet, I'm still working on it too. I'll put up a $100 bounty through paypal to whomever can get this working :) |
@mengzhangjian I am working on this right now, what is a result you achieve on which dataset? |
@karaspd ,I use it for face detection in WIDER face dataset, in the above config, I got nothing. |
@mengzhangjian it is a little bit strange. I am training and testing on kitti dataset with 3 different object difficulties. I used the squeezenet v1.1 and I placed roi-pool layer after fire9 with pooled_w: 7 |
@karaspd Would you mind sharing your train.prototxt for SqueezeNetv1.1 ? |
@karaspd Hrmmm, that wouldn't run for my case (I modified it for 5 classes). How did you come up with "rpn_cls_score" to output 120? |
if you want to consider 60 anchors you need to change scales and aspect ratio in rpn codes. However you can ignore it and just consider 9 anchors like before and test if you have any problem. change 120 to 18 and 240 to 36. |
@karaspd Yes, I've tried changing them back to the usual 18 and 36 (as well as the rpn_cls_prob_reshape layer back to 18), but I'm still seeing no convergence in learning, and an error on the validation set: File "/home/huvers/PycharmProjects/py-faster-rcnn/tools/../lib/datasets/voc_eval.py", line 148, in voc_eval The same data works just fine training with VGG16. Regardless, thanks for your help. |
I saw this error before, are you sure you are using a right pre-trained network? squeezenetv1.1? |
Yes, it is Squeezenetv1.1. So I discovered it was actually the learning rate that was giving it issues. Originally, lr = .001, which was apparently too much. Reducing this to lr = .0001 led to convergence and the previous error disappeared. Currently running a large number of iterations to see how well it ultimately performs on my custom dataset. |
That's great, let me know about your results. |
@karaspd Thank you for your contribution,I will test for my face data and tell you the result. |
Hi @mengzhangjian @huvers, how is the training going? Did you get any interesting results? |
Hi @karaspd My results haven't been very good so far. On my custom data (5 classes) VGG16 would achieve map=.80, while Squeezenet v1.1 can only do map=.55. I have to turn the confidence down very low for testing, and get a large number of false positives as a result. I've experimented by making pooled_w,h = 13, which did not work well (map=.38), as well as adjusting the number of rpn convolution features without success. Have you had better success? |
Hi @huvers, I am getting a large number of false positives same as you and the average map that I got on kitti dataset is 0.64. |
@karaspd ,Sorry for that.I just took holidays a few days, I test it for my face data and the detection result is effective,I will later get concrete result. Thanks |
Recently, I'm working on the integration of squeezenet and faster-rcnn. After checking your model @mengzhangjian ,I find out that after the roi-pooling layer ,you didn't add extra fully connected layer, would this decrease the detection performance? |
Hi @absorbguo @mengzhangjian were you able finally able to integrate SqueezeNet with faster RCNN ? is there a place where I can look at prototxt files ? |
@siddharthm83 am trying to use SqueezeNet with Faster RCNN.The following is my training prototxt.I modify the roi_pool layer:pooled_w:13,pooled_h:13.And the training process has no mistakes. While the detection result is very bad,can anyone help me solve this?
name: "VGG_ILSVRC_16_layers"
layer {
name: 'input-data'
type: 'Python'
top: 'data'
top: 'im_info'
top: 'gt_boxes'
python_param {
module: 'roi_data_layer.layer'
layer: 'RoIDataLayer'
param_str: "'num_classes': 2"
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 96
kernel_size: 7
stride: 2
weight_filler {
type: "xavier"
}
}
}
layer {
name: "relu_conv1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fire2/squeeze1x1"
type: "Convolution"
bottom: "pool1"
top: "fire2/squeeze1x1"
convolution_param {
num_output: 16
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire2/relu_squeeze1x1"
type: "ReLU"
bottom: "fire2/squeeze1x1"
top: "fire2/squeeze1x1"
}
layer {
name: "fire2/expand1x1"
type: "Convolution"
bottom: "fire2/squeeze1x1"
top: "fire2/expand1x1"
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire2/relu_expand1x1"
type: "ReLU"
bottom: "fire2/expand1x1"
top: "fire2/expand1x1"
}
layer {
name: "fire2/expand3x3"
type: "Convolution"
bottom: "fire2/squeeze1x1"
top: "fire2/expand3x3"
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire2/relu_expand3x3"
type: "ReLU"
bottom: "fire2/expand3x3"
top: "fire2/expand3x3"
}
layer {
name: "fire2/concat"
type: "Concat"
bottom: "fire2/expand1x1"
bottom: "fire2/expand3x3"
top: "fire2/concat"
}
layer {
name: "fire3/squeeze1x1"
type: "Convolution"
bottom: "fire2/concat"
top: "fire3/squeeze1x1"
convolution_param {
num_output: 16
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire3/relu_squeeze1x1"
type: "ReLU"
bottom: "fire3/squeeze1x1"
top: "fire3/squeeze1x1"
}
layer {
name: "fire3/expand1x1"
type: "Convolution"
bottom: "fire3/squeeze1x1"
top: "fire3/expand1x1"
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire3/relu_expand1x1"
type: "ReLU"
bottom: "fire3/expand1x1"
top: "fire3/expand1x1"
}
layer {
name: "fire3/expand3x3"
type: "Convolution"
bottom: "fire3/squeeze1x1"
top: "fire3/expand3x3"
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire3/relu_expand3x3"
type: "ReLU"
bottom: "fire3/expand3x3"
top: "fire3/expand3x3"
}
layer {
name: "fire3/concat"
type: "Concat"
bottom: "fire3/expand1x1"
bottom: "fire3/expand3x3"
top: "fire3/concat"
}
layer {
name: "fire4/squeeze1x1"
type: "Convolution"
bottom: "fire3/concat"
top: "fire4/squeeze1x1"
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire4/relu_squeeze1x1"
type: "ReLU"
bottom: "fire4/squeeze1x1"
top: "fire4/squeeze1x1"
}
layer {
name: "fire4/expand1x1"
type: "Convolution"
bottom: "fire4/squeeze1x1"
top: "fire4/expand1x1"
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire4/relu_expand1x1"
type: "ReLU"
bottom: "fire4/expand1x1"
top: "fire4/expand1x1"
}
layer {
name: "fire4/expand3x3"
type: "Convolution"
bottom: "fire4/squeeze1x1"
top: "fire4/expand3x3"
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire4/relu_expand3x3"
type: "ReLU"
bottom: "fire4/expand3x3"
top: "fire4/expand3x3"
}
layer {
name: "fire4/concat"
type: "Concat"
bottom: "fire4/expand1x1"
bottom: "fire4/expand3x3"
top: "fire4/concat"
}
layer {
name: "pool4"
type: "Pooling"
bottom: "fire4/concat"
top: "pool4"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fire5/squeeze1x1"
type: "Convolution"
bottom: "pool4"
top: "fire5/squeeze1x1"
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire5/relu_squeeze1x1"
type: "ReLU"
bottom: "fire5/squeeze1x1"
top: "fire5/squeeze1x1"
}
layer {
name: "fire5/expand1x1"
type: "Convolution"
bottom: "fire5/squeeze1x1"
top: "fire5/expand1x1"
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire5/relu_expand1x1"
type: "ReLU"
bottom: "fire5/expand1x1"
top: "fire5/expand1x1"
}
layer {
name: "fire5/expand3x3"
type: "Convolution"
bottom: "fire5/squeeze1x1"
top: "fire5/expand3x3"
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire5/relu_expand3x3"
type: "ReLU"
bottom: "fire5/expand3x3"
top: "fire5/expand3x3"
}
layer {
name: "fire5/concat"
type: "Concat"
bottom: "fire5/expand1x1"
bottom: "fire5/expand3x3"
top: "fire5/concat"
}
layer {
name: "fire6/squeeze1x1"
type: "Convolution"
bottom: "fire5/concat"
top: "fire6/squeeze1x1"
convolution_param {
num_output: 48
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire6/relu_squeeze1x1"
type: "ReLU"
bottom: "fire6/squeeze1x1"
top: "fire6/squeeze1x1"
}
layer {
name: "fire6/expand1x1"
type: "Convolution"
bottom: "fire6/squeeze1x1"
top: "fire6/expand1x1"
convolution_param {
num_output: 192
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire6/relu_expand1x1"
type: "ReLU"
bottom: "fire6/expand1x1"
top: "fire6/expand1x1"
}
layer {
name: "fire6/expand3x3"
type: "Convolution"
bottom: "fire6/squeeze1x1"
top: "fire6/expand3x3"
convolution_param {
num_output: 192
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire6/relu_expand3x3"
type: "ReLU"
bottom: "fire6/expand3x3"
top: "fire6/expand3x3"
}
layer {
name: "fire6/concat"
type: "Concat"
bottom: "fire6/expand1x1"
bottom: "fire6/expand3x3"
top: "fire6/concat"
}
layer {
name: "fire7/squeeze1x1"
type: "Convolution"
bottom: "fire6/concat"
top: "fire7/squeeze1x1"
convolution_param {
num_output: 48
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire7/relu_squeeze1x1"
type: "ReLU"
bottom: "fire7/squeeze1x1"
top: "fire7/squeeze1x1"
}
layer {
name: "fire7/expand1x1"
type: "Convolution"
bottom: "fire7/squeeze1x1"
top: "fire7/expand1x1"
convolution_param {
num_output: 192
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire7/relu_expand1x1"
type: "ReLU"
bottom: "fire7/expand1x1"
top: "fire7/expand1x1"
}
layer {
name: "fire7/expand3x3"
type: "Convolution"
bottom: "fire7/squeeze1x1"
top: "fire7/expand3x3"
convolution_param {
num_output: 192
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire7/relu_expand3x3"
type: "ReLU"
bottom: "fire7/expand3x3"
top: "fire7/expand3x3"
}
layer {
name: "fire7/concat"
type: "Concat"
bottom: "fire7/expand1x1"
bottom: "fire7/expand3x3"
top: "fire7/concat"
}
layer {
name: "fire8/squeeze1x1"
type: "Convolution"
bottom: "fire7/concat"
top: "fire8/squeeze1x1"
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire8/relu_squeeze1x1"
type: "ReLU"
bottom: "fire8/squeeze1x1"
top: "fire8/squeeze1x1"
}
layer {
name: "fire8/expand1x1"
type: "Convolution"
bottom: "fire8/squeeze1x1"
top: "fire8/expand1x1"
convolution_param {
num_output: 256
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire8/relu_expand1x1"
type: "ReLU"
bottom: "fire8/expand1x1"
top: "fire8/expand1x1"
}
layer {
name: "fire8/expand3x3"
type: "Convolution"
bottom: "fire8/squeeze1x1"
top: "fire8/expand3x3"
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire8/relu_expand3x3"
type: "ReLU"
bottom: "fire8/expand3x3"
top: "fire8/expand3x3"
}
layer {
name: "fire8/concat"
type: "Concat"
bottom: "fire8/expand1x1"
bottom: "fire8/expand3x3"
top: "fire8/concat"
}
layer {
name: "pool8"
type: "Pooling"
bottom: "fire8/concat"
top: "pool8"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fire9/squeeze1x1"
type: "Convolution"
bottom: "pool8"
top: "fire9/squeeze1x1"
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire9/relu_squeeze1x1"
type: "ReLU"
bottom: "fire9/squeeze1x1"
top: "fire9/squeeze1x1"
}
layer {
name: "fire9/expand1x1"
type: "Convolution"
bottom: "fire9/squeeze1x1"
top: "fire9/expand1x1"
convolution_param {
num_output: 256
kernel_size: 1
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire9/relu_expand1x1"
type: "ReLU"
bottom: "fire9/expand1x1"
top: "fire9/expand1x1"
}
layer {
name: "fire9/expand3x3"
type: "Convolution"
bottom: "fire9/squeeze1x1"
top: "fire9/expand3x3"
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
}
}
}
layer {
name: "fire9/relu_expand3x3"
type: "ReLU"
bottom: "fire9/expand3x3"
top: "fire9/expand3x3"
}
layer {
name: "fire9/concat"
type: "Concat"
bottom: "fire9/expand1x1"
bottom: "fire9/expand3x3"
top: "fire9/concat"
}
layer {
name: "drop9"
type: "Dropout"
bottom: "fire9/concat"
top: "fire9/concat"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "conv10"
type: "Convolution"
bottom: "fire9/concat"
top: "conv10"
convolution_param {
num_output: 1000
pad: 1
kernel_size: 1
weight_filler {
type: "gaussian"
mean: 0.0
std: 0.01
}
}
}
layer {
name: "relu_conv10"
type: "ReLU"
bottom: "conv10"
top: "conv10"
}
========= RPN ============
layer {
name: "rpn_conv/3x3"
type: "Convolution"
bottom: "conv10"
top: "rpn/output"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 512
kernel_size: 3 pad: 1 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "rpn_relu/3x3"
type: "ReLU"
bottom: "rpn/output"
top: "rpn/output"
}
layer {
name: "rpn_cls_score"
type: "Convolution"
bottom: "rpn/output"
top: "rpn_cls_score"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 18 # 2(bg/fg) * 9(anchors)
kernel_size: 1 pad: 0 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "rpn_bbox_pred"
type: "Convolution"
bottom: "rpn/output"
top: "rpn_bbox_pred"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 36 # 4 * 9(anchors)
kernel_size: 1 pad: 0 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
bottom: "rpn_cls_score"
top: "rpn_cls_score_reshape"
name: "rpn_cls_score_reshape"
type: "Reshape"
reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }
}
layer {
name: 'rpn-data'
type: 'Python'
bottom: 'rpn_cls_score'
bottom: 'gt_boxes'
bottom: 'im_info'
bottom: 'data'
top: 'rpn_labels'
top: 'rpn_bbox_targets'
top: 'rpn_bbox_inside_weights'
top: 'rpn_bbox_outside_weights'
python_param {
module: 'rpn.anchor_target_layer'
layer: 'AnchorTargetLayer'
param_str: "'feat_stride': 16"
}
}
layer {
name: "rpn_loss_cls"
type: "SoftmaxWithLoss"
bottom: "rpn_cls_score_reshape"
bottom: "rpn_labels"
propagate_down: 1
propagate_down: 0
top: "rpn_cls_loss"
loss_weight: 1
loss_param {
ignore_label: -1
normalize: true
}
}
layer {
name: "rpn_loss_bbox"
type: "SmoothL1Loss"
bottom: "rpn_bbox_pred"
bottom: "rpn_bbox_targets"
bottom: 'rpn_bbox_inside_weights'
bottom: 'rpn_bbox_outside_weights'
top: "rpn_loss_bbox"
loss_weight: 1
smooth_l1_loss_param { sigma: 3.0 }
}
========= RoI Proposal ============
layer {
name: "rpn_cls_prob"
type: "Softmax"
bottom: "rpn_cls_score_reshape"
top: "rpn_cls_prob"
}
layer {
name: 'rpn_cls_prob_reshape'
type: 'Reshape'
bottom: 'rpn_cls_prob'
top: 'rpn_cls_prob_reshape'
reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }
}
layer {
name: 'proposal'
type: 'Python'
bottom: 'rpn_cls_prob_reshape'
bottom: 'rpn_bbox_pred'
bottom: 'im_info'
top: 'rpn_rois'
top: 'rpn_scores'
python_param {
module: 'rpn.proposal_layer'
layer: 'ProposalLayer'
param_str: "'feat_stride': 16"
}
}
layer {
name: 'roi-data'
type: 'Python'
bottom: 'rpn_rois'
bottom: 'gt_boxes'
top: 'rois'
top: 'labels'
top: 'bbox_targets'
top: 'bbox_inside_weights'
top: 'bbox_outside_weights'
python_param {
module: 'rpn.proposal_target_layer'
layer: 'ProposalTargetLayer'
param_str: "'num_classes': 2"
}
}
========= RCNN ============
layer {
name: "roi_pool5"
type: "ROIPooling"
bottom: "conv10"
bottom: "rois"
top: "pool10"
roi_pooling_param {
pooled_w: 13
pooled_h: 13
spatial_scale: 0.0625 # 1/16
}
}
layer {
name: "cls_score"
type: "InnerProduct"
bottom: "pool10"
top: "cls_score"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "bbox_pred"
type: "InnerProduct"
bottom: "pool10"
top: "bbox_pred"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 8
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss_cls"
type: "SoftmaxWithLoss"
bottom: "cls_score"
bottom: "labels"
propagate_down: 1
propagate_down: 0
top: "loss_cls"
loss_weight: 1
}
layer {
name: "loss_bbox"
type: "SmoothL1Loss"
bottom: "bbox_pred"
bottom: "bbox_targets"
bottom: "bbox_inside_weights"
bottom: "bbox_outside_weights"
top: "loss_bbox"
loss_weight: 1
}
The text was updated successfully, but these errors were encountered: