This repository contains training scripts for a lightweight SSD-based face detector. The detector is based on the MobileNetV2 backbone and has a single SSD head with manually designed anchors. As a result, it has a computational complexity of 0.51 GMACs and 1.03 M parameters.
- Download the WIDER Face dataset and unpack it into the `data` folder.
- Annotations in the VOC format can be found in this repo. Move the annotation files from the `WIDER_train_annotations` and `WIDER_val_annotations` folders to the `Annotations` folders inside the corresponding directories `WIDER_train` and `WIDER_val`. Also copy the annotation lists `train.txt` and `val.txt` from `WIDER_train_annotations` and `WIDER_val_annotations` respectively to `data/WIDERFace`. The directory should look like this:
```
data
└── WIDERFace
    ├── WIDER_train
    │   ├── 0--Parade
    │   ├── ...
    │   └── Annotations
    ├── WIDER_val
    │   ├── 0--Parade
    │   ├── ...
    │   └── Annotations
    ├── val.txt
    └── train.txt
```
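The layout steps above can be sketched as shell commands. This is only an illustration: for testability it first creates a tiny fake source layout (the `.xml` file names and list entries are made up); with the real dataset, start from the unpacked `data/WIDERFace` folder and skip the fixture part.

```shell
# --- Fixture only: fake a minimal unpacked layout for illustration ---
mkdir -p data/WIDERFace/WIDER_train/0--Parade data/WIDERFace/WIDER_val/0--Parade
mkdir -p WIDER_train_annotations WIDER_val_annotations
touch WIDER_train_annotations/0_Parade_1.xml WIDER_val_annotations/0_Parade_2.xml
echo 0_Parade_1 > WIDER_train_annotations/train.txt
echo 0_Parade_2 > WIDER_val_annotations/val.txt

# --- The actual layout steps from the instructions above ---
# Move the VOC annotation files into Annotations folders...
mkdir -p data/WIDERFace/WIDER_train/Annotations data/WIDERFace/WIDER_val/Annotations
mv WIDER_train_annotations/*.xml data/WIDERFace/WIDER_train/Annotations/
mv WIDER_val_annotations/*.xml data/WIDERFace/WIDER_val/Annotations/
# ...and copy the image lists next to them.
cp WIDER_train_annotations/train.txt WIDER_val_annotations/val.txt data/WIDERFace/
```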
- Download the pre-trained MobileNetV2 weights `mobilenet_v2.pth.tar` from https://github.com/tonylins/pytorch-mobilenet-v2 and move the file to the `snapshots` folder. Alternatively, use a checkpoint that was trained on WIDER Face.
- To train the detector on a single GPU, run in terminal:

```bash
python3 ../../external/mmdetection/tools/train.py \
    configs/mobilenetv2_tiny_ssd300_wider_face.py
```
- To dump detections of your model, run:

```bash
python3 ../../external/mmdetection/tools/test.py \
    configs/mobilenetv2_tiny_ssd300_wider_face.py \
    <CHECKPOINT> \
    --out result.pkl
```
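The dumped `result.pkl` can be inspected before running the evaluation script. The sketch below assumes mmdetection's usual `--out` format (a list with one entry per image, each a list of per-class `(N, 5)` arrays holding `[x1, y1, x2, y2, score]` rows, with a single "face" class here); the function name and the synthetic data are made up for illustration.

```python
import numpy as np

def count_confident_faces(results, score_thr=0.5):
    """Count boxes whose score exceeds score_thr, per image.

    Assumes mmdetection's --out layout: results[i][c] is an (N, 5)
    array of [x1, y1, x2, y2, score] rows for class c of image i.
    """
    counts = []
    for per_class in results:
        faces = per_class[0]  # class 0: the only class, "face"
        counts.append(int((faces[:, 4] > score_thr).sum()))
    return counts

# Synthetic stand-in for pickle.load(open('result.pkl', 'rb')):
fake = [
    [np.array([[10, 10, 50, 50, 0.9], [5, 5, 20, 20, 0.3]])],
    [np.array([[0, 0, 30, 30, 0.7]])],
]
print(count_confident_faces(fake))  # [1, 1]
```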
- Then run:

```bash
python3 ../../external/mmdetection/tools/voc_eval.py \
    result.pkl \
    configs/mobilenetv2_tiny_ssd300_wider_face.py
```

One should observe 0.305 AP on the validation set. For more detailed results and a comparison with vanilla SSD300, see `../../external/mmdetection/configs/wider_face/README.md`.
- To convert the PyTorch model to the ONNX format, run in terminal:

```bash
python3 tools/onnx_export.py \
    configs/mobilenetv2_tiny_ssd300_wider_face.py \
    <CHECKPOINT> \
    face_detector.onnx
```
- To convert the ONNX model to the OpenVINO format with Model Optimizer, run in terminal:

```bash
mo.py --input_model face_detector.onnx \
    --scale 255 \
    --reverse_input_channels \
    --output_dir=./IR \
    --data_type=FP32
```

This produces the model `face_detector.xml` and the weights `face_detector.bin` in single-precision floating-point format (FP32). The obtained model expects a normalized image in planar BGR format.
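The planar layout mentioned above differs from the interleaved `H x W x 3` layout in which OpenCV loads frames. A minimal NumPy sketch of the layout change (the function name and the fixed 300x300 input size assumption are ours; scaling by 255 is already baked into the IR via `--scale 255`, so only the transpose is shown):

```python
import numpy as np

def to_planar_blob(image_hwc):
    """Turn an interleaved 300 x 300 x 3 BGR frame into the
    1 x 3 x 300 x 300 planar (NCHW) blob the converted model expects.
    Resizing to 300x300 is left to the caller (e.g. cv2.resize)."""
    assert image_hwc.shape == (300, 300, 3), "resize the frame first"
    chw = image_hwc.transpose(2, 0, 1)         # HWC -> CHW (planar)
    return chw[np.newaxis].astype(np.float32)  # add batch dimension

frame = np.zeros((300, 300, 3), dtype=np.uint8)  # dummy BGR frame
blob = to_planar_blob(frame)
print(blob.shape)  # (1, 3, 300, 300)
```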
To run the demo, connect a webcam and execute the command:

```bash
python3 tools/detection_live_demo.py \
    configs/mobilenetv2_tiny_ssd300_wider_face.py \
    <CHECKPOINT> \
    --cam_id 0
```
To get per-layer computational complexity estimates, run the following command:

```bash
python3 tools/count_flops.py configs/mobilenetv2_tiny_ssd300_wider_face.py
```
- The dataset should have the same data layout as WIDER Face in the VOC format described in this instruction.
- Fine-tuning steps are the same as step 2 for training, but some adjustments in the config are needed:
  - specify an initial checkpoint containing a valid detector in the `load_from` field of the config `configs/mobilenetv2_tiny_ssd300_wider_face.py`
  - edit the `data` section of the config to pass a custom dataset.
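The two config edits above can be sketched as follows. The `load_from` field and the `data` section come from the instructions; the checkpoint file name, dataset paths, and the exact nesting of the `data` dict are placeholders in the general mmdetection config style, so check them against the actual `configs/mobilenetv2_tiny_ssd300_wider_face.py` before use.

```python
# Hypothetical fragment of configs/mobilenetv2_tiny_ssd300_wider_face.py
# adjusted for fine-tuning on a custom VOC-layout dataset.
load_from = 'snapshots/face_detector_wider.pth'  # placeholder checkpoint name

data = dict(
    imgs_per_gpu=32,     # keep the original loader settings unless needed
    workers_per_gpu=2,
    train=dict(
        ann_file='data/MyFaces/train.txt',        # placeholder paths:
        img_prefix='data/MyFaces/MyFaces_train/', # point at your dataset
    ),
    val=dict(
        ann_file='data/MyFaces/val.txt',
        img_prefix='data/MyFaces/MyFaces_val/',
    ),
)
```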