This repository contains training scripts for a lightweight SSD-based face detector. The detector is based on the MobileNetV2 backbone and has a single SSD head with manually designed anchors. As a result, its computational complexity is 0.51 GMACs and it has 1.03 M parameters.
- Download the WIDER Face dataset and unpack it to the `data` folder.
- Annotation in the Pascal Visual Objects in Context (Pascal VOC) format can be found in this repository. Move the annotation files from the `WIDER_train_annotations` and `WIDER_val_annotations` folders to the `Annotations` folders inside the corresponding directories `WIDER_train` and `WIDER_val`. Also, copy the annotation lists `val.txt` and `train.txt` to `data/WIDERFace` from `WIDER_train_annotations` and `WIDER_val_annotations`. The directory should look like this:

  ```
  data
  └── WIDERFace
      ├── WIDER_train
      │   ├── 0--Parade
      │   ├── ...
      │   └── Annotations
      ├── WIDER_val
      │   ├── 0--Parade
      │   ├── ...
      │   └── Annotations
      ├── val.txt
      └── train.txt
  ```
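  The moves in this step can be sketched as shell commands (run from the repository root; this assumes the annotation archives were unpacked next to `data`, so adjust the paths to your setup):

  ```shell
  # Create the destination folders matching the tree shown above.
  mkdir -p data/WIDERFace/WIDER_train/Annotations \
           data/WIDERFace/WIDER_val/Annotations

  # Move the per-image Pascal VOC files and copy the image lists.
  if [ -d WIDER_train_annotations ]; then
      mv WIDER_train_annotations/*.xml data/WIDERFace/WIDER_train/Annotations/
      cp WIDER_train_annotations/train.txt data/WIDERFace/
  fi
  if [ -d WIDER_val_annotations ]; then
      mv WIDER_val_annotations/*.xml data/WIDERFace/WIDER_val/Annotations/
      cp WIDER_val_annotations/val.txt data/WIDERFace/
  fi
  ```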
- Download pretrained MobileNetV2 weights `mobilenet_v2.pth.tar` and move the file to the `snapshots` folder, or use the checkpoint that was trained on WIDER Face.
- To train the detector on a single GPU, run in your terminal:
  ```bash
  python3 ../../external/mmdetection/tools/train.py \
      configs/mobilenetv2_tiny_ssd300_wider_face.py
  ```
- To dump detections of your model, run:

  ```bash
  python3 ../../external/mmdetection/tools/test.py \
      configs/mobilenetv2_tiny_ssd300_wider_face.py \
      <CHECKPOINT> \
      --out result.pkl
  ```
- Then run the following:

  ```bash
  python3 ../../external/mmdetection/tools/voc_eval.py \
      result.pkl \
      configs/mobilenetv2_tiny_ssd300_wider_face.py
  ```
  You should observe 0.305 AP on the validation set. For more detailed results and a comparison with vanilla SSD300, see `../../external/mmdetection/configs/wider_face/README.md`.
- Convert the PyTorch* model to the ONNX* format by running the script:

  ```bash
  python3 tools/onnx_export.py \
      configs/mobilenetv2_tiny_ssd300_wider_face.py \
      <CHECKPOINT> \
      face_detector.onnx
  ```
- Convert the ONNX model to the OpenVINO™ format with the Model Optimizer using the command below:

  ```bash
  mo.py --input_model face_detector.onnx \
      --scale 255 \
      --reverse_input_channels \
      --output_dir=./IR \
      --data_type=FP32
  ```
  This produces the model `face_detector.xml` and weights `face_detector.bin` in single-precision floating-point format (FP32). The obtained model expects a normalized image in planar BGR format.
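A minimal sketch of preparing such an input in NumPy, assuming a frame already in interleaved BGR layout (as OpenCV returns it); the division by 255 is baked into the IR via the `--scale` flag above, so only the layout needs changing:

```python
import numpy as np

def to_planar(image_hwc: np.ndarray) -> np.ndarray:
    """Convert an interleaved HWC image to a planar (1, C, H, W) blob."""
    # HWC -> CHW, then add a batch dimension.
    return np.transpose(image_hwc, (2, 0, 1))[np.newaxis].astype(np.float32)

frame = np.zeros((300, 300, 3), dtype=np.uint8)  # dummy 300x300 BGR frame
blob = to_planar(frame)
print(blob.shape)  # (1, 3, 300, 300)
```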
To run the demo, connect a webcam and execute the command:
```bash
python3 tools/detection_live_demo.py \
    configs/mobilenetv2_tiny_ssd300_wider_face.py \
    <CHECKPOINT> \
    --cam_id 0
```
To get per-layer computational complexity estimations, run the following command:
```bash
python3 tools/count_flops.py configs/mobilenetv2_tiny_ssd300_wider_face.py
```
- The dataset should have the same data layout as WIDER Face in the Pascal VOC format described in this instruction.
- Fine-tuning steps are the same as step 2 for training, but some adjustments in the config are needed:
  - specify an initial checkpoint containing a valid detector in the `load_from` field of the config `configs/mobilenetv2_tiny_ssd300_wider_face.py`
  - edit the `data` section of the config to pass a custom dataset.
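A minimal sketch of these adjustments inside the config; the checkpoint and dataset paths below are hypothetical placeholders, not files shipped with this repository:

```python
# Fragment of configs/mobilenetv2_tiny_ssd300_wider_face.py for fine-tuning.
load_from = 'snapshots/face_detector_wider.pth'  # initial checkpoint with a valid detector

data = dict(
    train=dict(
        ann_file='data/MyFaces/train.txt',   # custom dataset, WIDER-style VOC layout
        img_prefix='data/MyFaces/'),
    val=dict(
        ann_file='data/MyFaces/val.txt',
        img_prefix='data/MyFaces/'))
```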