Skip to content

CharlesShang/TFFRCNN

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TFFRCNN

This is an experimental Tensor Flow implementation of Faster RCNN (TFFRCNN), mainly based on the work of smallcorgi and rbgirshick. I have re-organized the libraries under lib path, making each of python modules independent to each other, so you can understand, re-write the code easily.

For details about R-CNN please refer to the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks by Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun.

What's New

  • Resnet networks support
  • KITTI object detection dataset support
  • Position Sensitive ROI Pooling (psroi_pooling), not testing yet
  • Hard Example Mining
  • Data Augment
  • PVANet
  • Tensorflow 1.0
  • R-FCN
  • Multi-layer Architecture (HyperNet)
  • more hacks...

Acknowledgments:

  1. py-faster-rcnn

  2. Faster-RCNN_TF

  3. ROI pooling

Requirements: software

  1. Requirements for Tensorflow (see: Tensorflow)

  2. Python packages you might not have: cython, python-opencv, easydict (recommend to install: Anaconda)

Requirements: hardware

  1. For training the end-to-end version of Faster R-CNN with VGG16, 3G of GPU memory is sufficient (using CUDNN)

Installation (sufficient for the demo)

  1. Clone the Faster R-CNN repository
git clone https://github.com/CharlesShang/TFFRCNN.git
  1. Build the Cython modules
    cd TFFRCNN/lib
    make # compile cython and roi_pooling_op, you may need to modify make.sh for your platform

Demo

After successfully completing basic installation, you'll be ready to run the demo.

To run the demo

cd $TFFRCNN
python ./faster_rcnn/demo.py --model model_path

The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007.

Download list

  1. VGG16 trained on ImageNet

  2. VGG16 - TFFRCNN (0.689 mAP on VOC07).

  3. VGG16 - TFFRCNN (0.748 mAP on VOC07)

  4. Resnet50 trained on ImageNet

  5. Resnet50 - TFFRCNN (0.712 mAP on VOC07)

  6. PVANet trained on ImageNet, converted from caffemodel

Training on Pascal VOC 2007

  1. Download the training, validation, test data and VOCdevkit

    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  2. Extract all of these tars into one directory named VOCdevkit

    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
    tar xvf VOCdevkit_08-Jun-2007.tar
  3. It should have this basic structure

    $VOCdevkit/                           # development kit
    $VOCdevkit/VOCcode/                   # VOC utility code
    $VOCdevkit/VOC2007                    # image sets, annotations, etc.
    # ... and several other directories ...
  4. Create symlinks for the PASCAL VOC dataset

    cd $TFFRCNN/data
    ln -s $VOCdevkit VOCdevkit2007
  5. Download pre-trained model VGG16 and put it in the path ./data/pretrain_model/VGG_imagenet.npy

  6. Run training scripts

    cd $TFFRCNN
    python ./faster_rcnn/train_net.py --gpu 0 --weights ./data/pretrain_model/VGG_imagenet.npy --imdb voc_2007_trainval --iters 70000 --cfg  ./experiments/cfgs/faster_rcnn_end2end.yml --network VGGnet_train --set EXP_DIR exp_dir
  7. Run a profiling

    cd $TFFRCNN
    # install a visualization tool
    sudo apt-get install graphviz  
    ./experiments/profiling/run_profiling.sh 
    # generate an image ./experiments/profiling/profile.png

Training on KITTI detection dataset

  1. Download the KITTI detection dataset

    http://www.cvlibs.net/datasets/kitti/eval_object.php
    
  2. Extract all of these tar into ./TFFRCNN/data/ and the directory structure looks like this:

    KITTI
        |-- training
                |-- image_2
                    |-- [000000-007480].png
                |-- label_2
                    |-- [000000-007480].txt
        |-- testing
                |-- image_2
                    |-- [000000-007517].png
                |-- label_2
                    |-- [000000-007517].txt
    
  3. Convert KITTI into Pascal VOC format

    cd $TFFRCNN
    ./experiments/scripts/kitti2pascalvoc.py \
    --kitti $TFFRCNN/data/KITTI --out $TFFRCNN/data/KITTIVOC
  4. The output directory looks like this:

    KITTIVOC
        |-- Annotations
                |-- [000000-007480].xml
        |-- ImageSets
                |-- Main
                    |-- [train|val|trainval].txt
        |-- JPEGImages
                |-- [000000-007480].jpg
    
  5. Training on KITTIVOC is just like on Pascal VOC 2007

    python ./faster_rcnn/train_net.py \
    --gpu 0 \
    --weights ./data/pretrain_model/VGG_imagenet.npy \
    --imdb kittivoc_train \
    --iters 160000 \
    --cfg ./experiments/cfgs/faster_rcnn_kitti.yml \
    --network VGGnet_train