Skip to content
forked from daijifeng001/MNC

Instance-aware Semantic Segmentation via Multi-task Network Cascades

License

Notifications You must be signed in to change notification settings

AndreiBarsan/MNC

 
 

Repository files navigation

Instance-aware Semantic Segmentation via Multi-task Network Cascades

By Jifeng Dai, Kaiming He, Jian Sun

This python version is re-implemented by Haozhi Qi when he was an intern at Microsoft Research.

Fork Information

This fork of the original repository contains a few small improvements in the demo script, allowing it to be used to batch process data in a particular folder.

It also comes with an associated install script which can be used to set this code up on a machine, without requiring root privileges. The script is tailored for users of ETH Zurich's Euryale mini-cluster (using slurm and Linux modules), but it can easily be tweaked to work on arbitrary systems.

Moreover, this fork's Caffe is more up-to-date than the original's, allowing it to work with cuDNN 5, leading to improved performance (at least in terms of inference speed, reaching about 170ms for a 1242x375 image, as compared to the 300ms mentioned in the original paper).

Introduction

MNC is an instance-aware semantic segmentation system based on deep convolutional networks, which won the first place in COCO segmentation challenge 2015, and test at a fraction of a second per image. We decompose the task of instance-aware semantic segmentation into related sub-tasks, which are solved by multi-task network cascades (MNC) with shared features. The entire MNC network is trained end-to-end with error gradients across cascaded stages.

example

MNC was initially described in a CVPR 2016 oral paper.

This repository contains a python implementation of MNC, which is ~10% slower than the original matlab implementation.

This repository includes a bilinear RoI warping layer, which enables gradient back-propagation with respect to RoI coordinates.

Misc.

This code has been tested on Linux (Ubuntu 14.04), using K40/Titan X GPUs.

The code is built based on py-faster-rcnn.

MNC is released under the MIT License (refer to the LICENSE file for details).

Citing MNC

If you find MNC useful in your research, please consider citing:

@inproceedings{dai2016instance,
    title={Instance-aware Semantic Segmentation via Multi-task Network Cascades},
    author={Dai, Jifeng and He, Kaiming and Sun, Jian},
    booktitle={CVPR},
    year={2016}
}

Main Results

training data test data mAP^[email protected] mAP^[email protected] time (K40) time (Titian X)
MNC, VGG-16 VOC 12 train VOC 12 val 65.0% 46.3% 0.42sec/img 0.33sec/img

Installation guide

  1. Clone the MNC repository:
# Make sure to clone with --recursive
git clone --recursive https://github.com/daijifeng001/MNC.git
  1. Install Python packages: numpy, scipy, cython, python-opencv, easydict, yaml.

  2. Build the Cython modules and the gpu_nms, gpu_mask_voting modules by:

cd $MNC_ROOT/lib
make
  1. Install Caffe and pycaffe dependencies (see: Caffe installation instructions for official installation guide)

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# CUDNN is recommended in building to reduce memory footprint
USE_CUDNN := 1
  1. Build Caffe and pycaffe:
    cd $MNC_ROOT/caffe-mnc
    # If you have all of the requirements installed
    # and your Makefile.config in place, then simply do:
    make -j8 && make pycaffe

Demo

First, download the trained MNC model.

./data/scripts/fetch_mnc_model.sh

Run the demo:

cd $MNC_ROOT
./tools/demo.py

Result demo images will be stored to data/demo/.

The demo performs instance-aware semantic segmentation with a trained MNC model (using VGG-16 net). The model is pre-trained on ImageNet, and finetuned on VOC 2012 train set with additional annotations from SBD. The mAP^r of the model is 65.0% on VOC 2012 validation set. The test speed per image is ~0.33sec on Titian X and ~0.42sec on K40.

Training

This repository contains code to end-to-end train MNC for instance-aware semantic segmentation, where gradients across cascaded stages are counted in training.

Preparation:

  1. Run ./data/scripts/fetch_imagenet_models.sh to download the ImageNet pre-trained VGG-16 net.
  2. Download the VOC 2007 dataset to ./data/VOCdevkit2007
  3. Run ./data/scripts/fetch_sbd_data.sh to download the VOC 2012 dataset together with the additional segmentation annotations in SBD to ./data/VOCdevkitSDS.

1. End-to-end training of MNC for instance-aware semantic segmentation

To end-to-end train a 5-stage MNC model (on VOC 2012 train), use experiments/scripts/mnc_5stage.sh. Final mAP^[email protected] should be ~65.0% (mAP^[email protected] should be ~46.3%), on VOC 2012 validation.

cd $MNC_ROOT
./experiments/scripts/mnc_5stage.sh [GPU_ID] VGG16 [--set ...]
# GPU_ID is the GPU you want to train on
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng 1701 RNG_SEED 1701

2. Training of CFM for instance-aware semantic segmentation

The code also includes an entry to train a convolutional feature masking (CFM) model for instance aware semantic segmentation.

@inproceedings{dai2015convolutional,
    title={Convolutional Feature Masking for Joint Object and Stuff Segmentation},
    author={Dai, Jifeng and He, Kaiming and Sun, Jian},
    booktitle={CVPR},
    year={2015}
}
2.1. Download pre-computed MCG proposals

Download and process the pre-computed MCG proposals.

cd $MNC_ROOT
./data/scripts/fetch_mcg_data.sh
python ./tools/prepare_mcg_maskdb.py --para_job 24 --db train --output data/cache/voc_2012_train_mcg_maskdb/
python ./tools/prepare_mcg_maskdb.py --para_job 24 --db val --output data/cache/voc_2012_val_mcg_maskdb/

Resulting proposals would be at folder data/MCG/.

2.2. Train the model

Run experiments/scripts/cfm.sh to train on VOC 2012 train set. Final mAP^[email protected] should be ~60.5% (mAP^[email protected] should be ~42.6%), on VOC 2012 validation.

cd $MNC_ROOT
./experiments/scripts/cfm.sh [GPU_ID] VGG16 [--set ...]
# GPU_ID is the GPU you want to train on
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng 1701 RNG_SEED 1701

3. End-to-end training of Faster-RCNN for object detection

Faster-RCNN can be viewed as a 2-stage cascades composed of region proposal network (RPN) and object detection network. Run script experiments/scripts/faster_rcnn_end2end.sh to train a Faster-RCNN model on VOC 2007 trainval. Final mAP^b should be ~69.1% on VOC 2007 test.

cd $MNC_ROOT
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] VGG16 [--set ...]
# GPU_ID is the GPU you want to train on
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701

About

Instance-aware Semantic Segmentation via Multi-task Network Cascades

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 88.5%
  • Cuda 8.1%
  • Shell 3.2%
  • Other 0.2%