The major contributors of this repository include Haozhi Qi, Yi Li, Guodong Zhang, Haochen Zhang, Jifeng Dai, and Yichen Wei.
FCIS is a fully convolutional end-to-end solution for instance segmentation, which won first place in the COCO segmentation challenge 2016.
FCIS was initially described in a CVPR 2017 spotlight paper. It is worth noting that:
- FCIS provides a simple, fast and accurate framework for instance segmentation.
- Unlike MNC, FCIS performs instance mask estimation and categorization jointly and simultaneously, and estimates class-specific masks.
- We did not exploit the various techniques & tricks in the Mask RCNN system, like increasing RPN anchor numbers (from 12 to 15), training on anchors out of image boundary, enlarging the image (shorter side from 600 to 800 pixels), utilizing FPN features and aligned ROI pooling. These techniques & tricks should be orthogonal to our simple baseline.
- Visual results on the first 5k images from COCO test set of our COCO 2016 challenge entry: OneDrive.
- Slides in ImageNet ILSVRC and COCO workshop 2016: OneDrive.
This is an official implementation for Fully Convolutional Instance-aware Semantic Segmentation (FCIS) based on MXNet. It is worth noting that:
- The original implementation is based on our internal Caffe version on Windows. There are slight differences in final accuracy and running time due to the many details that changed in the platform switch.
- The code is tested on official MXNet@(commit 62ecb60) with the extra operators for FCIS.
- We trained our model based on the ImageNet pre-trained ResNet-v1-101 using a model converter. The converted model produces slightly lower accuracy (Top-1 error on ImageNet val: 24.0% vs. 23.6%).
- This repository uses code from the MXNet rcnn example and mx-rfcn.
© Microsoft, 2017. Licensed under an Apache-2.0 license.
If you find FCIS useful in your research, please consider citing:
@inproceedings{li2016fully,
  Author = {Yi Li and Haozhi Qi and Jifeng Dai and Xiangyang Ji and Yichen Wei},
  Title = {Fully Convolutional Instance-aware Semantic Segmentation},
  Booktitle = {CVPR},
  Year = {2017}
}
| | training data | testing data | mAP^r@0.5 | mAP^r@0.7 | time |
|---|---|---|---|---|---|
| FCIS, ResNet-v1-101 | VOC 2012 train | VOC 2012 val | 66.0 | 51.9 | 0.23s |

| | training data | testing data | mAP^r | mAP^r@0.5 | mAP^r@0.75 | mAP^r@S | mAP^r@M | mAP^r@L |
|---|---|---|---|---|---|---|---|---|
| FCIS, ResNet-v1-101, OHEM | coco trainval35k | coco minival | 28.7 | 50.5 | 28.8 | 7.7 | 31.0 | 50.1 |
| FCIS, ResNet-v1-101, OHEM | coco trainval35k | coco test-dev | 29.0 | 51.2 | 29.5 | 7.7 | 30.6 | 48.9 |
Running time is counted on a single Maxwell Titan X GPU (mini-batch size is 1 in inference).
- MXNet from the official repository. We tested our code on MXNet@(commit 62ecb60). Due to the rapid development of MXNet, it is recommended to check out this version if you encounter any issues. We may update this repository periodically if MXNet adds important features in future releases.
- Some Python packages might be missing: cython, opencv-python >= 3.2.0, easydict. If pip is set up on your system, these packages can be fetched and installed by running

      pip install Cython
      pip install opencv-python==3.2.0.6
      pip install easydict==1.6
      pip install hickle

- For Windows users, Visual Studio 2015 is needed to compile the cython module.

Any NVIDIA GPU with at least 5GB of memory should be OK.
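
Before building, it can help to confirm that the Python packages listed above are importable. The following is a minimal sketch and not part of this repository: the helper name `missing_modules` is hypothetical, and the import names (`Cython`, `cv2`, `easydict`, `hickle`, `mxnet`) are assumed to be the usual module names for those packages.

```python
import importlib.util

# Assumed import names for the packages listed above
# (opencv-python is imported as cv2).
REQUIRED_MODULES = ["Cython", "cv2", "easydict", "hickle", "mxnet"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that a module can be located; it does not verify versions or that MXNet was built with the FCIS operators.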
- Clone the FCIS repository

      git clone https://github.com/msracver/FCIS.git

- For Windows users, run `cmd .\init.bat`. For Linux users, run `sh ./init.sh`. The scripts will build the cython module automatically and create some folders.
- Copy the operators in `./fcis/operator_cxx` to `$(YOUR_MXNET_FOLDER)/src/operator/contrib` and recompile MXNet.
- Please install MXNet following the official guide of MXNet. For advanced users, you may put your Python package into `./external/mxnet/$(YOUR_MXNET_PACKAGE)`, and modify `MXNET_VERSION` in `./experiments/fcis/cfgs/*.yaml` to `$(YOUR_MXNET_PACKAGE)`. This lets you switch among different versions of MXNet quickly.
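
The operator copy step above can also be scripted. The following is a rough sketch under stated assumptions, not a script shipped with this repository: the function name `copy_fcis_operators` is hypothetical, and it simply copies every file from `fcis/operator_cxx` into the `src/operator/contrib` folder of an MXNet source checkout. MXNet still has to be recompiled afterwards.

```python
import shutil
from pathlib import Path


def copy_fcis_operators(fcis_root, mxnet_root):
    """Copy every file in <fcis_root>/fcis/operator_cxx into
    <mxnet_root>/src/operator/contrib and return the copied file names."""
    src_dir = Path(fcis_root) / "fcis" / "operator_cxx"
    dst_dir = Path(mxnet_root) / "src" / "operator" / "contrib"
    if not src_dir.is_dir():
        raise FileNotFoundError(f"operator source folder not found: {src_dir}")
    if not dst_dir.is_dir():
        raise FileNotFoundError(f"MXNet contrib folder not found: {dst_dir}")

    copied = []
    for src in sorted(src_dir.iterdir()):
        if src.is_file():
            # copy2 keeps timestamps; same-named files in contrib are overwritten
            shutil.copy2(src, dst_dir / src.name)
            copied.append(src.name)
    return copied
```

For example, `copy_fcis_operators(".", "/path/to/mxnet")` would be called from the FCIS repository root, where `/path/to/mxnet` stands in for your own `$(YOUR_MXNET_FOLDER)`.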
- To run the demo with our trained model (on COCO trainval35k), please download the model manually from OneDrive, and put it under the folder `model/`. Make sure it looks like this:

      ./model/fcis_coco-0000.params

- Run

      python ./fcis/demo.py
- Please download the VOC 2012 dataset with additional annotations from SBD. Move the `inst`, `cls`, and `img` folders to VOCdevkit, and make sure it looks like this:

      ./data/VOCdevkit/VOCSDS/img/
      ./data/VOCdevkit/VOCSDS/inst/
      ./data/VOCdevkit/VOCSDS/cls/

  Please use the train&val split in this repo, which follows the protocol of SDS.
- Please download the COCO dataset and the annotations for the 5k image minival subset and val2014 minus minival (val35k). Make sure it looks like this:

      ./data/coco/
      ./data/coco/annotations/instances_valminusminival2014.json
      ./data/coco/annotations/instances_minival2014.json

- Please download the ImageNet-pretrained ResNet-v1-101 model manually from OneDrive, and put it under the folder `./model`. Make sure it looks like this:

      ./model/pretrained_model/resnet_v1_101-0000.params
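
The expected layout can be verified before starting a long training run. The following is a minimal sketch, not part of the repository: the names `EXPECTED_PATHS` and `missing_paths` are hypothetical, and it assumes the data listings above refer to a `data` folder at the repository root, next to `model`.

```python
from pathlib import Path

# Relative paths taken from the preparation steps above.
# Assumption: "data" and "model" both sit at the repository root.
EXPECTED_PATHS = {
    "voc": [
        "data/VOCdevkit/VOCSDS/img",
        "data/VOCdevkit/VOCSDS/inst",
        "data/VOCdevkit/VOCSDS/cls",
    ],
    "coco": [
        "data/coco/annotations/instances_valminusminival2014.json",
        "data/coco/annotations/instances_minival2014.json",
    ],
    "pretrained": ["model/pretrained_model/resnet_v1_101-0000.params"],
}


def missing_paths(repo_root, groups=("voc", "coco", "pretrained")):
    """Return the expected relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [
        rel
        for group in groups
        for rel in EXPECTED_PATHS[group]
        if not (root / rel).exists()
    ]


if __name__ == "__main__":
    for rel in missing_paths("."):
        print("missing:", rel)
```

Passing `groups=("voc", "pretrained")` restricts the check to what a VOC-only experiment needs.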
- All of our experiment settings (GPU #, dataset, etc.) are kept in yaml config files in the folder `./experiments/fcis/cfgs`.
- Two config files have been provided so far: FCIS@COCO with OHEM and FCIS@VOC without OHEM. We use 8 and 4 GPUs to train models on COCO and on VOC, respectively.
- To perform experiments, run the python scripts with the corresponding config file as input. For example, to train and test FCIS on COCO with ResNet-v1-101, use the following command

      python experiments/fcis/fcis_end2end_train_test.py --cfg experiments/fcis/cfgs/resnet_v1_101_coco_fcis_end2end_ohem.yaml

  A cache folder will be created automatically to save the model and the log under `output/fcis/coco/` or `output/fcis/voc/`.
- Please find more details in the config files and in our code.
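
When launching several experiments, the command above can be wrapped in a small helper. The following is a sketch under stated assumptions rather than a tool provided by this repository: `build_command` and `run_experiment` are hypothetical names, and it assumes the interpreter running the helper is the one that has MXNet and the other requirements installed.

```python
import subprocess
import sys

# Script and config paths as given in the usage steps above.
TRAIN_TEST_SCRIPT = "experiments/fcis/fcis_end2end_train_test.py"
COCO_CFG = "experiments/fcis/cfgs/resnet_v1_101_coco_fcis_end2end_ohem.yaml"


def build_command(cfg_path, script=TRAIN_TEST_SCRIPT):
    """Build the argument list for one train-and-test run."""
    return [sys.executable, script, "--cfg", cfg_path]


def run_experiment(cfg_path, repo_root="."):
    """Run the experiment from the repository root; raises
    subprocess.CalledProcessError if the script exits with an error."""
    return subprocess.run(build_command(cfg_path), cwd=repo_root, check=True)
```

Calling `run_experiment(COCO_CFG)` from the repository root is then equivalent to typing the command by hand.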
Code has been tested under:
- Ubuntu 14.04 with a Maxwell Titan X GPU and Intel Xeon CPU E5-2620 v2 @ 2.10GHz
- Windows Server 2012 R2 with 8 K40 GPUs and Intel Xeon CPU E5-2650 v2 @ 2.60GHz
- Windows Server 2012 R2 with 4 Pascal Titan X GPUs and Intel Xeon CPU E5-2650 v4 @ 2.30GHz