Merge pull request #14 from facebookresearch/master

merge from source
Ricardozzf · Apr 8, 2019 · f7f2656
2 parents d81587f + f917a55
commit f7f2656

Showing 27 changed files with 527 additions and 63 deletions.
diff --git a/INSTALL.md b/INSTALL.md
@@ -1,7 +1,7 @@
 ## Installation
 
 ### Requirements:
-- PyTorch 1.0 from a nightly release. Installation instructions can be found in https://pytorch.org/get-started/locally/
+- PyTorch 1.0 from a nightly release. It **will not** work with 1.0 or 1.0.1. Installation instructions can be found in https://pytorch.org/get-started/locally/
 - torchvision from master
 - cocoapi
 - yacs
@@ -24,18 +24,13 @@ conda activate maskrcnn_benchmark
 conda install ipython
 
 # maskrcnn_benchmark and coco api dependencies
-pip install -r requirements.txt
+pip install ninja yacs cython matplotlib tqdm
 
 # follow PyTorch installation in https://pytorch.org/get-started/locally/
 # we give the instructions for CUDA 9.0
-conda install pytorch-nightly cudatoolkit=9.0 -c pytorch
+conda install -c pytorch pytorch-nightly torchvision cudatoolkit=9.0
 
 export INSTALL_DIR=$PWD
-# install torchvision
-cd $INSTALL_DIR
-git clone https://github.com/pytorch/vision.git
-cd vision
-python setup.py install
 
 # install pycocotools
 cd $INSTALL_DIR
@@ -47,12 +42,14 @@ python setup.py build_ext install
 cd $INSTALL_DIR
 git clone https://github.com/facebookresearch/maskrcnn-benchmark.git
 cd maskrcnn-benchmark
+
 # the following will install the lib with
 # symbolic links, so that you can modify
 # the files if you want and won't need to
 # re-build it
 python setup.py build develop
 
+
 unset INSTALL_DIR
 
 # or if you are on macOS
@@ -61,13 +58,17 @@ unset INSTALL_DIR
 
 ### Option 2: Docker Image (Requires CUDA, Linux only)
 
-Build image with defaults (`CUDA=9.0`, `CUDNN=7`):
+Build image with defaults (`CUDA=9.0`, `CUDNN=7`, `FORCE_CUDA=1`):
 
     nvidia-docker build -t maskrcnn-benchmark docker/
 
 Build image with other CUDA and CUDNN versions:
 
-    nvidia-docker build -t maskrcnn-benchmark --build-arg CUDA=9.2 --build-arg CUDNN=7 docker/ 
+    nvidia-docker build -t maskrcnn-benchmark --build-arg CUDA=9.2 --build-arg CUDNN=7 docker/
+
+Build image with FORCE_CUDA disabled:
+
+    nvidia-docker build -t maskrcnn-benchmark --build-arg FORCE_CUDA=0 docker/
 
 Build and run image with built-in jupyter notebook (note that the password is used to log in to jupyter notebook):
 

diff --git a/MODEL_ZOO.md b/MODEL_ZOO.md
@@ -33,7 +33,28 @@ backbone | type | lr sched | im / gpu | train mem(GB) | train time (s/iter) | to
 -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
 R-50-FPN | Keypoint | 1x | 2 | 5.7 | 0.3771 | 9.4 | 0.10941 | 53.7 | 64.3 | 9981060
 
+### Light-weight Model baselines
 
+We provide pre-trained models for selected FBNet models.
+* All the models are trained from scratch with BN using the training schedule specified below.
+* Evaluation is performed on a single NVIDIA V100 GPU with `MODEL.RPN.POST_NMS_TOP_N_TEST` set to `200`.
+
+The following inference times are reported:
+  * inference total batch=8: Total inference time, including data loading, model inference and pre/post-processing, using 8 images per batch.
+  * inference model batch=8: Model inference time only, using 8 images per batch.
+  * inference model batch=1: Model inference time only, using 1 image per batch.
+  * inference caffe2 batch=1: Model inference time for the model in Caffe2 format, using 1 image per batch. The Caffe2 models fuse BN into Conv and run purely in C++/CUDA, using Caffe2 ops for RPN/detection post-processing.
+
+The pre-trained models are available via the links in the model id column.
+
+backbone | type | resolution | lr sched | im / gpu | train mem(GB) | train time (s/iter) | total train time (hr) | inference total batch=8 (s/im) | inference model batch=8 (s/im) | inference model batch=1 (s/im) | inference caffe2 batch=1 (s/im) | box AP | mask AP | model id
+-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
+[R-50-C4](configs/e2e_faster_rcnn_R_50_C4_1x.yaml) (reference) | Fast | 800 | 1x | 1 | 5.8 | 0.4036 | 20.2 | 0.0875 | **0.0793** | 0.0831 | **0.0625** | 34.4 | - | f35857197
+[fbnet_chamv1a](configs/e2e_faster_rcnn_fbnet_chamv1a_600.yaml) | Fast | 600 | 0.75x | 12 | 13.6 | 0.5444 | 20.5 | 0.0315 | **0.0260** | 0.0376 | **0.0188** | 33.5 | - | [f100940543](https://download.pytorch.org/models/maskrcnn/e2e_faster_rcnn_fbnet_chamv1a_600.pth)
+[fbnet_default](configs/e2e_faster_rcnn_fbnet_600.yaml) | Fast | 600 | 0.5x | 16 | 11.1 | 0.4872 | 12.5 | 0.0316 | **0.0250** | 0.0297 | **0.0130** | 28.2 | - | [f101086388](https://download.pytorch.org/models/maskrcnn/e2e_faster_rcnn_fbnet_600.pth)
+[R-50-C4](configs/e2e_mask_rcnn_R_50_C4_1x.yaml) (reference) | Mask | 800 | 1x | 1 | 5.8 | 0.452 | 22.6 | 0.0918 | **0.0848** | 0.0844 | - | 35.2 | 31.0 | f35858791
+[fbnet_xirb16d](configs/e2e_mask_rcnn_fbnet_xirb16d_dsmask_600.yaml) | Mask | 600 | 0.5x | 16 | 13.4 | 1.1732 | 29 | 0.0386 | **0.0319** | 0.0356 | - | 30.7 | 26.9 | [f101086394](https://download.pytorch.org/models/maskrcnn/e2e_mask_rcnn_fbnet_xirb16d_dsmask.pth)
+[fbnet_default](configs/e2e_mask_rcnn_fbnet_600.yaml) | Mask | 600 | 0.5x | 16 | 13.0 | 0.9036 | 23.0 | 0.0327 | **0.0269** | 0.0385 | - | 29.0 | 26.1 | [f101086385](https://download.pytorch.org/models/maskrcnn/e2e_mask_rcnn_fbnet_600.pth)
 
 ## Comparison with Detectron and mmdetection
 

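The inference columns in the light-weight baselines table are latencies in seconds per image, so relative speed is easier to read after converting them. The following is a minimal sketch of that arithmetic, using three values copied from the "inference model batch=8" column; the helper names are illustrative and not part of the repository.

```python
# Latencies (s/im) copied from the "inference model batch=8" column of the
# light-weight baselines table in MODEL_ZOO.md.
MODEL_BATCH8_S_PER_IM = {
    "R-50-C4 (reference)": 0.0793,
    "fbnet_chamv1a": 0.0260,
    "fbnet_default": 0.0250,
}


def images_per_second(seconds_per_image):
    """Convert a latency in s/im into a throughput in im/s."""
    return 1.0 / seconds_per_image


def speedup(reference_s_per_im, model_s_per_im):
    """Return how many times faster a model is than the reference."""
    return reference_s_per_im / model_s_per_im


if __name__ == "__main__":
    reference = MODEL_BATCH8_S_PER_IM["R-50-C4 (reference)"]
    for name, latency in MODEL_BATCH8_S_PER_IM.items():
        print("{}: {:.1f} im/s, {:.2f}x vs reference".format(
            name, images_per_second(latency), speedup(reference, latency)))
```

Note that the reference rows use an 800-pixel input and the FBNet rows a 600-pixel input, so the ratio reflects both the backbone and the resolution.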
diff --git a/README.md b/README.md
@@ -198,11 +198,21 @@ That's it. You can also add extra fields to the boxlist, such as segmentation ma
 
 For a full example of how the `COCODataset` is implemented, check [`maskrcnn_benchmark/data/datasets/coco.py`](maskrcnn_benchmark/data/datasets/coco.py).
 
-### Note:
+Once you have created your dataset, it needs to be added in a couple of places:
+- [`maskrcnn_benchmark/data/datasets/__init__.py`](maskrcnn_benchmark/data/datasets/__init__.py): add it to `__all__`
+- [`maskrcnn_benchmark/config/paths_catalog.py`](maskrcnn_benchmark/config/paths_catalog.py): `DatasetCatalog.DATASETS` and corresponding `if` clause in `DatasetCatalog.get()`
+
+### Testing
 While the aforementioned example should work for training, we leverage the
 cocoApi for computing the accuracies during testing. Thus, for now, test
 datasets should follow the cocoApi format.
 
+To enable your dataset for testing, add a corresponding if statement in [`maskrcnn_benchmark/data/datasets/evaluation/__init__.py`](maskrcnn_benchmark/data/datasets/evaluation/__init__.py):
+```python
+if isinstance(dataset, datasets.MyDataset):
+        return coco_evaluation(**args)
+```
+
 ## Finetuning from Detectron weights on custom datasets
 Create a script `tools/trim_detectron_model.py` like the one [here](https://gist.github.com/wangg12/aea194aa6ab6a4de088f14ee193fd968).
 You can decide which keys to remove and which to keep by modifying the script.
@@ -221,7 +231,7 @@ Please consider citing this project in your publications if it helps your resear
 ```
 @misc{massa2018mrcnn,
 author = {Massa, Francisco and Girshick, Ross},
-title = {{maskrnn-benchmark: Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch}},
+title = {{maskrcnn-benchmark: Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch}},
 year = {2018},
 howpublished = {\url{https://github.com/facebookresearch/maskrcnn-benchmark}},
 note = {Accessed: [Insert date here]}

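The README hunk above asks for an extra `isinstance` branch in `maskrcnn_benchmark/data/datasets/evaluation/__init__.py`. As a rough illustration of that type-based dispatch, here is a self-contained sketch. Every class and function in it is a stand-in defined locally (including `MyDataset` and the simplified `coco_evaluation`), so it shows the pattern only and is not the library's actual code.

```python
class COCODataset:
    """Stand-in for the repository's COCO dataset class."""


class MyDataset:
    """Hypothetical custom dataset whose annotations follow the cocoApi format."""


def coco_evaluation(**kwargs):
    # Stand-in: a real evaluator would compute COCO metrics. Here we only
    # report which keyword arguments were forwarded.
    return {"evaluator": "coco", "received": sorted(kwargs)}


def evaluate(dataset, predictions, output_folder, **kwargs):
    """Route a dataset to an evaluation function based on its type."""
    args = dict(dataset=dataset, predictions=predictions,
                output_folder=output_folder, **kwargs)
    if isinstance(dataset, COCODataset):
        return coco_evaluation(**args)
    if isinstance(dataset, MyDataset):  # the branch added for the new dataset
        return coco_evaluation(**args)
    raise NotImplementedError(
        "Unsupported dataset type {}.".format(type(dataset).__name__))
```

A dataset type without a matching branch falls through to the `NotImplementedError`, which is why a new dataset has to be registered before it can be used for testing.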
diff --git a/configs/caffe2/e2e_faster_rcnn_X_101_32x8d_FPN_1x_caffe2.yaml b/configs/caffe2/e2e_faster_rcnn_X_101_32x8d_FPN_1x_caffe2.yaml
@@ -5,6 +5,9 @@ MODEL:
     CONV_BODY: "R-101-FPN"
   RESNETS:
     BACKBONE_OUT_CHANNELS: 256
+    STRIDE_IN_1X1: False
+    NUM_GROUPS: 32
+    WIDTH_PER_GROUP: 8
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
@@ -20,10 +23,6 @@ MODEL:
     POOLER_SAMPLING_RATIO: 2
     FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
     PREDICTOR: "FPNPredictor"
-  RESNETS:
-    STRIDE_IN_1X1: False
-    NUM_GROUPS: 32
-    WIDTH_PER_GROUP: 8
 DATASETS:
   TEST: ("coco_2014_minival",)
 DATALOADER:

diff --git a/configs/e2e_faster_rcnn_fbnet.yaml b/configs/e2e_faster_rcnn_fbnet.yaml
@@ -15,7 +15,7 @@ MODEL:
     PRE_NMS_TOP_N_TRAIN: 6000
     PRE_NMS_TOP_N_TEST: 6000
     POST_NMS_TOP_N_TRAIN: 2000
-    POST_NMS_TOP_N_TEST: 1000
+    POST_NMS_TOP_N_TEST: 100
     RPN_HEAD: FBNet.rpn_head
   ROI_HEADS:
     BATCH_SIZE_PER_IMAGE: 512

diff --git a/configs/e2e_faster_rcnn_fbnet_600.yaml b/configs/e2e_faster_rcnn_fbnet_600.yaml
@@ -15,7 +15,7 @@ MODEL:
     PRE_NMS_TOP_N_TRAIN: 6000
     PRE_NMS_TOP_N_TEST: 6000
     POST_NMS_TOP_N_TRAIN: 2000
-    POST_NMS_TOP_N_TEST: 1000
+    POST_NMS_TOP_N_TEST: 200
     RPN_HEAD: FBNet.rpn_head
   ROI_HEADS:
     BATCH_SIZE_PER_IMAGE: 256

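The FBNet config hunks lower `POST_NMS_TOP_N_TEST` from 1000 to 100 or 200. Assuming, as the name suggests, that this setting caps how many RPN proposals survive after NMS at test time, its effect can be sketched as a plain top-N selection by score. The function and the synthetic data below are illustrative and are not the repository's implementation.

```python
def keep_top_n(proposals, post_nms_top_n):
    """Keep the `post_nms_top_n` highest-scoring proposals.

    `proposals` is a list of (box, score) pairs that are assumed to have
    already been through NMS.
    """
    ranked = sorted(proposals, key=lambda pair: pair[1], reverse=True)
    return ranked[:post_nms_top_n]


if __name__ == "__main__":
    # 1000 synthetic proposals with increasing scores.
    proposals = [((0, 0, i + 1, i + 1), i / 1000.0) for i in range(1000)]
    kept = keep_top_n(proposals, post_nms_top_n=200)
    print(len(kept), kept[0][1], kept[-1][1])
```

Under that assumption, a smaller cap means fewer regions reach the ROI heads, which is consistent with the lower inference times reported for the FBNet models.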
diff --git a/configs/e2e_faster_rcnn_fbnet_chamv1a_600.yaml b/configs/e2e_faster_rcnn_fbnet_chamv1a_600.yaml
@@ -0,0 +1,44 @@
+MODEL:
+  META_ARCHITECTURE: "GeneralizedRCNN"
+  BACKBONE:
+    CONV_BODY: FBNet
+  FBNET:
+    ARCH: "cham_v1a"
+    BN_TYPE: "bn"
+    WIDTH_DIVISOR: 8
+    DW_CONV_SKIP_BN: True
+    DW_CONV_SKIP_RELU: True
+  RPN:
+    ANCHOR_SIZES: (32, 64, 128, 256, 512)
+    ANCHOR_STRIDE: (16, )
+    BATCH_SIZE_PER_IMAGE: 256
+    PRE_NMS_TOP_N_TRAIN: 6000
+    PRE_NMS_TOP_N_TEST: 6000
+    POST_NMS_TOP_N_TRAIN: 2000
+    POST_NMS_TOP_N_TEST: 200
+    RPN_HEAD: FBNet.rpn_head
+  ROI_HEADS:
+    BATCH_SIZE_PER_IMAGE: 128
+  ROI_BOX_HEAD:
+    POOLER_RESOLUTION: 6
+    FEATURE_EXTRACTOR: FBNet.roi_head
+    NUM_CLASSES: 81
+DATASETS:
+  TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
+  TEST: ("coco_2014_minival",)
+SOLVER:
+  BASE_LR: 0.045
+  WARMUP_FACTOR: 0.1
+  WEIGHT_DECAY: 0.0001
+  STEPS: (90000, 120000)
+  MAX_ITER: 135000
+  IMS_PER_BATCH: 96  # for 8GPUs
+# TEST:
+#   IMS_PER_BATCH: 8
+INPUT:
+  MIN_SIZE_TRAIN: (600, )
+  MAX_SIZE_TRAIN: 1000
+  MIN_SIZE_TEST: 600
+  MAX_SIZE_TEST: 1000
+  PIXEL_MEAN: [103.53, 116.28, 123.675]
+  PIXEL_STD: [57.375, 57.12, 58.395]
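The `SOLVER` block of the new `e2e_faster_rcnn_fbnet_chamv1a_600.yaml` can be read as a per-GPU batch size plus a stepped learning-rate schedule. The sketch below works through that arithmetic. The decay factor `gamma=0.1` is an assumption (the config does not set it), warmup is ignored, and the helper names are hypothetical.

```python
import bisect


def per_gpu_batch(ims_per_batch, num_gpus):
    """Split the global batch evenly across GPUs."""
    if ims_per_batch % num_gpus != 0:
        raise ValueError("IMS_PER_BATCH must be divisible by the number of GPUs")
    return ims_per_batch // num_gpus


def step_lr(iteration, base_lr, steps, gamma=0.1):
    """Learning rate of a multi-step schedule, ignoring warmup.

    gamma=0.1 is an assumed decay factor; it is not given in the config.
    """
    # Count how many milestones have been reached at this iteration.
    num_decays = bisect.bisect_right(steps, iteration)
    return base_lr * gamma ** num_decays


if __name__ == "__main__":
    # Values from the SOLVER block: IMS_PER_BATCH 96 on 8 GPUs,
    # BASE_LR 0.045, STEPS (90000, 120000), MAX_ITER 135000.
    print(per_gpu_batch(96, 8))
    for it in (0, 90000, 120000, 134999):
        print(it, step_lr(it, 0.045, (90000, 120000)))
```

With 8 GPUs the global batch of 96 gives 12 images per GPU, which matches the "im / gpu" value listed for `fbnet_chamv1a` in the model zoo table.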
diff --git a/configs/e2e_mask_rcnn_fbnet.yaml b/configs/e2e_mask_rcnn_fbnet.yaml
@@ -8,15 +8,15 @@ MODEL:
     WIDTH_DIVISOR: 8
     DW_CONV_SKIP_BN: True
     DW_CONV_SKIP_RELU: True
-    DET_HEAD_LAST_SCALE: -1.0
+    DET_HEAD_LAST_SCALE: 0.0
   RPN:
     ANCHOR_SIZES: (16, 32, 64, 128, 256)
     ANCHOR_STRIDE: (16, )
     BATCH_SIZE_PER_IMAGE: 256
     PRE_NMS_TOP_N_TRAIN: 6000
     PRE_NMS_TOP_N_TEST: 6000
     POST_NMS_TOP_N_TRAIN: 2000
-    POST_NMS_TOP_N_TEST: 1000
+    POST_NMS_TOP_N_TEST: 100
     RPN_HEAD: FBNet.rpn_head
   ROI_HEADS:
     BATCH_SIZE_PER_IMAGE: 256

diff --git a/configs/e2e_mask_rcnn_fbnet_600.yaml b/configs/e2e_mask_rcnn_fbnet_600.yaml
@@ -0,0 +1,52 @@
+MODEL:
+  META_ARCHITECTURE: "GeneralizedRCNN"
+  BACKBONE:
+    CONV_BODY: FBNet
+  FBNET:
+    ARCH: "default"
+    BN_TYPE: "bn"
+    WIDTH_DIVISOR: 8
+    DW_CONV_SKIP_BN: True
+    DW_CONV_SKIP_RELU: True
+    DET_HEAD_LAST_SCALE: 0.0
+  RPN:
+    ANCHOR_SIZES: (32, 64, 128, 256, 512)
+    ANCHOR_STRIDE: (16, )
+    BATCH_SIZE_PER_IMAGE: 256
+    PRE_NMS_TOP_N_TRAIN: 6000
+    PRE_NMS_TOP_N_TEST: 6000
+    POST_NMS_TOP_N_TRAIN: 2000
+    POST_NMS_TOP_N_TEST: 200
+    RPN_HEAD: FBNet.rpn_head
+  ROI_HEADS:
+    BATCH_SIZE_PER_IMAGE: 256
+  ROI_BOX_HEAD:
+    POOLER_RESOLUTION: 6
+    FEATURE_EXTRACTOR: FBNet.roi_head
+    NUM_CLASSES: 81
+  ROI_MASK_HEAD:
+    POOLER_RESOLUTION: 6
+    FEATURE_EXTRACTOR: FBNet.roi_head_mask
+    PREDICTOR: "MaskRCNNConv1x1Predictor"
+    RESOLUTION: 12
+    SHARE_BOX_FEATURE_EXTRACTOR: False
+  MASK_ON: True
+DATASETS:
+  TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
+  TEST: ("coco_2014_minival",)
+SOLVER:
+  BASE_LR: 0.06
+  WARMUP_FACTOR: 0.1
+  WEIGHT_DECAY: 0.0001
+  STEPS: (60000, 80000)
+  MAX_ITER: 90000
+  IMS_PER_BATCH: 128  # for 8GPUs
+# TEST:
+#   IMS_PER_BATCH: 8
+INPUT:
+  MIN_SIZE_TRAIN: (600, )
+  MAX_SIZE_TRAIN: 1000
+  MIN_SIZE_TEST: 600
+  MAX_SIZE_TEST: 1000
+  PIXEL_MEAN: [103.53, 116.28, 123.675]
+  PIXEL_STD: [57.375, 57.12, 58.395]
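The `PIXEL_MEAN` and `PIXEL_STD` values in the new FBNet configs coincide with the commonly used ImageNet per-channel statistics, scaled to the 0-255 range and listed in reverse (BGR) channel order. The sketch below checks that correspondence and shows the usual `(pixel - mean) / std` normalization. That the input pipeline feeds BGR images in the 0-255 range is an assumption here, and the helper names are illustrative.

```python
# Widely used ImageNet statistics for RGB images scaled to [0, 1].
IMAGENET_MEAN_RGB = (0.485, 0.456, 0.406)
IMAGENET_STD_RGB = (0.229, 0.224, 0.225)

# Values from the INPUT block of the config.
PIXEL_MEAN = [103.53, 116.28, 123.675]
PIXEL_STD = [57.375, 57.12, 58.395]


def to_bgr_255(rgb_unit_values):
    """Reverse the channel order (RGB to BGR) and rescale from [0, 1] to [0, 255]."""
    return [value * 255.0 for value in reversed(rgb_unit_values)]


def normalize_pixel(bgr_pixel, mean, std):
    """Apply per-channel (pixel - mean) / std normalization to one pixel."""
    return [(p - m) / s for p, m, s in zip(bgr_pixel, mean, std)]


if __name__ == "__main__":
    print(to_bgr_255(IMAGENET_MEAN_RGB))
    print(to_bgr_255(IMAGENET_STD_RGB))
    print(normalize_pixel([128.0, 128.0, 128.0], PIXEL_MEAN, PIXEL_STD))
```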
diff --git a/configs/e2e_mask_rcnn_fbnet_xirb16d_dsmask.yaml b/configs/e2e_mask_rcnn_fbnet_xirb16d_dsmask.yaml
@@ -16,7 +16,7 @@ MODEL:
     PRE_NMS_TOP_N_TRAIN: 6000
     PRE_NMS_TOP_N_TEST: 6000
     POST_NMS_TOP_N_TRAIN: 2000
-    POST_NMS_TOP_N_TEST: 1000
+    POST_NMS_TOP_N_TEST: 100
     RPN_HEAD: FBNet.rpn_head
   ROI_HEADS:
     BATCH_SIZE_PER_IMAGE: 512

diff --git a/configs/e2e_mask_rcnn_fbnet_xirb16d_dsmask_600.yaml b/configs/e2e_mask_rcnn_fbnet_xirb16d_dsmask_600.yaml
@@ -0,0 +1,52 @@
+MODEL:
+  META_ARCHITECTURE: "GeneralizedRCNN"
+  BACKBONE:
+    CONV_BODY: FBNet
+  FBNET:
+    ARCH: "xirb16d_dsmask"
+    BN_TYPE: "bn"
+    WIDTH_DIVISOR: 8
+    DW_CONV_SKIP_BN: True
+    DW_CONV_SKIP_RELU: True
+    DET_HEAD_LAST_SCALE: 0.0
+  RPN:
+    ANCHOR_SIZES: (32, 64, 128, 256, 512)
+    ANCHOR_STRIDE: (16, )
+    BATCH_SIZE_PER_IMAGE: 256
+    PRE_NMS_TOP_N_TRAIN: 6000
+    PRE_NMS_TOP_N_TEST: 6000
+    POST_NMS_TOP_N_TRAIN: 2000
+    POST_NMS_TOP_N_TEST: 200
+    RPN_HEAD: FBNet.rpn_head
+  ROI_HEADS:
+    BATCH_SIZE_PER_IMAGE: 256
+  ROI_BOX_HEAD:
+    POOLER_RESOLUTION: 6
+    FEATURE_EXTRACTOR: FBNet.roi_head
+    NUM_CLASSES: 81
+  ROI_MASK_HEAD:
+    POOLER_RESOLUTION: 6
+    FEATURE_EXTRACTOR: FBNet.roi_head_mask
+    PREDICTOR: "MaskRCNNConv1x1Predictor"
+    RESOLUTION: 12
+    SHARE_BOX_FEATURE_EXTRACTOR: False
+  MASK_ON: True
+DATASETS:
+  TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
+  TEST: ("coco_2014_minival",)
+SOLVER:
+  BASE_LR: 0.06
+  WARMUP_FACTOR: 0.1
+  WEIGHT_DECAY: 0.0001
+  STEPS: (60000, 80000)
+  MAX_ITER: 90000
+  IMS_PER_BATCH: 128  # for 8GPUs
+# TEST:
+#   IMS_PER_BATCH: 8
+INPUT:
+  MIN_SIZE_TRAIN: (600, )
+  MAX_SIZE_TRAIN: 1000
+  MIN_SIZE_TEST: 600
+  MAX_SIZE_TEST: 1000
+  PIXEL_MEAN: [103.53, 116.28, 123.675]
+  PIXEL_STD: [57.375, 57.12, 58.395]
diff --git a/configs/quick_schedules/e2e_faster_rcnn_X_101_32x8d_FPN_quick.yaml b/configs/quick_schedules/e2e_faster_rcnn_X_101_32x8d_FPN_quick.yaml
@@ -5,6 +5,9 @@ MODEL:
     CONV_BODY: "R-101-FPN"
   RESNETS:
     BACKBONE_OUT_CHANNELS: 256
+    STRIDE_IN_1X1: False
+    NUM_GROUPS: 32
+    WIDTH_PER_GROUP: 8
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
@@ -21,10 +24,6 @@ MODEL:
     POOLER_SAMPLING_RATIO: 2
     FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
     PREDICTOR: "FPNPredictor"
-  RESNETS:
-    STRIDE_IN_1X1: False
-    NUM_GROUPS: 32
-    WIDTH_PER_GROUP: 8
 DATASETS:
   TRAIN: ("coco_2014_minival",)
   TEST: ("coco_2014_minival",)

diff --git a/configs/quick_schedules/e2e_mask_rcnn_X_101_32x8d_FPN_quick.yaml b/configs/quick_schedules/e2e_mask_rcnn_X_101_32x8d_FPN_quick.yaml
@@ -5,6 +5,9 @@ MODEL:
     CONV_BODY: "R-101-FPN"
   RESNETS:
     BACKBONE_OUT_CHANNELS: 256
+    STRIDE_IN_1X1: False
+    NUM_GROUPS: 32
+    WIDTH_PER_GROUP: 8
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
@@ -29,10 +32,6 @@ MODEL:
     POOLER_SAMPLING_RATIO: 2
     RESOLUTION: 28
     SHARE_BOX_FEATURE_EXTRACTOR: False
-  RESNETS:
-    STRIDE_IN_1X1: False
-    NUM_GROUPS: 32
-    WIDTH_PER_GROUP: 8
   MASK_ON: True
 DATASETS:
   TRAIN: ("coco_2014_minival",)