Commit

Merge pull request facebookresearch#2 from chengyangfu/retina_maskrcnn
Retina maskrcnn
Cheng-Yang Fu authored Jan 7, 2019
2 parents f2fd7ed + 2d96517 commit 95b8982
Showing 77 changed files with 4,159 additions and 49 deletions.
6 changes: 3 additions & 3 deletions ABSTRACTIONS.md
@@ -31,15 +31,15 @@ a specific image, as well as the size of the image as a `(width, height)` tuple.
It also contains a set of methods that allow to perform geometric
transformations to the bounding boxes (such as cropping, scaling and flipping).
The class accepts bounding boxes from two different input formats:
-- `xyxy`, where each box is encoded as a `x1`, `y1`, `x2` and `y2` coordinates)
+- `xyxy`, where each box is encoded as a `x1`, `y1`, `x2` and `y2` coordinates, and
- `xywh`, where each box is encoded as `x1`, `y1`, `w` and `h`.

Additionally, each `BoxList` instance can also hold arbitrary additional information
for each bounding box, such as labels, visibility, probability scores etc.
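
As a rough illustration of the two encodings described above (this is not part of the commit), the following sketch converts a box between `xyxy` and `xywh` in plain Python. The helper names are hypothetical, and the sketch assumes width is simply `x2 - x1`; the library may use a different pixel convention (for example a one-pixel offset), so its own `BoxList.convert` method remains the reference.

```python
from typing import List, Sequence


def xyxy_to_xywh(box: Sequence[float]) -> List[float]:
    """Convert [x1, y1, x2, y2] to [x1, y1, w, h] (assumes w = x2 - x1)."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]


def xywh_to_xyxy(box: Sequence[float]) -> List[float]:
    """Convert [x1, y1, w, h] back to [x1, y1, x2, y2] under the same assumption."""
    x1, y1, w, h = box
    return [x1, y1, x1 + w, y1 + h]


box = [10, 10, 50, 50]             # one of the example boxes, in xyxy form
as_xywh = xyxy_to_xywh(box)        # same corner, plus width and height
round_trip = xywh_to_xyxy(as_xywh)
```

The two helpers are inverses of each other, so a round trip returns the original coordinates.
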

Here is an example on how to create a `BoxList` from a list of coordinates:
```python
-from maskrcnn_baseline.structures.bounding_box import BoxList, FLIP_LEFT_RIGHT
+from maskrcnn_benchmark.structures.bounding_box import BoxList, FLIP_LEFT_RIGHT

width = 100
height = 200
@@ -49,7 +49,7 @@ boxes = [
[10, 10, 50, 50]
]
# create a BoxList with 3 boxes
-bbox = BoxList(boxes, size=(width, height), mode='xyxy')
+bbox = BoxList(boxes, image_size=(width, height), mode='xyxy')

# perform some box transformations, has similar API as PIL.Image
bbox_scaled = bbox.resize((width * 2, height * 3))
48 changes: 48 additions & 0 deletions configs/retina/retinanet_R-101-FPN_1x.yaml
@@ -0,0 +1,48 @@
MODEL:
  META_ARCHITECTURE: "RetinaNet"
  WEIGHT: "catalog://ImageNetPretrained/MSRA/R-101"
  RPN_ONLY: True
  BACKBONE:
    CONV_BODY: "R-101-FPN"
    OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    FG_IOU_THRESHOLD: 0.5
    BG_IOU_THRESHOLD: 0.4
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    PRE_NMS_TOP_N_TRAIN: 2000
    PRE_NMS_TOP_N_TEST: 1000
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
  ROI_HEADS:
    USE_FPN: True
    BATCH_SIZE_PER_IMAGE: 256
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
INPUT:
  MIN_SIZE_TRAIN: (800, )
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  # Assume 4 gpus
  BASE_LR: 0.005
  WEIGHT_DECAY: 0.0001
  STEPS: (120000, 160000)
  MAX_ITER: 180000
  IMS_PER_BATCH: 8
RETINANET:
  RETINANET_ON: True
  SCALES_PER_OCTAVE: 3
  STRADDLE_THRESH: -1
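
The `SOLVER` block above pairs `BASE_LR: 0.005` with `STEPS: (120000, 160000)` and `MAX_ITER: 180000`. A minimal sketch of how a multi-step schedule would typically interpret those values follows. The decay factor `gamma=0.1` is an assumption (it does not appear in this config), warmup is ignored, and the function name is hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def step_lr(iteration: int, base_lr: float, steps: Sequence[int],
            gamma: float = 0.1) -> float:
    """Learning rate at `iteration` for a multi-step decay schedule.

    The rate is multiplied by `gamma` once for every milestone in `steps`
    that has been reached. `gamma=0.1` is an assumed value, not from the config.
    """
    milestones_passed = bisect_right(steps, iteration)
    return base_lr * gamma ** milestones_passed


steps = (120000, 160000)
for it in (0, 119999, 120000, 160000, 179999):
    print(it, step_lr(it, base_lr=0.005, steps=steps))
```

Under these assumptions the rate stays at 0.005 until iteration 120000, drops by a factor of ten there, and drops again at 160000.
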


46 changes: 46 additions & 0 deletions configs/retina/retinanet_R-50-FPN_1x.yaml
@@ -0,0 +1,46 @@
MODEL:
  META_ARCHITECTURE: "RetinaNet"
  WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
  RPN_ONLY: True
  BACKBONE:
    CONV_BODY: "R-50-FPN"
    OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    FG_IOU_THRESHOLD: 0.5
    BG_IOU_THRESHOLD: 0.4
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    PRE_NMS_TOP_N_TRAIN: 2000
    PRE_NMS_TOP_N_TEST: 1000
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
  ROI_HEADS:
    USE_FPN: True
    BATCH_SIZE_PER_IMAGE: 256
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
INPUT:
  MIN_SIZE_TRAIN: (800,)
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  # Assume 4 gpus
  BASE_LR: 0.01
  WEIGHT_DECAY: 0.0001
  STEPS: (60000, 80000)
  MAX_ITER: 90000
  IMS_PER_BATCH: 16
RETINANET:
  RETINANET_ON: True
  SCALES_PER_OCTAVE: 3
  STRADDLE_THRESH: -1
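
The `FG_IOU_THRESHOLD: 0.5` and `BG_IOU_THRESHOLD: 0.4` settings above describe how anchors are usually split into foreground, background, and ignored sets. The sketch below shows that rule for a single anchor, given its best IoU with any ground-truth box. The function name and the label values (1, 0, -1) are illustrative assumptions, and any special handling of low-quality matches is not modelled.

```python
def label_anchor(max_iou: float, fg_thresh: float = 0.5,
                 bg_thresh: float = 0.4) -> int:
    """Assign a training label to one anchor from its best IoU.

    Returns 1 (foreground), 0 (background), or -1 (ignored during training).
    The label encoding is an illustrative choice, not taken from the commit.
    """
    if max_iou >= fg_thresh:
        return 1          # overlaps a ground-truth box well enough
    if max_iou < bg_thresh:
        return 0          # clearly background
    return -1             # ambiguous band between the two thresholds


labels = [label_anchor(iou) for iou in (0.05, 0.39, 0.45, 0.5, 0.9)]
```

Anchors whose best IoU falls in the band between the two thresholds contribute to neither the foreground nor the background loss.
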
47 changes: 47 additions & 0 deletions configs/retina/retinanet_R-50-FPN_1x_adjust_std011.yaml
@@ -0,0 +1,47 @@
MODEL:
  META_ARCHITECTURE: "RetinaNet"
  WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
  RPN_ONLY: True
  BACKBONE:
    CONV_BODY: "R-50-FPN"
    OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    FG_IOU_THRESHOLD: 0.5
    BG_IOU_THRESHOLD: 0.4
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    PRE_NMS_TOP_N_TRAIN: 2000
    PRE_NMS_TOP_N_TEST: 1000
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
  ROI_HEADS:
    USE_FPN: True
    BATCH_SIZE_PER_IMAGE: 256
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
INPUT:
  MIN_SIZE_TRAIN: (800,)
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  # Assume 4 gpus
  BASE_LR: 0.01
  WEIGHT_DECAY: 0.0001
  STEPS: (60000, 80000)
  MAX_ITER: 90000
  IMS_PER_BATCH: 16
RETINANET:
  RETINANET_ON: True
  SCALES_PER_OCTAVE: 3
  STRADDLE_THRESH: -1
  SELFADJUST_SMOOTH_L1: True
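
`SCALES_PER_OCTAVE: 3` in the configs above follows the RetinaNet convention of placing several anchor scales between successive powers of two. A small sketch of that arithmetic is shown below. The base anchor size of 32 pixels is an assumption (the configs in this commit do not list anchor sizes), and both function names are hypothetical.

```python
from typing import List


def octave_scales(scales_per_octave: int) -> List[float]:
    """Scale multipliers 2**(i / n) for i = 0 .. n-1 within one octave."""
    return [2 ** (i / scales_per_octave) for i in range(scales_per_octave)]


def anchor_sizes(base_size: float, scales_per_octave: int) -> List[float]:
    """Anchor side lengths at one pyramid level for a given base size."""
    return [base_size * s for s in octave_scales(scales_per_octave)]


multipliers = octave_scales(3)        # three scales per octave
sizes = anchor_sizes(32, 3)           # base size 32 is an assumed value
```

With three scales per octave, each anchor is about 1.26 times larger than the previous one, so the set covers the gap up to the next pyramid level's base size.
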
48 changes: 48 additions & 0 deletions configs/retina/retinanet_R-50-FPN_1x_adjust_std100.yaml
@@ -0,0 +1,48 @@
MODEL:
  META_ARCHITECTURE: "RetinaNet"
  WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
  RPN_ONLY: True
  BACKBONE:
    CONV_BODY: "R-50-FPN"
    OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    FG_IOU_THRESHOLD: 0.5
    BG_IOU_THRESHOLD: 0.4
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    PRE_NMS_TOP_N_TRAIN: 2000
    PRE_NMS_TOP_N_TEST: 1000
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
  ROI_HEADS:
    USE_FPN: True
    BATCH_SIZE_PER_IMAGE: 256
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
INPUT:
  MIN_SIZE_TRAIN: (800,)
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  # Assume 4 gpus
  BASE_LR: 0.01
  WEIGHT_DECAY: 0.0001
  STEPS: (60000, 80000)
  MAX_ITER: 90000
  IMS_PER_BATCH: 16
RETINANET:
  RETINANET_ON: True
  SCALES_PER_OCTAVE: 3
  STRADDLE_THRESH: -1
  BBOX_REG_BETA: 1.0
  SELFADJUST_SMOOTH_L1: True
47 changes: 47 additions & 0 deletions configs/retina/retinanet_R-50-FPN_1x_adjustl1.yaml
@@ -0,0 +1,47 @@
MODEL:
  META_ARCHITECTURE: "RetinaNet"
  WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
  RPN_ONLY: True
  BACKBONE:
    CONV_BODY: "R-50-FPN"
    OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    FG_IOU_THRESHOLD: 0.5
    BG_IOU_THRESHOLD: 0.4
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    PRE_NMS_TOP_N_TRAIN: 2000
    PRE_NMS_TOP_N_TEST: 1000
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
  ROI_HEADS:
    USE_FPN: True
    BATCH_SIZE_PER_IMAGE: 256
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
INPUT:
  MIN_SIZE_TRAIN: (800,)
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  # Assume 4 gpus
  BASE_LR: 0.01
  WEIGHT_DECAY: 0.0001
  STEPS: (60000, 80000)
  MAX_ITER: 90000
  IMS_PER_BATCH: 16
RETINANET:
  RETINANET_ON: True
  SCALES_PER_OCTAVE: 3
  STRADDLE_THRESH: -1
  SELFADJUST_SMOOTH_L1: True
48 changes: 48 additions & 0 deletions configs/retina/retinanet_R-50-FPN_1x_beta100.yaml
@@ -0,0 +1,48 @@
MODEL:
  META_ARCHITECTURE: "RetinaNet"
  WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
  RPN_ONLY: True
  BACKBONE:
    CONV_BODY: "R-50-FPN"
    OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    FG_IOU_THRESHOLD: 0.5
    BG_IOU_THRESHOLD: 0.4
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    PRE_NMS_TOP_N_TRAIN: 2000
    PRE_NMS_TOP_N_TEST: 1000
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
  ROI_HEADS:
    USE_FPN: True
    BATCH_SIZE_PER_IMAGE: 256
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
INPUT:
  MIN_SIZE_TRAIN: (800,)
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  # Assume 4 gpus
  BASE_LR: 0.01
  WEIGHT_DECAY: 0.0001
  STEPS: (60000, 80000)
  MAX_ITER: 90000
  IMS_PER_BATCH: 16
RETINANET:
  RETINANET_ON: True
  SCALES_PER_OCTAVE: 3
  STRADDLE_THRESH: -1
  BBOX_REG_BETA: 1.0
  SELFADJUST_SMOOTH_L1: False
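
`BBOX_REG_BETA: 1.0` above most likely sets the transition point of the smooth L1 box-regression loss. The sketch below uses the common definition of that loss for a single residual. It assumes this fork follows the standard form, the function name is hypothetical, and the self-adjusting variant toggled by `SELFADJUST_SMOOTH_L1` is not modelled.

```python
def smooth_l1(residual: float, beta: float) -> float:
    """Standard smooth L1 loss for one regression residual.

    Quadratic for |residual| < beta, linear beyond it. The two branches
    meet at |residual| == beta, where both give 0.5 * beta.
    """
    abs_r = abs(residual)
    if abs_r < beta:
        return 0.5 * abs_r * abs_r / beta
    return abs_r - 0.5 * beta


# Compare a wide quadratic region (beta=1.0) with a narrow one (beta=0.1).
wide = [smooth_l1(r, beta=1.0) for r in (0.0, 0.5, 1.0, 2.0)]
narrow = [smooth_l1(r, beta=0.1) for r in (0.0, 0.5, 1.0, 2.0)]
```

A larger `beta` widens the quadratic region, so small residuals are penalised more gently; a small `beta` makes the loss behave almost like plain L1.
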
47 changes: 47 additions & 0 deletions configs/retina/retinanet_R-50-FPN_1x_low_quality_0.2.yaml
@@ -0,0 +1,47 @@
MODEL:
  META_ARCHITECTURE: "RetinaNet"
  WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
  RPN_ONLY: True
  BACKBONE:
    CONV_BODY: "R-50-FPN"
    OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    FG_IOU_THRESHOLD: 0.5
    BG_IOU_THRESHOLD: 0.4
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    PRE_NMS_TOP_N_TRAIN: 2000
    PRE_NMS_TOP_N_TEST: 1000
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
  ROI_HEADS:
    USE_FPN: True
    BATCH_SIZE_PER_IMAGE: 256
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
INPUT:
  MIN_SIZE_TRAIN: (800,)
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  # Assume 4 gpus
  BASE_LR: 0.01
  WEIGHT_DECAY: 0.0001
  STEPS: (60000, 80000)
  MAX_ITER: 90000
  IMS_PER_BATCH: 16
RETINANET:
  RETINANET_ON: True
  SCALES_PER_OCTAVE: 3
  STRADDLE_THRESH: -1
  LOW_QUALITY_THRESHOLD: 0.4