This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Support more datasets #232

Merged
merged 18 commits into from
Dec 5, 2018

Conversation

henrywang1
Contributor

I would like to make a proposal regarding support for other datasets (e.g. #168 and #207).
#168 has no clear solution so far, and #207 does not support evaluating segmentation results.

Inspired by Detectron, I propose adding a new config option, "DATASETS.FORCE_USE_JSON_ANNOTATION".
If the option is True, maskrcnn-benchmark uses the COCO API to load the specified dataset and evaluates the result with COCO-style AP (i.e. AP, AP50, AP75, APs, APm, APl for bbox and segm).
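
For readers unfamiliar with the format, a COCO-style annotation file is a single JSON document with `images`, `annotations`, and `categories` lists. The sketch below builds a minimal one using only the standard library. The file name, image size, box coordinates, and category are made-up placeholders, not values from this PR.

```python
import json

# Minimal COCO-style annotation structure. All concrete values are
# illustrative placeholders, not taken from any real dataset.
coco_style = {
    "images": [
        {"id": 1, "file_name": "example.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # refers to images[].id
            "category_id": 1,    # refers to categories[].id
            "bbox": [100.0, 120.0, 50.0, 80.0],  # [x, y, width, height]
            "area": 50.0 * 80.0,
            "iscrowd": 0,
            # One polygon as a flat [x1, y1, x2, y2, ...] list.
            "segmentation": [
                [100.0, 120.0, 150.0, 120.0, 150.0, 200.0, 100.0, 200.0]
            ],
        },
    ],
    "categories": [{"id": 1, "name": "person"}],
}

# Serialize to the JSON text that would be written to the annotation file.
annotation_json = json.dumps(coco_style, indent=2)
```

Both bbox and segm metrics can be computed from a file like this because each annotation carries a box and a polygon.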

In addition to the new option, I also

  • Provide the steps to run on other datasets (maskrcnn_benchmark/data/Readme.md)
  • Add an example configuration for Cityscapes
  • Organize the code for converting the Cityscapes dataset to COCO style

You can try the following commands to evaluate the changes.

python train_net.py --config-file "configs/pascal_voc/e2e_mask_rcnn_R_50_FPN_1x.yaml" DATASETS.FORCE_USE_JSON_ANNOTATION True
python train_net.py --config-file "configs/e2e_faster_rcnn_R_50_FPN_1x.yaml" DATASETS.FORCE_USE_JSON_ANNOTATION True
python train_net.py --config-file "configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" DATASETS.FORCE_USE_JSON_ANNOTATION True
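
Each command above ends with a KEY VALUE pair that overrides one config entry. As a rough illustration of that idea, the sketch below applies such pairs to a nested dictionary. `merge_from_list` here is a hypothetical stand-in written for this example; it is not the project's actual config code and only handles boolean and string values.

```python
def merge_from_list(cfg, opts):
    """Apply ["A.B", "value", ...] pairs to a nested dict in place.

    Hypothetical helper that mimics command-line overrides; not the
    project's actual config implementation.
    """
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for dotted_key, raw_value in zip(opts[0::2], opts[1::2]):
        *parents, leaf = dotted_key.split(".")
        node = cfg
        for part in parents:
            node = node[part]  # KeyError if the section does not exist
        if leaf not in node:
            raise KeyError("unknown config key: " + dotted_key)
        # Interpret "True"/"False" as booleans; keep anything else as a string.
        node[leaf] = {"True": True, "False": False}.get(raw_value, raw_value)
    return cfg


# Default value assumed to be False for this illustration.
cfg = {"DATASETS": {"FORCE_USE_JSON_ANNOTATION": False}}
merge_from_list(cfg, ["DATASETS.FORCE_USE_JSON_ANNOTATION", "True"])
```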

Thanks!

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Nov 29, 2018
Contributor

@fmassa fmassa left a comment

Thanks a lot for the PR, it will definitely be very helpful to the community!

I have left a few comments, let me know what you think.

The biggest comment is that I'd rather avoid having the new config flag, and instead I'd go with a new name for the dataset, like voc_2007_train_cocostyle.

Thoughts?

maskrcnn_benchmark/data/datasets/voc.py Outdated
maskrcnn_benchmark/engine/trainer.py Outdated
maskrcnn_benchmark/config/defaults.py Outdated
maskrcnn_benchmark/config/paths_catalog.py Outdated
maskrcnn_benchmark/config/paths_catalog.py Outdated
maskrcnn_benchmark/config/paths_catalog.py Outdated
maskrcnn_benchmark/config/paths_catalog.py Outdated
tools/instances2dict_with_polygons.py Outdated
@henrywang1
Contributor Author

Hi @fmassa
I agree with all your comments, so I made some changes.
While modifying the code, I came up with a few ideas.

Instead of having two entries in paths_catalog.py

"voc_2012_train": ("voc/VOC2012", 'train'),
"voc_2012_train_cocostyle": ("voc/VOC2012/JPEGImages", "voc/VOC2012/Annotations/pascal_test2012.json"),

If we use a dictionary instead of a tuple, it will be more readable

"voc_2012_train": {
    "data_dir": "voc/VOC2012",
    "img_dir": "voc/VOC2012/JPEGImages",
    "ann_file": "voc/VOC2012/Annotations/pascal_train2012.json",
    "split": "train"
},

In addition, I think generating the *_cocostyle entries in DATASETS programmatically is clearer.

# Add a "<name>_cocostyle" alias for every dataset that is not already a COCO dataset.
coco_style_datasets = {
    key + "_cocostyle": value for key, value in DATASETS.items() if "coco" not in key
}
DATASETS = {**DATASETS, **coco_style_datasets}
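
To make the effect of that comprehension concrete, here is a small self-contained run. The `voc_2012_train` entry is copied from the dictionary proposal above; the `coco_2014_train` entry and its paths are assumptions added only to show that names already containing "coco" are skipped.

```python
DATASETS = {
    # Assumed example entry; paths are placeholders.
    "coco_2014_train": {
        "img_dir": "coco/train2014",
        "ann_file": "coco/annotations/instances_train2014.json",
    },
    # Entry taken from the dictionary proposal in this comment.
    "voc_2012_train": {
        "data_dir": "voc/VOC2012",
        "img_dir": "voc/VOC2012/JPEGImages",
        "ann_file": "voc/VOC2012/Annotations/pascal_train2012.json",
        "split": "train",
    },
}

# Derive a "<name>_cocostyle" alias for every non-COCO dataset.
coco_style_datasets = {
    key + "_cocostyle": value
    for key, value in DATASETS.items()
    if "coco" not in key
}
DATASETS = {**DATASETS, **coco_style_datasets}

names = sorted(DATASETS)
# names == ["coco_2014_train", "voc_2012_train", "voc_2012_train_cocostyle"]
```

Note that each alias points at the same dictionary object as the original entry, so editing one entry in place would change both.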

What do you think about these updates?
By the way, I will update maskrcnn_benchmark/data/Readme.md once we reach an agreement.

Contributor

@fmassa fmassa left a comment

This is looking great, thanks!

I have a few more comments, let me know what you think, I'm open to suggestions!

maskrcnn_benchmark/config/paths_catalog.py Outdated
maskrcnn_benchmark/config/paths_catalog.py
maskrcnn_benchmark/config/paths_catalog.py
maskrcnn_benchmark/config/paths_catalog.py Outdated
@henrywang1
Contributor Author

henrywang1 commented Dec 5, 2018

Thanks for the review and all the comments.
I've updated the code. The naming format and logic have changed to:

  • voc_xxx and voc_xxx_cocostyle
  • cityscapes_xxx_cocostyle
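
One way a catalog could act on this naming scheme is to pick the loader from the dataset name. The sketch below is only an illustration under assumptions: the `get` function, the `DATA_DIR` root, and the factory labels `"COCODataset"` and `"PascalVOCDataset"` are hypothetical names chosen for this example, and the merged code may differ. A `cityscapes_xxx_cocostyle` entry would take the same `_cocostyle` branch.

```python
import os

DATA_DIR = "datasets"  # assumed root directory for all datasets

DATASETS = {
    "voc_2012_train": {"data_dir": "voc/VOC2012", "split": "train"},
    "voc_2012_train_cocostyle": {
        "img_dir": "voc/VOC2012/JPEGImages",
        "ann_file": "voc/VOC2012/Annotations/pascal_train2012.json",
    },
}


def get(name):
    """Return a loader description for a dataset name (hypothetical helper)."""
    attrs = DATASETS[name]  # KeyError for unknown names
    if name.endswith("_cocostyle"):
        # COCO-style entries need an image directory and a JSON annotation file.
        return {
            "factory": "COCODataset",
            "args": {
                "root": os.path.join(DATA_DIR, attrs["img_dir"]),
                "ann_file": os.path.join(DATA_DIR, attrs["ann_file"]),
            },
        }
    if name.startswith("voc_"):
        # Native VOC entries need the dataset directory and the split name.
        return {
            "factory": "PascalVOCDataset",
            "args": {
                "data_dir": os.path.join(DATA_DIR, attrs["data_dir"]),
                "split": attrs["split"],
            },
        }
    raise KeyError("no loader rule for dataset: " + name)


native = get("voc_2012_train")
coco_style = get("voc_2012_train_cocostyle")
```

With this approach the choice of loader lives in the dataset name, so no extra config flag is needed.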

Contributor

@fmassa fmassa left a comment

Thanks a lot!

@fmassa fmassa merged commit 0f61b00 into facebookresearch:master Dec 5, 2018
@henrywang1 henrywang1 deleted the more-datasets branch December 5, 2018 18:32
Ricardozzf added a commit to Ricardozzf/maskrcnn-benchmark that referenced this pull request Dec 11, 2018
@fmassa fmassa mentioned this pull request Dec 12, 2018
nprasad2021 pushed a commit to nprasad2021/maskrcnn-benchmark that referenced this pull request Jan 29, 2019
* add force json option

* fix the same issue as facebookresearch#185

* bug fix

* cityscapes config

* update paths catalog

* discard config change

* organize code for more-datasets

* use better representation for coco-style datasets

* rename coco-style config

* remove import

* chmod 644

* make the config more verbose

* update readme

* rename

* chmod