This is the source code accompanying the paper SIREN: Shaping Representations for Detecting Out-of-Distribution Objects by Xuefeng Du, Gabriel Gozum, Yifei Ming, and Yixuan Li.
The codebase is heavily based on DETReg.
Check out our ICLR'22 work VOS and CVPR'22 work STUD on OOD detection for Faster R-CNN models if you are interested!
pip install -r requirements.txt
In addition, install detectron2 by following the instructions here.
PASCAL VOC
Download Pascal VOC dataset (2012trainval, 2007trainval, and 2007test):
mkdir VOC_DATASET_ROOT
cd VOC_DATASET_ROOT
wget http://host.robots.ox.ac.uk:8080/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
tar -xvf VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_06-Nov-2007.tar
tar -xvf VOCtest_06-Nov-2007.tar
The VOC dataset folder should have the following structure:
└── VOC_DATASET_ROOT
    └── VOCdevkit/
        ├── VOC2007
        └── VOC2012
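After extracting the tarballs, a quick sanity check can confirm the layout matches the tree above. The helper below is only an illustration, not part of the codebase:

```shell
# Illustrative check that VOC_DATASET_ROOT matches the expected layout.
check_voc_layout() {
  for d in "$1/VOCdevkit/VOC2007" "$1/VOCdevkit/VOC2012"; do
    [ -d "$d" ] || { echo "missing: $d"; return 1; }
  done
  echo "VOC layout ok"
}
check_voc_layout VOC_DATASET_ROOT || true
```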
COCO
Download COCO2017 dataset from the official website.
Download the OOD dataset (json file) used when the in-distribution dataset is Pascal VOC from here.
Download the OOD dataset (json file) used when the in-distribution dataset is BDD-100k from here.
Put the two processed OOD json files into ./annotations.
The COCO dataset folder should have the following structure:
└── COCO_DATASET_ROOT
    ├── annotations
    │   ├── xxx (the original json files)
    │   ├── instances_val2017_ood_wrt_bdd_rm_overlap.json
    │   └── instances_val2017_ood_rm_overlap.json
    ├── train2017
    └── val2017
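As an illustration (not part of the codebase), the two processed OOD json files could be moved into place like this, assuming they were downloaded into the current directory:

```shell
# Illustrative helper: move the two OOD json files into the COCO
# annotations folder shown in the tree above.
place_ood_jsons() {
  mkdir -p "$1/annotations"
  for f in instances_val2017_ood_wrt_bdd_rm_overlap.json \
           instances_val2017_ood_rm_overlap.json; do
    if [ -f "$f" ]; then
      mv "$f" "$1/annotations/"
    fi
  done
}
place_ood_jsons COCO_DATASET_ROOT
```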
BDD-100k
Download the BDD-100k images from the official website.
Download the processed BDD-100k json files from here and here.
The BDD dataset folder should have the following structure:
└── BDD_DATASET_ROOT
    ├── images
    ├── val_bdd_converted.json
    └── train_bdd_converted.json
OpenImages
Download our OpenImages validation splits here. We created a tarball that contains the out-of-distribution data splits used in our paper for hyperparameter tuning. Do not modify or rename the internal folders, as those paths are hard-coded in the dataset reader. The OpenImages dataset is created in a similar way to this paper.
The OpenImages dataset folder should have the following structure:
└── OPENIMAGES_DATASET_ROOT
    ├── coco_classes
    └── ood_classes_rm_overlap
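Since the dataset reader hard-codes these internal folder names, a quick check after unpacking can save a confusing failure later. This helper is only a sketch, not part of the codebase:

```shell
# Illustrative check that the tarball unpacked with its internal
# folder names intact (the dataset reader hard-codes these paths).
check_openimages_layout() {
  for d in coco_classes ood_classes_rm_overlap; do
    [ -d "$1/$d" ] || { echo "missing: $1/$d"; return 1; }
  done
  echo "OpenImages layout ok"
}
check_openimages_layout OPENIMAGES_DATASET_ROOT || true
```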
Visualization of the OOD datasets
The OOD images with respect to different in-distribution datasets can be downloaded from ID-VOC-OOD-COCO, ID-VOC-OOD-openimages, ID-BDD-OOD-COCO, ID-BDD-OOD-openimages.
First, enter the Deformable DETR folder by running
cd detr
Before training: 1) set "EXP_DIR" in the shell scripts inside ./configs/ to the directory where checkpoints should be saved; 2) update the paths to the training and OOD datasets in main.py.
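Step 1) can also be scripted. The helper below is an illustrative sketch (not part of the repo), and the path you pass in is a placeholder for your own checkpoint directory:

```shell
# Illustrative sketch: rewrite EXP_DIR in every config script under
# ./configs/ to point at a directory of your choosing.
set_exp_dir() {
  grep -rl 'EXP_DIR=' ./configs/ 2>/dev/null | while read -r f; do
    sed -i "s|^EXP_DIR=.*|EXP_DIR=$1|" "$f"
  done
}
# e.g. set_exp_dir /path/to/your/snapshots   (run from the detr/ folder)
```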
Then compile the custom CUDA operators:
cd models/ops && python setup.py build install && cd ../../
Vanilla Faster R-CNN with VOC as the in-distribution dataset
GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/voc/vanilla.sh voc_id
Vanilla Faster R-CNN with BDD as the in-distribution dataset
GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/bdd/vanilla.sh bdd_id
SIREN on VOC
GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/voc/siren.sh voc_id
Evaluation with VOC as the in-distribution dataset
First, run on the in-distribution dataset:
./configs/voc/<config file>.sh voc_id --resume snapshots/voc/<config file>/checkpoint.pth --eval
Then run on the in-distribution training dataset (not required for the vMF score):
./configs/voc/<config file>.sh voc_id --resume snapshots/voc/<config file>/checkpoint.pth --eval --maha_train
Finally, run on the COCO OOD dataset:
./configs/voc/<config file>.sh coco_ood --resume snapshots/voc/<config file>/checkpoint.pth --eval
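For convenience, the three steps above can be collected into one small helper. This is only a sketch; the example uses the siren config from the training commands (substitute vanilla to evaluate the vanilla model):

```shell
# Illustrative wrapper around the three evaluation steps above
# (run from the detr/ folder once a checkpoint exists).
eval_voc_coco() {
  cfg="./configs/voc/$1.sh"
  ckpt="snapshots/voc/$1/checkpoint.pth"
  "$cfg" voc_id   --resume "$ckpt" --eval                # in-distribution dataset
  "$cfg" voc_id   --resume "$ckpt" --eval --maha_train   # ID train set (skip for vMF)
  "$cfg" coco_ood --resume "$ckpt" --eval                # COCO OOD dataset
}
# e.g. eval_voc_coco siren
```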
Obtain the metrics for the vMF and KNN scores using:
python voc_coco_vmf.py --name xxx --pro_length xx --use_trained_params 1
"name" means the vanilla or siren
"pro_length" means the dimension of projected space. We use 16 for VOC and 64 for BDD.
"use_trained_params" denotes whether we use the learned vMF distributions for OOD detection.
Pretrained models
The pretrained models for Pascal-VOC can be downloaded from vanilla and SIREN.
The pretrained models for BDD-100k can be downloaded from vanilla and SIREN.
Follow VOS for installation and preparation.
SIREN on VOC
python train_net_gmm.py \
    --dataset-dir path/to/dataset/dir \
    --num-gpus 8 \
    --config-file VOC-Detection/faster-rcnn/center64_0.1.yaml \
    --random-seed 0 \
    --resume
Evaluation with VOC as the in-distribution dataset
First, run on the in-distribution dataset:
python apply_net.py \
    --dataset-dir path/to/dataset/dir \
    --test-dataset voc_custom_val \
    --config-file VOC-Detection/faster-rcnn/center64_0.1.yaml \
    --inference-config Inference/standard_nms.yaml \
    --random-seed 0 \
    --image-corruption-level 0 \
    --visualize 0
Then run on the in-distribution training dataset (not required for the vMF score):
python apply_net.py \
    --dataset-dir path/to/dataset/dir \
    --test-dataset voc_custom_train \
    --config-file VOC-Detection/faster-rcnn/center64_0.1.yaml \
    --inference-config Inference/standard_nms.yaml \
    --random-seed 0 \
    --image-corruption-level 0 \
    --visualize 0
Finally, run on the OOD dataset:
python apply_net.py \
    --dataset-dir path/to/dataset/dir \
    --test-dataset coco_ood_val \
    --config-file VOC-Detection/faster-rcnn/center64_0.1.yaml \
    --inference-config Inference/standard_nms.yaml \
    --random-seed 0 \
    --image-corruption-level 0 \
    --visualize 0
Obtain the metrics with both the vMF and KNN scores using:
python voc_coco_vmf.py --name center64_0.1 --thres xxx
python voc_coco_knn.py --name center64_0.1 --thres xxx
Here the threshold is determined according to ProbDet. It is displayed on the screen when you finish evaluating on the in-distribution dataset.
Pretrained models
The pretrained models for Pascal-VOC can be downloaded from SIREN-ResNet, with the projected dimension set to 64.
If you find any part of this code useful in your research, please consider citing our paper:
@inproceedings{du2022siren,
  title={SIREN: Shaping Representations for Detecting Out-of-Distribution Objects},
  author={Du, Xuefeng and Gozum, Gabriel and Ming, Yifei and Li, Yixuan},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}