Official code for NeurIPS'21 paper "Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach".
TLDR: A panoptic segmentation pipeline consisting of a CNN followed by a combinatorial optimization problem (asymmetric multiway cut, AMWC). The whole pipeline is trained with a panoptic quality surrogate loss by backpropagating through AMWC.
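To give intuition for the backpropagation step, below is a minimal sketch of one common pattern for differentiating through a black-box combinatorial solver (re-solving on gradient-perturbed costs, in the spirit of blackbox differentiation). It is not this repository's implementation (that lives in affinityNet/panoptic_affinity/losses.py); solve_amwc, BlackboxAMWC, and LAM are hypothetical placeholders.

```python
import torch

def solve_amwc(costs):
    # Hypothetical stand-in for the real AMWC solver: a simple sign
    # threshold plays the role of the hard combinatorial solve.
    return (costs > 0).float()

class BlackboxAMWC(torch.autograd.Function):
    """Differentiates through a black-box solver by re-solving on
    gradient-perturbed costs (piecewise-linear interpolation trick)."""

    LAM = 20.0  # interpolation strength (hypothetical value)

    @staticmethod
    def forward(ctx, costs):
        solution = solve_amwc(costs)  # hard, non-differentiable solve
        ctx.save_for_backward(costs, solution)
        return solution

    @staticmethod
    def backward(ctx, grad_output):
        costs, solution = ctx.saved_tensors
        # Re-solve on costs perturbed in the direction of the incoming
        # gradient; the difference of the two solutions serves as a
        # surrogate gradient w.r.t. the costs.
        perturbed_solution = solve_amwc(costs + BlackboxAMWC.LAM * grad_output)
        return -(solution - perturbed_solution) / BlackboxAMWC.LAM

costs = torch.randn(8, requires_grad=True)
y = BlackboxAMWC.apply(costs)
y.sum().backward()  # populates costs.grad via the perturbed re-solve
print(costs.grad)
```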
Our codebase is built upon the detectron2 framework with some minor modifications of ours, such as printing per-image panoptic quality (a sketch of the metric follows below). For more information, please consult the detectron2 documentation, as the codebase is designed according to its guidelines.
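For reference, per-image panoptic quality follows the standard definition PQ = (sum of IoUs over matched segment pairs) / (TP + FP/2 + FN/2), where a match requires IoU > 0.5. A minimal sketch of the metric, not the repository's implementation (the function name and signature are illustrative):

```python
def panoptic_quality(matched_ious, num_pred, num_gt):
    """Standard panoptic quality (PQ) for one image.

    matched_ious: IoU of each matched (prediction, ground-truth) segment pair;
    pairs with IoU > 0.5 count as true positives (the >0.5 threshold makes
    the matching unique). num_pred / num_gt are the total segment counts.
    """
    tp_ious = [iou for iou in matched_ious if iou > 0.5]
    tp = len(tp_ious)
    fp = num_pred - tp   # unmatched predicted segments
    fn = num_gt - tp     # unmatched ground-truth segments
    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(tp_ious) / denom if denom > 0 else 0.0
```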
The code is developed and tested on CUDA 10.1. We use the conda package manager to manage the dependencies. The steps to set up the environment are as follows:
conda create -n cops python=3.7
conda activate cops
These commands create and activate the environment. Afterwards, check the contents of install.sh to install the dependencies (make sure to fetch all submodules recursively, e.g., via git submodule update --init --recursive).
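After installation, a quick sanity check (assuming install.sh set up PyTorch) that the environment sees the expected CUDA toolkit:

```python
import torch

# The code is developed and tested against CUDA 10.1; verify that the
# installed PyTorch build matches and that a GPU is visible.
print("torch version:", torch.__version__)
print("built with CUDA:", torch.version.cuda)       # expect "10.1"
print("CUDA available:", torch.cuda.is_available())
```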
Please follow the guidelines of detectron2 from detectron2_datasets to set up the datasets.
- See affinityNet/panoptic_affinity/config.py for all configuration parameters related to the backbone, decoders, dataset, etc. These parameters are instantiated in the config files in the affinityNet/configs folder (a sketch of the underlying detectron2 pattern follows this list).
- The whole pipeline is defined in affinityNet/panoptic_affinity/panoptic_seg_affinity.py.
- The panoptic quality surrogate loss, the gradients of AMWC, and the instance segmentation confidence scores are computed in affinityNet/panoptic_affinity/losses.py.
- In case of confusion w.r.t. the overall code structure, data generation, etc., consulting the detectron2 documentation should help.
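As an illustration of the detectron2 config pattern used in config.py, custom parameters are typically registered on the global config node roughly as follows (the key names below are hypothetical, not the repository's actual ones):

```python
from detectron2.config import CfgNode as CN, get_cfg

def add_panoptic_affinity_config(cfg):
    # Hypothetical example keys; the real ones are defined in
    # affinityNet/panoptic_affinity/config.py.
    cfg.MODEL.PANOPTIC_AFFINITY = CN()
    cfg.MODEL.PANOPTIC_AFFINITY.LOSS_WEIGHT = 1.0
    cfg.MODEL.PANOPTIC_AFFINITY.NUM_AFFINITY_SCALES = 4

cfg = get_cfg()
add_panoptic_affinity_config(cfg)
```

Defaults registered this way are then overridden by the YAML files in affinityNet/configs.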
python train_net.py --config-file configs/Cityscapes-PanopticSegmentation/panoptic_affinity_pretrain.yaml --num-gpus 1 --resume
Assume the output of the pretraining phase is saved to PRETRAINED_DIR and WEIGHTS is the name of the checkpoint file. Then run the following command, replacing the values of these variables:
python train_net.py --config-file configs/Cityscapes-PanopticSegmentation/panoptic_affinity_end_to_end.yaml --base-config-file ${PRETRAINED_DIR}/config.yaml --num-gpus 1 --resume MODEL.WEIGHTS ${PRETRAINED_DIR}/${WEIGHTS}
We use 4 GPUs. Please change this according to your setup:
python train_net.py --config-file configs/COCO-PanopticSegmentation/panoptic_affinity_pretrain.yaml --num-gpus 4 --resume
python train_net.py --config-file configs/COCO-PanopticSegmentation/panoptic_affinity_end_to_end.yaml --base-config-file ${PRETRAINED_DIR}/config.yaml --num-gpus 1 --resume MODEL.WEIGHTS ${PRETRAINED_DIR}/${WEIGHTS}
Assume MODEL_DIR is the folder containing the checkpoint named WEIGHTS. A folder named OUT_FOLDER will be created containing the evaluation results. Setting MODEL.SAVE_RESULT_IMAGES to True will additionally save result images (this can be slow).
python train_net.py --config-file ${MODEL_DIR}/config.yaml --num-gpus 1 --eval-only MODEL.WEIGHTS ${MODEL_DIR}/${WEIGHTS} OUTPUT_DIR ${MODEL_DIR}/${OUT_FOLDER} DATALOADER.EVAL_BATCH_SIZE 1 DATALOADER.NUM_WORKERS 0 MODEL.SAVE_RESULT_IMAGES False
Here DATALOADER.EVAL_BATCH_SIZE controls the batch size during inference. Set it to a value larger than 1 to evaluate faster (assuming enough computational resources are available).
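The trailing KEY VALUE pairs in these commands are standard detectron2 config overrides, merged into the config after the YAML file. A schematic sketch of the mechanism (DATALOADER.NUM_WORKERS and OUTPUT_DIR are stock detectron2 keys; the repo-specific keys such as DATALOADER.EVAL_BATCH_SIZE work the same way once registered in config.py):

```python
from detectron2.config import get_cfg

# merge_from_list applies command-line style KEY VALUE overrides;
# string values are converted to the type of the existing default.
cfg = get_cfg()
cfg.merge_from_list(["DATALOADER.NUM_WORKERS", "0", "OUTPUT_DIR", "./output"])
print(cfg.DATALOADER.NUM_WORKERS, cfg.OUTPUT_DIR)
```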
Pretrained models after full training and their results:
Dataset | Backbone | PQ | PQ_st (stuff) | PQ_th (things) | Per-image inference time (s) | Checkpoint file |
---|---|---|---|---|---|---|
Cityscapes | ResNet50 | 62.1 | 67.2 | 55.1 | 1.8 | one_drive_link |
COCO | ResNet50 | 38.4 | 35.2 | 40.5 | 0.4 | one_drive_link |
See affinityNet/run_demo.sh to run the demo.