[Project page | Paper]
Yash Bhalgat, Iro Laina, João F. Henriques, Andrew Zisserman, Andrea Vedaldi
TL;DR: Our paper presents a novel "slow-fast" contrastive fusion method to lift 2D predictions to 3D for scalable instance segmentation, achieving significant improvements without requiring an upper bound on the number of objects in the scene.
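As a rough illustration of the slow-fast idea, here is a minimal, self-contained sketch (the function names, loss form, and hyperparameters below are assumptions for exposition, not the exact implementation in this codebase): a "fast" embedding network is trained with a contrastive objective against targets from a "slow", exponential-moving-average copy of itself, using 2D instance IDs (consistent only within a frame) as the grouping signal.

# Illustrative sketch only; not the loss used verbatim in this repository.
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(slow_net, fast_net, momentum=0.99):
    # The slow parameters track the fast ones via an exponential moving average.
    for p_slow, p_fast in zip(slow_net.parameters(), fast_net.parameters()):
        p_slow.mul_(momentum).add_(p_fast, alpha=1.0 - momentum)

def slow_fast_contrastive_loss(fast_emb, slow_emb, instance_ids, temperature=0.1):
    # fast_emb, slow_emb: (N, D) embeddings for N rays sampled from one frame.
    # instance_ids: (N,) 2D instance labels predicted for those rays.
    fast_emb = F.normalize(fast_emb, dim=-1)
    slow_emb = F.normalize(slow_emb, dim=-1).detach()           # no gradient through the slow branch
    logits = fast_emb @ slow_emb.t() / temperature              # (N, N) pairwise similarities
    positives = instance_ids[:, None] == instance_ids[None, :]  # rays sharing an instance ID
    log_prob = logits.log_softmax(dim=-1)
    # InfoNCE-style: average log-probability over each ray's positives.
    pos_log_prob = (log_prob * positives).sum(-1) / positives.sum(-1).clamp(min=1)
    return -pos_log_prob.mean()

Because the loss only compares embeddings, no fixed upper bound on the number of object instances is needed; at inference time, instances are recovered by clustering the learned embeddings.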
You can download the Messy Rooms dataset from here. For all other datasets, refer to the instructions provided in the Panoptic-Lifting repository.
NOTE: In this codebase, the term "MOS" stands for "Many Object Scenes", which was the original name of the "Messy Rooms" dataset as referenced in the paper.
You can download the pretrained checkpoints from here.
Place the downloaded checkpoints in the pretrained_checkpoints folder. Then, run the following command to render outputs from a pretrained model:
python3 inference/render_panopli.py --ckpt_path pretrained_checkpoints/<SCENE NAME>/checkpoints/<CKPT NAME>.ckpt --cached_centroids_path pretrained_checkpoints/<SCENE NAME>/checkpoints/all_centroids.pkl
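For example, with a hypothetical scene folder named my_scene and a checkpoint file named last.ckpt (both are placeholders; substitute the actual names from the downloaded checkpoints):

python3 inference/render_panopli.py --ckpt_path pretrained_checkpoints/my_scene/checkpoints/last.ckpt --cached_centroids_path pretrained_checkpoints/my_scene/checkpoints/all_centroids.pkl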
This will render the outputs to the runs/<experiment> folder. To calculate the metrics, run the following command:
python inference/evaluate.py --root_path ./data/<SCENE DATA PATH> --exp_path runs/<experiment>
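Continuing the hypothetical example above (the scene data path and experiment name are placeholders; use the experiment folder created under runs/ by the rendering step):

python inference/evaluate.py --root_path ./data/my_scene --exp_path runs/my_scene_experiment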
If you find this work useful in your research, please cite our paper:
@inproceedings{bhalgat2023contrastive,
  title={Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion},
  author={Bhalgat, Yash and Laina, Iro and Henriques, Jo{\~a}o F and Zisserman, Andrew and Vedaldi, Andrea},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=bbbbbov4Xu}
}
This code is based on the Panoptic-Lifting and TensoRF codebases. We thank the authors for releasing their code.