This is the official repository for the paper "Vision Relation Transformer for Unbiased Scene Graph Generation".
Check INSTALL.md for installation instructions.
Check DATASET.md for dataset preprocessing instructions.
For the VG dataset, we use the pretrained object detector provided by Scene-Graph-Benchmark; you can download it from this link. For the GQA dataset, we use the pretrained object detector provided by SHA-GCL-for-SGG, which can be downloaded from this link. Set the MODEL.PRETRAINED_DETECTOR_CKPT parameter in configs/VETO_final.yaml to the path of the corresponding pretrained R-CNN weights so that the detector weights are loaded correctly.
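A wrong checkpoint path is easy to miss until training starts, so it can help to verify it first. The following is a minimal sketch, not part of this repository: it assumes the YAML file has already been loaded into a nested dictionary (for example with a YAML parser) and that the dotted name MODEL.PRETRAINED_DETECTOR_CKPT corresponds to nested keys. The helper names `get_dotted` and `check_detector_ckpt` are hypothetical.

```python
from pathlib import Path


def get_dotted(cfg: dict, dotted_key: str):
    """Look up a dotted key such as 'MODEL.PRETRAINED_DETECTOR_CKPT' in a nested dict."""
    node = cfg
    for part in dotted_key.split("."):
        node = node[part]  # raises KeyError if the key is missing
    return node


def check_detector_ckpt(cfg: dict) -> Path:
    """Return the detector checkpoint path, or raise if the file does not exist.

    Assumption: cfg mirrors the nesting of configs/VETO_final.yaml.
    """
    path = Path(get_dotted(cfg, "MODEL.PRETRAINED_DETECTOR_CKPT"))
    if not path.is_file():
        raise FileNotFoundError(f"Detector checkpoint not found: {path}")
    return path
```

Calling `check_detector_ckpt(cfg)` before launching training fails fast with a clear message instead of an error deep inside model loading.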
Follow the instructions below to train your own models. Each SGG model is trained on 1 GPU. The results should be very close to those reported in the paper.
The following script trains vanilla VETO for PredCls. For SGCls, set MODEL.ROI_RELATION_HEAD.USE_GT_BOX True and MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL False; for SGDet, set MODEL.ROI_RELATION_HEAD.USE_GT_BOX False and MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL False.
python ./tools/relation_train_net.py --config-file \
"configs/VETO_final.yaml" \
MODEL.ROI_RELATION_HEAD.PREDICTOR VETOPredictor \
GLOBAL_SETTING.DATASET_CHOICE 'VG' MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True \
GLOBAL_SETTING.BETA_LOSS False \
SOLVER.IMS_PER_BATCH 12 TEST.IMS_PER_BATCH 1 \
SOLVER.MAX_ITER 125000 SOLVER.VAL_PERIOD 5000 \
SOLVER.CHECKPOINT_PERIOD 5000 DEBUG False \
SOLVER.PRE_VAL False ENSEMBLE_LEARNING.ENABLED False \
EXPERIMENT_NAME "VG_VETO_vanilla"
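The three evaluation tasks differ only in the two ground-truth flags, so the mapping can be written down once. The sketch below is illustrative and not part of the repository; the names `TASK_FLAGS` and `task_overrides` are hypothetical, and the flag values are taken from the PredCls command and the SGCls/SGDet settings stated in this README.

```python
from typing import List

# (USE_GT_BOX, USE_GT_OBJECT_LABEL) for each task, as listed in this README.
TASK_FLAGS = {
    "PredCls": (True, True),
    "SGCls": (True, False),
    "SGDet": (False, False),
}


def task_overrides(task: str) -> List[str]:
    """Return the command-line overrides that select the given task."""
    use_gt_box, use_gt_label = TASK_FLAGS[task]
    return [
        "MODEL.ROI_RELATION_HEAD.USE_GT_BOX", str(use_gt_box),
        "MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL", str(use_gt_label),
    ]


# Print the overrides to paste into the training command for SGCls.
print(" ".join(task_overrides("SGCls")))
```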
The following script trains VETO + Rwt for PredCls:
python ./tools/relation_train_net.py --config-file \
"configs/VETO_final.yaml" \
MODEL.ROI_RELATION_HEAD.PREDICTOR VETOPredictor \
GLOBAL_SETTING.DATASET_CHOICE 'VG' MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True \
GLOBAL_SETTING.BETA_LOSS True \
SOLVER.IMS_PER_BATCH 12 TEST.IMS_PER_BATCH 1 \
SOLVER.MAX_ITER 125000 SOLVER.VAL_PERIOD 5000 \
SOLVER.CHECKPOINT_PERIOD 5000 DEBUG False \
SOLVER.PRE_VAL False ENSEMBLE_LEARNING.ENABLED False \
EXPERIMENT_NAME "VG_VETO_beta"
The following script trains VETO + MEET for PredCls:
python ./tools/relation_train_net.py --config-file \
"configs/VETO_final.yaml" \
MODEL.ROI_RELATION_HEAD.PREDICTOR VETOPredictor_MEET \
GLOBAL_SETTING.DATASET_CHOICE 'VG' MODEL.ROI_RELATION_HEAD.USE_GT_BOX True \
MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL True \
GLOBAL_SETTING.BETA_LOSS True \
SOLVER.IMS_PER_BATCH 12 TEST.IMS_PER_BATCH 1 \
SOLVER.MAX_ITER 125000 SOLVER.VAL_PERIOD 5000 \
SOLVER.CHECKPOINT_PERIOD 5000 DEBUG False \
SOLVER.PRE_VAL False ENSEMBLE_LEARNING.ENABLED True \
EXPERIMENT_NAME "VG_VETO_MEET"
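The three PredCls training commands share most options and differ only in the predictor, the BETA_LOSS and ENSEMBLE_LEARNING.ENABLED flags, and the experiment name. As a rough sketch of that structure (the helper `build_command` and the dictionaries are hypothetical and not part of the repository), the argument list for each variant could be assembled as follows:

```python
from typing import List

# Options common to all three PredCls commands in this README.
SHARED = {
    "GLOBAL_SETTING.DATASET_CHOICE": "VG",
    "MODEL.ROI_RELATION_HEAD.USE_GT_BOX": True,
    "MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL": True,
    "SOLVER.IMS_PER_BATCH": 12,
    "TEST.IMS_PER_BATCH": 1,
    "SOLVER.MAX_ITER": 125000,
    "SOLVER.VAL_PERIOD": 5000,
    "SOLVER.CHECKPOINT_PERIOD": 5000,
    "DEBUG": False,
    "SOLVER.PRE_VAL": False,
}

# Options that differ between the variants (variant keys are illustrative).
VARIANTS = {
    "vanilla": {
        "MODEL.ROI_RELATION_HEAD.PREDICTOR": "VETOPredictor",
        "GLOBAL_SETTING.BETA_LOSS": False,
        "ENSEMBLE_LEARNING.ENABLED": False,
        "EXPERIMENT_NAME": "VG_VETO_vanilla",
    },
    "rwt": {
        "MODEL.ROI_RELATION_HEAD.PREDICTOR": "VETOPredictor",
        "GLOBAL_SETTING.BETA_LOSS": True,
        "ENSEMBLE_LEARNING.ENABLED": False,
        "EXPERIMENT_NAME": "VG_VETO_beta",
    },
    "meet": {
        "MODEL.ROI_RELATION_HEAD.PREDICTOR": "VETOPredictor_MEET",
        "GLOBAL_SETTING.BETA_LOSS": True,
        "ENSEMBLE_LEARNING.ENABLED": True,
        "EXPERIMENT_NAME": "VG_VETO_MEET",
    },
}


def build_command(variant: str) -> List[str]:
    """Build the training command as an argument list for the given variant."""
    options = {**SHARED, **VARIANTS[variant]}
    cmd = ["python", "./tools/relation_train_net.py",
           "--config-file", "configs/VETO_final.yaml"]
    for key, value in options.items():
        cmd += [key, str(value)]
    return cmd


# Print the VETO + MEET command as a single line.
print(" ".join(build_command("meet")))
```

The returned list can be passed to `subprocess.run` from the repository root, which avoids shell quoting issues.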
To evaluate a trained model directly on the validation or test set, set MODEL.WEIGHT to the path of the trained model weights and set DATASETS.TEST to the name of the selected dataset.
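As a small illustration of these two overrides, the sketch below builds the extra arguments to append to a command. It is an assumption-laden example rather than the repository's own tooling: the helper `eval_overrides` is hypothetical, the evaluation entry point is not named here, and the weight path and dataset value in the usage lines are placeholders. The DATASETS.TEST value is passed through verbatim because its exact format depends on the config.

```python
from typing import List


def eval_overrides(weight_path: str, dataset_test_value: str) -> List[str]:
    """Return the overrides that point evaluation at trained weights and a dataset.

    dataset_test_value is passed through unchanged; use the format the config expects.
    """
    if not weight_path:
        raise ValueError("weight_path must not be empty")
    return [
        "MODEL.WEIGHT", weight_path,
        "DATASETS.TEST", dataset_test_value,
    ]


# Placeholder values for illustration only.
overrides = eval_overrides("path/to/trained_model.pth", "<dataset name>")
print(" ".join(overrides))
```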
If you find our work useful in your research, please consider citing:
@InProceedings{Sudhakaran_2023_ICCV,
author = {Sudhakaran, Gopika and Dhami, Devendra Singh and Kersting, Kristian and Roth, Stefan},
title = {Vision Relation Transformer for Unbiased Scene Graph Generation},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2023},
pages = {21882-21893}
}
This repository is developed on top of the following code bases:
- Scene graph benchmarking framework developed by KaihuaTang
- A Toolkit for Scene Graph Benchmark in Pytorch by Rongjie Li
- Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation in Pytorch by Xingning Dong