This repository contains the code for our paper, Exploiting Scene Graphs for Human-Object Interaction Detection, accepted at ICCV 2021.
$ conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
$ pip install tqdm scikit-learn pandas Pillow
Follow INSTALL.md to install maskrcnn, then add the maskrcnn lib to your $PYTHONPATH; our code uses its ROIAlign layer to extract the RoI features.
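As a minimal sketch of the $PYTHONPATH step, the snippet below prepends the maskrcnn checkout to the search path for the current shell session. The variable name MASKRCNN_ROOT and the location $HOME/maskrcnn-benchmark are assumptions for illustration only; substitute the directory where you actually cloned and built the library.

```shell
# Hypothetical location of the maskrcnn checkout; adjust to your own path.
MASKRCNN_ROOT="$HOME/maskrcnn-benchmark"

# Prepend it to PYTHONPATH, keeping any existing entries.
# The ${VAR:+...} form avoids a trailing ':' when PYTHONPATH was empty.
export PYTHONPATH="$MASKRCNN_ROOT${PYTHONPATH:+:$PYTHONPATH}"

# Show the resulting search path so the change can be checked by eye.
echo "PYTHONPATH=$PYTHONPATH"
```

To make the setting persistent, the same two assignment lines can be placed in your shell startup file (for example ~/.bashrc).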
If you want to train the model on multiple GPUs, follow the instructions to install apex.
We use the off-the-shelf object detection results for V-COCO and HICO from VSGnet, which can be downloaded from here.
The scene graph prediction results are generated by TDE. Note that we use all the training and testing images of Visual Genome to train the SG model. Our pre-trained TDE model can be downloaded from here.
$ python main.py --gpu_id 0 --learning_rate 0.01 --batch_size 5 --num_epochs 50
If you find this project helpful for your research, please consider citing our paper in your publications.
@InProceedings{he2021exploiting,
author = {He, Tao and Gao, Lianli and Song, Jingkuan and Li, Yuan-Fang},
title = {Exploiting Scene Graphs for Human-Object Interaction Detection},
booktitle = {International Conference on Computer Vision (ICCV)},
year = {2021},
url = {https://arxiv.org/pdf/2108.08584}
}
This repository is developed on top of two other projects: TDE by KaihuaTang and VSGnet by ASMIftekhar.