Visual Translation Embedding Network for Visual Relation Detection, CVPR 2017, TensorFlow
Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features, ECCV, TensorFlow
- Install ipython. If you do not have ipython, you can install it with the following command (strongly recommended):
pip install ipython
- Install TensorFlow v1.3.0 or a newer version.
pip install tensorflow-gpu==1.3.0
- Download this repository or clone it with Git
git clone
- Install easydict
pip install easydict
a). Download the dataset from, and the file is named ''.
b). Use the following command to unzip the downloaded data:
unzip -d sg_dataset
c). In the path where you put the vtranse folder, use the following commands to make a new folder 'dataset/VRD':
mkdir -p ~/dataset/VRD/json_dataset
mkdir -p ~/dataset/VRD/sg_dataset
d). Move the files in sg_dataset into the newly created dataset folder, using the following commands:
mv sg_dataset/annotations_test.json dataset/VRD/json_dataset
mv sg_dataset/annotations_train.json dataset/VRD/json_dataset
mv sg_dataset/sg_test_images dataset/VRD/sg_dataset
mv sg_dataset/sg_train_images dataset/VRD/sg_dataset
e). Change the root path in the file 'vtranse/model/': open this file, find the term '__C.DIR', which is set to '/home/yangxu/rd', and change it to the path where you put this vtranse folder.
f). Pre-process the VRD dataset into 'vrd_roidb.npz', which is used to train the network. First open ipython.
Then use the following command to pre-process the data in the vrd folder:
run process/
After running this file, you will find a 'vrd_roidb.npz' file in the folder 'vtranse/input'.
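As a quick sanity check on steps c) to f), the sketch below verifies that the expected files and folders exist and lists the arrays stored in the pre-processed archive. It relies only on the fact that an .npz file is a zip archive of .npy members. The helper names `missing_paths` and `npz_array_names` are hypothetical and not part of this repository, and the dataset root under the home directory is an assumption that you should adapt to your own setup.

```python
import zipfile
from pathlib import Path

# Relative paths taken from the mkdir/mv commands in steps c) and d).
EXPECTED_VRD = [
    "json_dataset/annotations_train.json",
    "json_dataset/annotations_test.json",
    "sg_dataset/sg_train_images",
    "sg_dataset/sg_test_images",
]


def missing_paths(vrd_root, expected=EXPECTED_VRD):
    """Return the expected entries that do not exist under vrd_root."""
    root = Path(vrd_root)
    return [rel for rel in expected if not (root / rel).exists()]


def npz_array_names(npz_path):
    """List the array names stored in an .npz file without loading them."""
    with zipfile.ZipFile(npz_path) as archive:
        return [name[:-4] for name in archive.namelist() if name.endswith(".npy")]


if __name__ == "__main__":
    vrd_root = Path.home() / "dataset" / "VRD"  # assumed location
    print("missing:", missing_paths(vrd_root))
    roidb = Path("vtranse/input/vrd_roidb.npz")  # path from step f)
    if roidb.exists():
        print("arrays:", npz_array_names(roidb))
```

An empty "missing" list means the layout matches the commands above; the array names printed for 'vrd_roidb.npz' depend on what the pre-processing script writes.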
a). Download the pre-trained Faster R-CNN model for the VRD dataset from, and the file names are '', 'vrd_vgg_pretrained.ckpt.index', 'vrd_vgg_pretrained.ckpt.meta' and 'vrd_vgg_pretrained.ckpt.pkl'. After downloading them, use the following commands to move them into the 'vtranse/pretrained_para' folder:
mv vtranse/pretrained_para
mv vrd_vgg_pretrained.ckpt.index vtranse/pretrained_para
mv vrd_vgg_pretrained.ckpt.meta vtranse/pretrained_para
mv vrd_vgg_pretrained.ckpt.pkl vtranse/pretrained_para
b). Create a folder to save the trained results:
mkdir -p ~/vtranse/pred_para/vrd_vgg
c). After downloading the files and moving them to the appropriate folder, use 'vtranse/train_file/' to train the vtranse network on the VRD dataset:
run train_file/
d). During training, you will see output like the following:
t: 100.0, rd_loss: 4.83309404731, acc: 0.0980000074953
t: 200.0, rd_loss: 3.81237616211, acc: 0.263000019006
t: 300.0, rd_loss: 3.51845422685, acc: 0.290333356783
t: 400.0, rd_loss: 3.31810754955, acc: 0.292666691653
t: 500.0, rd_loss: 3.48527273357, acc: 0.277666689083
t: 600.0, rd_loss: 3.06100189149, acc: 0.340666691475
t: 700.0, rd_loss: 3.02625158072, acc: 0.334666692317
t: 800.0, rd_loss: 3.06034492403, acc: 0.330333357863
t: 900.0, rd_loss: 3.16739703059, acc: 0.322666690871
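To follow training progress without reading the log by eye, lines in the format shown above can be parsed with a regular expression. This is a minimal sketch, assuming the log keeps exactly the `t: ..., rd_loss: ..., acc: ...` layout with plain decimal numbers; `parse_log` is a hypothetical helper, not a function from this repository.

```python
import re

# Matches lines such as: "t: 100.0, rd_loss: 4.83309404731, acc: 0.0980000074953"
LOG_LINE = re.compile(r"t:\s*([\d.]+),\s*rd_loss:\s*([\d.]+),\s*acc:\s*([\d.]+)")


def parse_log(text):
    """Return a list of (step, rd_loss, acc) tuples found in the log text."""
    return [
        (float(t), float(loss), float(acc))
        for t, loss, acc in LOG_LINE.findall(text)
    ]


if __name__ == "__main__":
    sample = """\
t: 100.0, rd_loss: 4.83309404731, acc: 0.0980000074953
t: 600.0, rd_loss: 3.06100189149, acc: 0.340666691475
t: 900.0, rd_loss: 3.16739703059, acc: 0.322666690871
"""
    records = parse_log(sample)
    # Pick the record with the highest training accuracy.
    best = max(records, key=lambda r: r[2])
    print("best acc %.3f at t=%d" % (best[2], best[0]))
```

Note that the accuracy printed during training fluctuates from step to step, as in the log above, so the highest logged value is only a rough indicator of which checkpoint to test.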
a). After training vtranse, you will find files like 'vrd_vgg0001.ckpt' in the 'vtranse/pred_para/vrd_vgg' folder. You can then test your trained model.
b). Open the file 'vtranse/test_file/' and set the variable 'model_path' to the name of the trained model you want to test.
c). Create a folder to save the detected relationships using the following command:
mkdir -p ~/vtranse/pred_res
d). After changing the name of your model, use the following command to get the relationship detection results:
run test_file/
e). After testing, you can run the file 'vtranse/test_file/' to evaluate your detection results:
run test_file/
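When several checkpoints have been saved, choosing the value for 'model_path' in step b) can be scripted. The sketch below assumes that checkpoints follow the 'vrd_vgg0001.ckpt' naming shown in step a), that a larger number means a later checkpoint, and that each checkpoint is stored as several files sharing that name as a prefix (as with the '.index' and '.meta' files of the pre-trained model). `latest_checkpoint` is a hypothetical helper, not part of this repository.

```python
import re
from pathlib import Path


def latest_checkpoint(folder, prefix="vrd_vgg"):
    """Return the highest-numbered '<prefix>NNNN.ckpt' name in folder, or None."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.ckpt")
    best_num, best_name = -1, None
    for path in Path(folder).iterdir():
        # A name like 'vrd_vgg0003.ckpt.index' matches on its leading part.
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_num:
            best_num = int(match.group(1))
            best_name = match.group(0)
    return best_name


if __name__ == "__main__":
    folder = Path("vtranse/pred_para/vrd_vgg")  # folder created before training
    if folder.is_dir():
        print("model_path candidate:", latest_checkpoint(folder))
```

The most recent checkpoint is not necessarily the best one, so treat the result as a starting point rather than a recommendation.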
1). Download the VG dataset. This dataset can be downloaded from its official website: After downloading these files, use the following commands to put the images into the folder 'dataset/VG/images/VG_100K':
mkdir -p ~/dataset/VG/images/VG_100K
mv images/VG100K dataset/VG/images/VG_100K
mv images/VG100K dataset/VG/images/VG_100K
2). Download the training/testing split. Since this dataset is very noisy, I use a filtered version which is provided by, and you can download the split from this link. After downloading this file, use the following commands to pre-process the VG dataset:
mkdir -p ~/dataset/VG/imdb
mv vg1_2_meta.h5 dataset/VG/imdb
run process/
3). Training, testing and evaluation. After pre-processing the VG dataset, you can train, test and evaluate your model with a process similar to the one used for the VRD dataset, using the following commands:
run train_file/
run test_file/
run test_file/
author = {Hanwang Zhang and Zawlin Kyaw and Shih-Fu Chang and Tat-Seng Chua},
title = {Visual Translation Embedding Network for Visual Relation Detection},
booktitle = {CVPR},
year = {2017},
| | predicate | phrase | relation |
| published result | 44.76 | 22.42 | 15.20 |
| implemented result | 46.48 | 24.32 | 16.27 |

| | predicate | phrase | relation |
| published result | 62.87 | 10.45 | 6.04 |
| implemented result | 61.70 | 13.62 | 11.62 |
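To make the comparison between published and implemented results explicit, the differences can be computed directly from the two tables. This is a small arithmetic sketch: the numbers are copied from the tables, while the labels `table_1` and `table_2` and the helper `deltas` are hypothetical names introduced here only for illustration.

```python
# Numbers copied from the two result tables, in the order they appear.
RESULTS = {
    "table_1": {
        "published": {"predicate": 44.76, "phrase": 22.42, "relation": 15.20},
        "implemented": {"predicate": 46.48, "phrase": 24.32, "relation": 16.27},
    },
    "table_2": {
        "published": {"predicate": 62.87, "phrase": 10.45, "relation": 6.04},
        "implemented": {"predicate": 61.70, "phrase": 13.62, "relation": 11.62},
    },
}


def deltas(table):
    """Return implemented minus published for each task, rounded to 2 decimals."""
    return {
        task: round(table["implemented"][task] - table["published"][task], 2)
        for task in table["published"]
    }


if __name__ == "__main__":
    for name, table in RESULTS.items():
        print(name, deltas(table))
```

A positive value means the implemented result is higher than the published one; in the second table the predicate score is slightly lower (by 1.17 points), while phrase and relation are higher.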
VRD project:
Visual Genome
VTransE Caffe version:
The Faster R-CNN code which I used to train the detection part of this project:
- If you have any problems with this code, you can email [email protected].