This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019. Paper link
Our code is based on the Mask RCNN implementation in TensorFlow and Keras. First install maskrcnn according to the instruction or INSTALL.md.
The COCO-CapSal dataset provides the saliency ground truth and the image captions for each image. It contains 5265 images for training and 1459 for validation. The annotations can be downloaded at BaiduYun or GoogleDrive. The folder 'capsal' contains the images, the ground truth maps and the captions (json file) of both the training and validation sets.
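After downloading, it can help to check that every image has a matching ground truth map and that the caption file parses. The following is a minimal sketch, not part of the released code. The function names `pair_images_with_masks` and `load_captions` are hypothetical, as is the assumption that an image and its ground truth map share the same file name apart from the extension. The layout inside 'capsal' and the JSON schema are not described here, so the directory and file paths are passed in rather than hard-coded.

```python
import json
from pathlib import Path


def pair_images_with_masks(image_dir, mask_dir):
    """Match each image to the ground truth map that shares its file stem.

    Assumption: image and map differ only in extension (e.g. a.jpg / a.png).
    """
    masks = {p.stem: p for p in Path(mask_dir).iterdir() if p.is_file()}
    pairs = []
    for image in sorted(Path(image_dir).iterdir()):
        if image.is_file() and image.stem in masks:
            pairs.append((image, masks[image.stem]))
    return pairs


def load_captions(json_path):
    """Parse the caption annotations; no particular JSON schema is assumed."""
    with open(json_path, "r", encoding="utf-8") as f:
        return json.load(f)
```

If the assumptions hold, `len(pair_images_with_masks(...))` should come out as 5265 for the training set and 1459 for the validation set. A smaller count would point to a missing or misnamed file.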
To test the CapSal model, first download the trained model at BaiduYun or Google and put it under ./model. Then run test_capsal.py to obtain the saliency maps for the different datasets.
The saliency maps are available at Google or BaiduYun.
Run 'train.py'.
@InProceedings{Zhang_2019_CVPR,
author = {Zhang, Lu and Zhang, Jianming and Lin, Zhe and Lu, Huchuan and He, You},
title = {CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection},
booktitle = {CVPR},
year = {2019}}