If you use this code, please cite our work:
@inproceedings{kantorov2016,
title = {ContextLocNet: Context-aware Deep Network Models for Weakly Supervised Localization},
author = {Kantorov, V. and Oquab, M. and Cho, M. and Laptev, I.},
booktitle = {Proc. European Conference on Computer Vision (ECCV), 2016},
year = {2016}
}
The results are available on the project website and in the paper (arXiv page). Please report bugs and ask questions directly on GitHub; for other inquiries, please contact Vadim Kantorov.
This is joint work by Vadim Kantorov, Maxime Oquab, Minsu Cho, and Ivan Laptev.
- Install the dependencies: Torch with cuDNN support; HDF5; matio; protobuf; the LuaRocks packages rapidjson, hdf5, matio, loadcaffe, and xml; and a MATLAB or Octave binary in PATH (for computing detection mAP).
We strongly recommend using wigwam for this (fix the paths to `nvcc` and `libcudnn.so` before running the command):
wigwam install torch hdf5 matio protobuf octave -DPATH_TO_NVCC="/path/to/cuda/bin/nvcc" -DPATH_TO_CUDNN_SO="/path/to/cudnn/lib64/libcudnn.so"
wigwam install lua-rapidjson lua-hdf5 lua-matio lua-loadcaffe lua-xml
wigwam in # execute this to make the installed libraries available
- Clone this repository, change the current directory to `contextlocnet`, and compile the ROI pooling module:
git clone https://github.com/vadimkantorov/contextlocnet
cd contextlocnet
(cd ./model && luarocks make)
- Download the VOC 2007 dataset, Koen van de Sande's selective search windows for VOC 2007, and the VGG-F model by running the first command. Optionally, to also get VOC 2012 and Ross Girshick's selective search windows, manually download the VOC 2012 test data tarball to `data/common` and then run the second command:
make -f data/common/Makefile download_and_extract_VOC2007 download_VGGF
# make -f data/common/Makefile download_and_extract_VOC2012
- Choose a dataset, preprocess it, and convert the VGG-F model to the Torch format:
export DATASET=VOC2007
th preprocess.lua VOC VGGF
- Select a GPU and train a model (our best model is `model/contrastive_s.lua`; other choices are `model/contrastive_a.lua`, `model/additive.lua`, and `model/wsddn_repro.lua`):
export CUDA_VISIBLE_DEVICES=0
th train.lua model/contrastive_s.lua # will produce data/model_epoch30.h5 and data/log.json
- Test the trained model and compute CorLoc and mAP:
SUBSET=trainval th test.lua data/model_epoch30.h5 # will produce data/scores_trainval.h5
th corloc.lua data/scores_trainval.h5 # will produce data/corloc.json
SUBSET=test th test.lua data/model_epoch30.h5 # will produce data/scores_test.h5
th detection_mAP.lua data/scores_test.h5 # will produce data/detection_mAP.json
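The steps above can be wrapped in a small script that fails early instead of partway through a long run. The following is a minimal sketch, not part of the repository: the helper names `missing_tools`, `run_step`, and `evaluate` are hypothetical, and the only facts taken from this README are the commands themselves and the output files they are said to produce.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not part of the repository.

# Print each of the given executables that cannot be found in PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        # `command -v` succeeds only if the shell can resolve the name.
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Run a command, then confirm that the file it should produce exists.
# Usage: run_step EXPECTED_FILE COMMAND [ARGS...]
run_step() {
    local expected=$1
    shift
    if ! "$@"; then
        echo "Command failed: $*" >&2
        return 1
    fi
    if [ ! -f "$expected" ]; then
        echo "Expected output not found: $expected" >&2
        return 1
    fi
    echo "OK: $expected"
}

# The four evaluation commands from the last step, stopping at the first failure.
# `env` is used so that SUBSET reaches the `th` process.
evaluate() {
    local model=${1:-data/model_epoch30.h5}
    run_step data/scores_trainval.h5 env SUBSET=trainval th test.lua "$model" &&
    run_step data/corloc.json th corloc.lua data/scores_trainval.h5 &&
    run_step data/scores_test.h5 env SUBSET=test th test.lua "$model" &&
    run_step data/detection_mAP.json th detection_mAP.lua data/scores_test.h5
}
```

After sourcing the file, `missing_tools th luarocks octave` lists whichever of those executables is absent, and `evaluate` runs the evaluation from the repository root. Note that `run_step` only checks that the output file exists, so a stale file left by an earlier run would also pass; delete old outputs first if that matters.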
Model | model_epoch30.h5 | log.json | corloc.json | detection_mAP.json |
---|---|---|---|---|
contrastive_s | link | link | link | link |
wsddn_repro | link | link | link | link |
We greatly thank Hakan Bilen, Relja Arandjelović and Soumith Chintala for fruitful discussion and help.
This work would not have been possible without prior work: Hakan Bilen's WSDDN, Spyros Gidaris's LocNet, Sergey Zagoruyko's loadcaffe, Facebook FAIR's fbnn/Optim.lua.
The code is released under the MIT license.