Fast Self-attentive Multimodal Retrieval

Note: The original code for our paper "Fast Self-attentive Multimodal Retrieval" is protected. To provide a public version, we forked the code at https://github.com/fartashf/vsepp/ and adapted it by adding the self-attentive mechanism along with the main proposed methods.

Dependencies

We recommend using Anaconda to install the following packages.

Python 2.7
PyTorch (>0.1.12)
NumPy (>1.12.1)
TensorBoard
Punkt Sentence Tokenizer:

import nltk
nltk.download()
> d punkt

Download data

Download the dataset files and pre-trained models. We use splits produced by Andrej Karpathy. To use full image encoders, download the images from their original sources here, here and here.

wget http://lsa.pucrs.br/jonatas/seam-data/irv2_precomp.tar.gz
wget http://lsa.pucrs.br/jonatas/seam-data/resnet152_precomp.tar.gz
wget http://lsa.pucrs.br/jonatas/seam-data/vocab.tar.gz

We refer to the path of the files extracted from *_precomp.tar.gz as $DATA_PATH and to the path of the files extracted from models.tar.gz (models are coming soon) as $RUN_PATH. Extract vocab.tar.gz to the ./vocab directory.
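The extraction step can also be scripted. The following is a minimal sketch, assuming the archives were downloaded to the current directory. The helper name `extract_archive` and the `data` target directory are illustrative and not part of this repository. If an archive already contains a top-level directory, the target path may need adjusting.

```python
import os
import tarfile


def extract_archive(archive_path, target_dir):
    """Extract a tar archive (gzip-compressed or not) into target_dir.

    Returns the list of member names found in the archive.
    """
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    # "r:*" lets tarfile detect the compression automatically.
    with tarfile.open(archive_path, "r:*") as archive:
        names = archive.getnames()
        archive.extractall(target_dir)
    return names


if __name__ == "__main__":
    # Hypothetical layout: precomputed features under ./data, vocabulary under ./vocab.
    for archive_name, target in [
        ("irv2_precomp.tar.gz", "data"),
        ("resnet152_precomp.tar.gz", "data"),
        ("vocab.tar.gz", "vocab"),
    ]:
        if os.path.exists(archive_name):
            extract_archive(archive_name, target)
```

With this layout, $DATA_PATH would point at the `data` directory.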

Training new models

Run train.py:

python train.py --data_path "$DATA_PATH" --data_name irv2_precomp --logger_name runs/seam-e/irv2_precomp/

Arguments used to train pre-trained models:

Method	Arguments
SEAM-E	`--text_encoder seam-e --att_units 300 --att_hops 30 --att_coef 0.5 --measure order --use_abs`
SEAM-C	`--text_encoder seam-c --att_units 300 --att_hops 10 --att_coef 0.5 --measure order --use_abs`
SEAM-G	`--text_encoder seam-g --att_units 300 --att_hops 30 --att_coef 0.5 --measure order --use_abs`
Order	`--text_encoder gru`
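To avoid retyping the flags from the table above, the full command line can be assembled programmatically. This is a minimal sketch: the flags and the `runs/<method>/<data_name>/` logger path are copied from this README, while the names `METHOD_ARGS` and `build_train_command` are hypothetical helpers that do not exist in the repository.

```python
# Encoder-specific flags, copied verbatim from the table of training arguments.
ATTENTION_FLAGS = ["--att_coef", "0.5", "--measure", "order", "--use_abs"]
METHOD_ARGS = {
    "seam-e": ["--text_encoder", "seam-e", "--att_units", "300", "--att_hops", "30"] + ATTENTION_FLAGS,
    "seam-c": ["--text_encoder", "seam-c", "--att_units", "300", "--att_hops", "10"] + ATTENTION_FLAGS,
    "seam-g": ["--text_encoder", "seam-g", "--att_units", "300", "--att_hops", "30"] + ATTENTION_FLAGS,
    "order": ["--text_encoder", "gru"],
}


def build_train_command(method, data_path, data_name="irv2_precomp"):
    """Return the train.py invocation for one method as an argument list."""
    logger_name = "runs/{}/{}/".format(method, data_name)
    base = [
        "python", "train.py",
        "--data_path", data_path,
        "--data_name", data_name,
        "--logger_name", logger_name,
    ]
    return base + METHOD_ARGS[method]


if __name__ == "__main__":
    print(" ".join(build_train_command("seam-e", "$DATA_PATH")))
```

The returned list can be passed to `subprocess.call` once `data_path` holds a real directory rather than the `$DATA_PATH` placeholder.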

Available text encoders:

SEAM-E (seam-e): Self-attention directly over word embeddings.
SEAM-C (seam-c): Self-attention over two parallel convolutional layers and over the word inputs.
SEAM-G (seam-g): GRU + Self-attention
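All three encoders share a self-attention step over some sequence of per-word features. The sketch below illustrates one common formulation of multi-hop self-attention, A = softmax(W2 tanh(W1 Hᵀ)) followed by M = A H. It is an assumption that the encoders in text_encoders.py follow this form, and that `--att_units` and `--att_hops` correspond to the row counts of W1 and W2. It is written in plain Python for readability, not with PyTorch as the repository is.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features, w1, w2):
    """Multi-hop self-attention over a sequence.

    features: n_words x dim, w1: att_units x dim, w2: att_hops x att_units.
    Returns (attention, embedding) with shapes att_hops x n_words and att_hops x dim.
    """
    projected = matmul(w1, transpose(features))                # att_units x n_words
    hidden = [[math.tanh(v) for v in row] for row in projected]
    attention = [softmax(row) for row in matmul(w2, hidden)]   # each hop sums to 1 over words
    embedding = matmul(attention, features)                    # weighted sums of word features
    return attention, embedding


if __name__ == "__main__":
    random.seed(0)
    n_words, dim, att_units, att_hops = 6, 8, 4, 3  # toy sizes, far smaller than the table's
    rand = lambda rows, cols: [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]
    attention, embedding = self_attention(rand(n_words, dim), rand(att_units, dim), rand(att_hops, att_units))
    print(len(attention), len(attention[0]), len(embedding), len(embedding[0]))
```

Each hop produces its own distribution over the words, so the output contains one weighted summary of the sentence per hop.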

Note that some default arguments in this repository differ from those in the original repository:

--learning_rate .001 --margin .05
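To make the role of `--margin` concrete, here is a minimal sketch of a bidirectional hinge ranking loss of the kind used in VSE++, from which this code was forked. Whether this repository sums over all negatives or keeps only the hardest one is an assumption exposed here as the hypothetical `max_violation` parameter. The function works on a plain similarity matrix and is not the repository's implementation.

```python
def hinge_ranking_loss(scores, margin=0.05, max_violation=True):
    """Bidirectional hinge loss over a square similarity matrix.

    scores[i][j] is the similarity between image i and caption j;
    the diagonal holds the matching (positive) pairs.
    """
    n = len(scores)
    # Caption retrieval: a wrong caption j should score below the true caption of image i.
    cost_s = [[0.0 if i == j else max(0.0, margin + scores[i][j] - scores[i][i])
               for j in range(n)] for i in range(n)]
    # Image retrieval: a wrong image i should score below the true image of caption j.
    cost_im = [[0.0 if i == j else max(0.0, margin + scores[i][j] - scores[j][j])
                for j in range(n)] for i in range(n)]
    if max_violation:
        # Keep only the hardest negative per image (rows) and per caption (columns).
        return sum(max(row) for row in cost_s) + sum(max(col) for col in zip(*cost_im))
    return sum(sum(row) for row in cost_s) + sum(sum(row) for row in cost_im)
```

A negative pair contributes to the loss only when its score comes within `margin` of the positive pair's score, so a smaller margin such as .05 penalizes fewer negatives than a larger one.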

Evaluate pre-trained models

from vocab import Vocabulary
import evaluation
# Replace $RUN_PATH and $DATA_PATH with the actual paths; Python does not expand shell variables.
evaluation.evalrank("$RUN_PATH/model_best.pth.tar", data_path="$DATA_PATH", split="test")

To do cross-validation on MSCOCO, pass fold5=True with a model trained using --data_name coco.
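The idea behind fold-based evaluation can be sketched as follows, assuming `fold5=True` behaves as in VSE++: the test items are split into five equal folds, each fold is evaluated separately, and the metrics are averaged. The helper `average_over_folds` and its `evaluate` callback are hypothetical and only illustrate the procedure.

```python
def average_over_folds(items, evaluate, n_folds=5):
    """Split items into n_folds contiguous folds and average evaluate(fold).

    evaluate is any callable that maps a list of items to a single number.
    Items left over when len(items) is not divisible by n_folds are ignored.
    """
    fold_size = len(items) // n_folds
    results = []
    for fold in range(n_folds):
        start = fold * fold_size
        results.append(evaluate(items[start:start + fold_size]))
    return sum(results) / float(n_folds)
```

Under the same assumption, a 5,000-image test split would give five folds of 1,000 images each.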

Results

[Coming soon] Results achieved using this repository (COCO-1cv test set):

Method	Features	R@1	R@10	R@1	R@10
SEAM-E	`resnet152_precomp`
SEAM-C	`resnet152_precomp`
SEAM-G	`resnet152_precomp`
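For readers unfamiliar with the R@1 and R@10 columns, the following is a minimal sketch of Recall@K: the percentage of queries whose correct match is ranked among the top K candidates. It makes the simplifying assumption of exactly one correct candidate per query (candidate i for query i), whereas the datasets provide several captions per image. The function name `recall_at_k` is illustrative and not the repository's evaluation code.

```python
def recall_at_k(scores, k):
    """Percentage of queries whose correct candidate ranks in the top k.

    scores[i][j] is the similarity between query i and candidate j;
    candidate i is assumed to be the single correct match for query i.
    """
    hits = 0
    for i, row in enumerate(scores):
        # Rank = number of candidates scored strictly higher than the correct one.
        rank = sum(1 for s in row if s > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(scores)
```

Applying the function to a score matrix and to its transpose gives the two retrieval directions, image to caption and caption to image.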

Reference

If you find this code useful, please cite the following papers:

@article{wehrmann2018fast,
  title={Fast Self-Attentive Multimodal Retrieval},
  author={Wehrmann, Jônatas and Armani, Maurício and More, Martin D. and Barros, Rodrigo C.},
  journal={IEEE Winter Conf. on Applications of Computer Vision (WACV'18)},
  year={2018}
}

@article{faghri2017vse++,
  title={VSE++: Improved Visual-Semantic Embeddings},
  author={Faghri, Fartash and Fleet, David J and Kiros, Jamie Ryan and Fidler, Sanja},
  journal={arXiv preprint arXiv:1707.05612},
  year={2017}
}

License

Apache License 2.0
