VIBIKNet

This repository contains the code for the Visual Bidirectional Kernelized Network for Visual Question Answering (VIBIKNet), presented at the VQA Challenge at CVPR'16. With this module, you can replicate our experiments and easily deploy new models. VIBIKNet is built on the Keras framework (version 1.2) and has been tested with the Theano backend.

Installation

VIBIKNet requires the following libraries:

Additionally, if you want to run the tutorials and the visualization module, you'll need:

If you want to extract KCNN features you will need (see the following README for more info):

Matlab 2014a or newer or Octave
Caffe. Download into ~/code/caffe folder or change path in extractCNNfeatures.m
EdgeBoxes object detection. Download into /repository_root/edges folder.
Piotr Dollar toolbox. Download into /repository_root/piotr_toolbox folder.
Inria's Yael library for Matlab. Download into /repository_root/yael folder and compile for your system. See Yael's release notes for more information.
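
Before running the Matlab extraction scripts, it can help to confirm that the folders listed above are where they are expected. The following is a minimal sketch, not part of the repository: the function name `missing_kcnn_dependencies` is hypothetical, and the folder names are simply the ones mentioned in the list above.

```python
import os


def missing_kcnn_dependencies(repo_root, caffe_dir="~/code/caffe"):
    """Return the expected dependency folders that do not exist.

    Hypothetical helper: the folder names mirror the list above and
    must be adapted if you changed any path (e.g. in extractCNNfeatures.m).
    """
    expected = [
        os.path.expanduser(caffe_dir),            # Caffe
        os.path.join(repo_root, "edges"),          # EdgeBoxes
        os.path.join(repo_root, "piotr_toolbox"),  # Piotr Dollar toolbox
        os.path.join(repo_root, "yael"),           # Yael
    ]
    return [path for path in expected if not os.path.isdir(path)]


if __name__ == "__main__":
    # Assumes the script is launched from the repository root.
    for path in missing_kcnn_dependencies("."):
        print("Missing folder:", path)
```

The check only verifies that the folders exist; it does not verify that Caffe or Yael have been compiled.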

If you want to use pretrained word embeddings, you can either train them yourself using GloVe or Word2Vec, or download pretrained word embeddings (recommended).
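
Pretrained GloVe vectors are commonly distributed as plain text, one word per line followed by its space-separated vector components. As a rough illustration of what that file holds, here is a minimal sketch of a parser. It is not the loading code used by this repository, the function name `load_text_embeddings` is hypothetical, and it assumes the file has no header line (Word2Vec text exports usually start with one, which would need to be skipped).

```python
import io


def load_text_embeddings(lines):
    """Parse GloVe-style text lines into a {word: [float, ...]} dict.

    Assumes each line looks like "word v1 v2 ... vn" with single spaces
    and that there is no header line.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


# Tiny in-memory example; with a real file, pass an open file handle instead.
sample = io.StringIO("cat 0.1 0.2\ndog 0.3 0.4\n")
embeddings = load_text_embeddings(sample)
```

With a downloaded file, the same function can be called as `load_text_embeddings(open(path, encoding="utf-8"))`, where `path` points to wherever you saved the vectors.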

How to use

For extracting KCNN features, see the following README.
For training a new model, follow the train README.
For visualizing the results, follow the visualize_results notebook.

VIBIKNet model at the CVPR VQA Challenge

See CVPR poster here.

Examples

These answers have been automatically generated by VIBIKNet:

Project citation

If you use this project, please cite the following publication:

Bolaños, M., Peris, Á., Casacuberta, F., & Radeva, P. 
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering
Iberian Conference on Pattern Recognition and Image Analysis, IbPRIA '17 (In press).

References

Liu Z. 
Kernelized Deep Convolutional Neural Network for Describing Complex Images. 
arXiv preprint arXiv:1509.04581. 2015 Sep 15.

Peris Á, Bolaños M, Radeva P, Casacuberta F. 
Video Description using Bidirectional Recurrent Neural Networks. 
arXiv preprint arXiv:1604.03390. 2016 Apr 12.

Malinowski M, Rohrbach M, Fritz M. 
Ask your neurons: A neural-based approach to answering questions about images. 
In Proceedings of the IEEE International Conference on Computer Vision 2015 (pp. 1-9).

About

Joint collaboration between the Computer Vision at the University of Barcelona (CVUB) group at Universitat de Barcelona-CVC and the PRHLT Research Center at Universitat Politècnica de València.

Contact

Marc Bolaños (web page): [email protected]

Álvaro Peris: [email protected]

