Skip to content

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

License

Notifications You must be signed in to change notification settings

alisonreboud/mmf

 
 

Repository files navigation

The results computed in this repository were published in

@inproceedings{reboud2021you,
  title={What You Say Is Not What You Do: Studying Visio-Linguistic Models for TV Series Summarization},
  author={Reboud, Alison and Troncy, Rapha{\"e}l},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={3149--3153},
  year={2021}
}

Why this study?

iccv

Installation

Follow installation instructions in the documentation. This repo is based on MMF, a modular framework for vision and language multimodal research from Facebook AI Research.

If you use MMF in your work or use any models published in MMF, please cite:

@misc{singh2020mmf,
  author =       {Singh, Amanpreet and Goswami, Vedanuj and Natarajan, Vivek and Jiang, Yu and Chen, Xinlei and Shah, Meet and
                 Rohrbach, Marcus and Batra, Dhruv and Parikh, Devi},
  title =        {MMF: A multimodal framework for vision and language research},
  howpublished = {\url{https://github.com/facebookresearch/mmf}},
  year =         {2020}
}

Running the study

We can start training by running the following command:

mmf_run config=mmf/configs/datasets/csi/dialogues.yaml model=vilbert dataset=csi run_type=train_val_test

The hyperparameters for training and for the experiment are in the experiment config projects/m4c/configs/textvqa/defaults.yaml. We can also set config params using command line args:

mmf_run config=mmf/configs/datasets/csi/dialogues.yaml \ datasets=vilbert \ model=csi \ run_type=train_val_test \ training.batch_size=32 \ training.max_updates=44000 \ training.log_interval=10 \ training.checkpoint_interval=100 \ training.evaluation_interval=1000

The commands for each of the pretraining/mode/text input combinations is available as shell scripts in the commands iccv

Results

iccv

Documentation

Learn more about MMF here.

License

MMF is licensed under BSD license available in LICENSE file

About

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 91.5%
  • Shell 7.0%
  • Other 1.5%