- Fixes INSTALL.md with more detailed explanation and correct commands.
- Adds typecasting in slowfast/datasets/decoder.py to fix the non-writeable tensor warning.
- Adds registry for Berkeley Deep Drive (BDD) dataset.
- Adds registry for custom CARLA simulator dataset.
- Adds more experiment configurations.
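The two dataset registry entries above follow a registry pattern: a dataset class is registered under its name and later looked up from the name given in the configuration. The sketch below is a self-contained illustration of that pattern, not this fork's actual code. The `Registry` class is a small stand-in written for this example (the upstream codebase appears to use the `Registry` class from fvcore). The `Bdd` class, its constructor arguments, and the name-capitalization convention in `build_dataset` are assumptions made for illustration.

```python
class Registry:
    """Minimal name -> class registry (a stand-in for illustration only)."""

    def __init__(self, name):
        self._name = name
        self._obj_map = {}

    def register(self):
        """Return a decorator that stores a class under its own name."""

        def decorator(cls):
            if cls.__name__ in self._obj_map:
                raise KeyError(
                    f"{cls.__name__} is already registered in {self._name}"
                )
            self._obj_map[cls.__name__] = cls
            return cls

        return decorator

    def get(self, name):
        """Look up a registered class by name."""
        if name not in self._obj_map:
            raise KeyError(f"{name} is not registered in {self._name}")
        return self._obj_map[name]


DATASET_REGISTRY = Registry("DATASET")


@DATASET_REGISTRY.register()
class Bdd:
    """Hypothetical BDD dataset wrapper; real loading logic is omitted."""

    def __init__(self, cfg, split):
        self.cfg = cfg
        self.split = split


def build_dataset(dataset_name, cfg, split):
    # Assumed convention: a config name such as "bdd" maps to the class "Bdd".
    name = dataset_name.capitalize()
    return DATASET_REGISTRY.get(name)(cfg, split)


dataset = build_dataset("bdd", cfg={}, split="train")
print(type(dataset).__name__, dataset.split)
```

With this layout, adding another dataset (for example one for CARLA recordings) only requires defining a new decorated class; the lookup code stays unchanged.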
PySlowFast is an open source video understanding codebase from FAIR that provides state-of-the-art video classification models with efficient training. This repository includes implementations of the following methods:
- SlowFast Networks for Video Recognition
- Non-local Neural Networks
- A Multigrid Method for Efficiently Training Video Models
- X3D: Progressive Network Expansion for Efficient Video Recognition
- Multiscale Vision Transformers
- A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning
- MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
- Masked Feature Prediction for Self-Supervised Visual Pre-Training
- Masked Autoencoders As Spatiotemporal Learners
- Reversible Vision Transformers
The goal of PySlowFast is to provide a high-performance, lightweight PyTorch codebase that offers state-of-the-art video backbones for video understanding research on different tasks (classification, detection, etc.). It is designed to support rapid implementation and evaluation of novel video research ideas. PySlowFast includes implementations of the following backbone network architectures:
- SlowFast
- Slow
- C2D
- I3D
- Non-local Network
- X3D
- MViTv1 and MViTv2
- Rev-ViT and Rev-MViT
- We now support Reversible Vision Transformers. Both Reversible ViT and MViT models are released. See projects/rev.
- We now support MAE for Video. See projects/mae for more information.
- We now support MaskFeat. See projects/maskfeat for more information.
- We now support MViTv2 in PySlowFast. See projects/mvitv2 for more information.
- We now support A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning. See projects/contrastive_ssl for more information.
- We now support Multiscale Vision Transformers on Kinetics and ImageNet. See projects/mvit for more information.
- We now support PyTorchVideo models and datasets. See projects/pytorchvideo for more information.
- We now support X3D Models. See projects/x3d for more information.
- We now support Multigrid Training for efficiently training video models. See projects/multigrid for more information.
- PySlowFast is released in conjunction with our ICCV 2019 Tutorial.
PySlowFast is released under the Apache 2.0 license.
We provide a large set of baseline results and trained models available for download in the PySlowFast Model Zoo.
Please find installation instructions for PyTorch and PySlowFast in INSTALL.md. You may follow the instructions in DATASET.md to prepare the datasets.
Follow the example in GETTING_STARTED.md to start experimenting with video models in PySlowFast.
We offer a range of visualization tools for the train/eval/test processes, for model analysis, and for running inference with trained models. More information is available at Visualization Tools.
PySlowFast is written and maintained by Haoqi Fan, Yanghao Li, Bo Xiong, Wan-Yen Lo, Christoph Feichtenhofer.
If you find PySlowFast useful in your research, please use the following BibTeX entry for citation.
@misc{fan2020pyslowfast,
  author =       {Haoqi Fan and Yanghao Li and Bo Xiong and Wan-Yen Lo and
                  Christoph Feichtenhofer},
  title =        {PySlowFast},
  howpublished = {\url{https://github.com/facebookresearch/slowfast}},
  year =         {2020}
}