[official thesis report] | [unpublished thesis draft]
This thesis makes the following contributions:
- An empirical study of pre-training, dataset scheduling, and data augmentations on four generations of optical flow models to provide an Improved Training Recipe.
- An analysis of the efficacy of Transformer neural networks for optical flow estimation.
Most of the code is built on the EzFlow PyTorch library, which was developed as a prerequisite for the thesis study. This repository contains the training configuration files for all the experiments and the implementations of the NAT-GM and SCCFlow end-to-end transformer architectures for optical flow estimation.
The improved training recipe can be found here: kubric_improved_aug
____
- Follow the instructions in EzFlow Getting Started to set up EzFlow and the conda environment.
- Install the following additional packages:
pip install git+https://github.com/huggingface/transformers
pip3 install natten -f https://shi-labs.com/natten/wheels/cu113/torch1.10.1/index.html
pip install timm
- If the natten package fails to install, follow the setup directions from: https://www.shi-labs.com/natten/
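After the setup steps above, a quick check that the packages can be located in the active environment can save a failed training run later. The snippet below is a minimal sketch and not part of this repository: the helper name `find_missing` is hypothetical, and the import names in `REQUIRED_MODULES` are assumed to match the packages installed above.

```python
import importlib.util

# Import names assumed for the packages installed in the steps above.
REQUIRED_MODULES = ["torch", "ezflow", "transformers", "natten", "timm"]


def find_missing(modules):
    """Return the module names that cannot be located in the current environment."""
    # find_spec returns None when a top-level module is not installed;
    # it does not import the module itself.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only confirms that each package is discoverable. It does not verify that the natten build matches the installed CUDA and PyTorch versions.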
The pretrained checkpoints for the improved results will be published in the EzFlow repository.
- FlowNet: Learning Optical Flow with Convolutional Networks
- PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
- RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
- GMFlow: Learning Optical Flow via Global Matching
- Disentangling Architecture and Training for Optical Flow
- ViT: Vision Transformer
- Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- Dino ViT: Emerging Properties in Self-Supervised Vision Transformers
- Deep ViT Features as Dense Visual Descriptors
- Neighborhood Attention Transformer
- Dilated Neighborhood Attention Transformer
- Kubric
@article{Goswami_Exploring_Training_2022,
author={Goswami, Prajnan},
year={2022},
title={Exploring Training Recipes and Transformer Neural Networks for Optical Flow Estimation},
journal={ProQuest Dissertations and Theses},
url={https://www.proquest.com/docview/2789009042?pq-origsite=gscholar&fromopenview=true},
}
@software{Shah_EzFlow_A_modular_2021,
author = {Shah, Neelay and Goswami, Prajnan and Jiang, Huaizu},
license = {MIT},
month = {11},
title = {{EzFlow: A modular PyTorch library for optical flow estimation using neural networks}},
url = {https://github.com/neu-vig/ezflow},
year = {2021}
}