Skip to content

UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss

License

Notifications You must be signed in to change notification settings

simonmeister/UnFlow

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

36 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss

This repository contains the TensorFlow implementation of the paper

UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss (AAAI 2018)

Simon Meister, Junhwa Hur, and Stefan Roth.

Slides

Download slides from AAAI 2018 talk.

Citation

If you find UnFlow useful in your research, please consider citing:

@inproceedings{Meister:2018:UUL,
  title  = {{UnFlow}: Unsupervised Learning of Optical Flow
            with a Bidirectional Census Loss},
  author = {Simon Meister and Junhwa Hur and Stefan Roth},
  address = {New Orleans, Louisiana},
  booktitle = {AAAI},
  month = feb,
  year = {2018}
}

License

UnFlow is released under the MIT License (refer to the LICENSE file for details).

Unofficial PyTorch code

There is a unofficial third party PyTorch implementation by Simon Niklaus.

Contents

  1. Introduction
  2. Usage
  3. Replicating our Models
  4. Navigating the Code

Introduction

Our paper describes a method to train end-to-end deep networks for dense optical flow without the need for ground truth optical flow.

NOTE (January 2020): There may be some hiccups with more recent versions of tensorflow due to the unstable custom op compilation API used for the correlation operation. If you experience these issues, please try one of the older tensorflow versions (see list of releases, e.g. 1.2 or 1.7) until I find time to fix these issues. I'm currently too busy with new projects to upgrade the code, and would be happy about any contributions.

This implementation supports all training and evaluation styles described in the paper. This includes

The supported network architectures are FlowNetS, FlowNetC, as well as stacked variants of these networks as introduced in FlowNet 2.0.

All datasets required for training and evaluation are downloaded on-demand by the code (except Cityscapes, which has to be downloaded manually if needed). Please ensure that the data directory specified in the configuration file has enough space to hold at least 150 GB (when using SYNTHIA and KITTI only).

Usage

Please refer to the configuration file template (config_template/config.ini) for a detailed description of the different operating modes.

Hardware requirements

  • at least one NVIDIA GPU (multi-GPU training is supported). We used the Titan X Pascal with 12GB memory.
  • for best performance, at least 8GB of GPU memory is recommended, for the stacked variants 11-12GB may be best
  • to run the code with less GPU memory, take a look at https://github.com/openai/gradient-checkpointing. One user reported successfully running the full CSS model with a GTX 960 with 4GB memory. Note that training will take longer in that case.

Software requirements

Prepare environment

  • copy config_template/config.ini to ./ and modify settings in the [dir], [run] and [compile] sections for your environment (see comments in the file).

Run & evaluate experiments

  • adapt settings in ./config.ini for your experiment
  • cd src
  • train with python run.py --ex my_experiment. Evaluation is run during training as specified in config.ini, but no visualizations are saved.
  • evaluate (multiple) experiments with python eval_gui.py --ex experiment_name_1[, experiment_name_2 [...]] and visually compare results side-by-side.

You can use the --help flag with run.py or eval_gui.py to view all available flags.

View tensorboard logs

  • view logs for all experiments with tensorboard --logdir=<log_dir>/ex

Pre-trained models

We provide checkpoints for the C_SYNTHIA, CS_SYNTHIA, CSS_SYNTHIA, C, CS, CSS and CSS_ft models (see "Replicating our models" for a description). To use them,

  • download this file and extract the contents to <log_dir>/ex/.

Now, you can evaluate and compare different models, e.g.

  • python eval_gui.py --ex C_SYNTHIA,C,CSS_ft.

Replicating our models

In the following, each list item gives an experiment name and parameters to set in ./config.ini. For each experiment, first modify the configuration parameters as specified, and then run python run.py --ex experiment_name.

First, create a series of experiments for the models pre-trained on SYNTHIA:

  • C_SYNTHIA: dataset = synthia, flownet = C,
  • CS_SYNTHIA: dataset = synthia, flownet = CS, finetune = C_SYNTHIA,
  • CSS_SYNTHIA: dataset = synthia, flownet = CSS, finetune = C_SYNTHIA,CS_SYNTHIA.

Next, create a series of experiments for the models trained on KITTI raw:

  • C: dataset = kitti, flownet = C, finetune = C_SYNTHIA,
  • CS: dataset = kitti, flownet = CS, finetune = C,CS_SYNTHIA,
  • CSS: dataset = kitti, flownet = CSS, finetune = C,CS,CSS_SYNTHIA.

Then, train the final fine-tuned model on KITTI 2012 / 2015:

  • CSS_ft: dataset = kitti_ft, flownet = CSS, finetune = C,CS,CSS, train_all = True. Please monitor the eval logs and stop the training as soon as the validation error starts to increase (for CSS, it should be at about 70K iterations).

If you want to train on Cityscapes:

  • C_Cityscapes: dataset = cityscapes, flownet = C, finetune = C_SYNTHIA.

Note that all models but CSS_ft were trained without ground truth optical flow, using our unsupervised proxy loss only.

Navigating the code

The core implementation of our method resides in src/e2eflow/core. The key files are:

  • losses.py: Proxy losses for unsupervised training,
  • flownet.py: FlowNet architectures with support for stacking,
  • image_warp.py: Backward-warping via differentiable bilinear image sampling,
  • supervised.py: Supervised loss and network computation,
  • unsupervised.py: Unsupervised (bi-directional) multi-resolution network and loss computation,
  • train.py: Implementation of training with online evaluation,
  • augment.py: Geometric and photometric data augmentations.