TOFlow (IJCV'2019)

Video Enhancement with Task-Oriented Flow

Task: Video Interpolation, Video Super-Resolution

Abstract

Many video enhancement algorithms rely on optical flow to register frames in a video sequence. Precise flow estimation is however intractable; and optical flow itself is often a sub-optimal representation for particular video processing tasks. In this paper, we propose task-oriented flow (TOFlow), a motion representation learned in a self-supervised, task-specific manner. We design a neural network with a trainable motion estimation component and a video processing component, and train them jointly to learn the task-oriented flow. For evaluation, we build Vimeo-90K, a large-scale, high-quality video dataset for low-level video processing. TOFlow outperforms traditional optical flow on standard benchmarks as well as our Vimeo-90K dataset in three video processing tasks: frame interpolation, video denoising/deblocking, and video super-resolution.

Results and models

Evaluated on Vimeo90k-triplet (RGB channels). The metrics are PSNR and SSIM, listed in separate tables.

| Model | Dataset | Task | Pretrained SPyNet | PSNR | Training Resources | Download |
| --- | --- | --- | --- | --- | --- | --- |
| tof_vfi_spynet_chair_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3294 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_kitti_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3339 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_sintel_clean_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3170 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_sintel_final_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3237 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_pytoflow_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3426 | 1 (Tesla PG503-216) | model / log |

| Model | Dataset | Task | Pretrained SPyNet | SSIM | Training Resources | Download |
| --- | --- | --- | --- | --- | --- | --- |
| tof_vfi_spynet_chair_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 0.9465 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_kitti_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 0.9466 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_sintel_clean_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 0.9464 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_sintel_final_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 0.9465 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_pytoflow_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 0.9467 | 1 (Tesla PG503-216) | model / log |

Note: These pretrained SPyNets contain no batch normalization (BN) layers because batch_size=1, which is consistent with https://github.com/Coldog2333/pytoflow.

Evaluated on RGB channels. The metrics are PSNR / SSIM.

| Model | Dataset | Task | Vid4 (PSNR / SSIM) | Training Resources | Download |
| --- | --- | --- | --- | --- | --- |
| tof_x4_vimeo90k_official | vimeo90k | Video Super-Resolution | 24.4377 / 0.7433 | - | model |
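
For readers unfamiliar with the PSNR figures reported above, the following is a minimal sketch of PSNR computed over all RGB channels. It assumes 8-bit images (peak value 255) and a mean squared error taken jointly over every pixel and channel; the function name `psnr_rgb` and the nested-list image layout are illustrative choices, and the evaluation code used for the tables may differ in details such as border cropping or value range.

```python
import math


def psnr_rgb(img_a, img_b, peak=255.0):
    """PSNR between two RGB images given as nested lists [row][col][channel].

    Assumption: the error is averaged over all pixels and all three channels.
    """
    total = 0.0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            for a, b in zip(px_a, px_b):
                total += (a - b) ** 2
                count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    mse = total / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy 2x2 example: every channel value differs by exactly 1, so MSE = 1.
reference = [[(10, 20, 30), (40, 50, 60)], [(70, 80, 90), (100, 110, 120)]]
shifted = [[tuple(c + 1 for c in px) for px in row] for row in reference]
print(round(psnr_rgb(reference, shifted), 2))  # 48.13
```

Higher values indicate a closer match to the ground truth; SSIM is a separate structural measure and is not covered by this sketch.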

Quick Start

Train

Train Instructions

You can use the following commands to train a model on CPU or on single/multiple GPUs.

TOF currently supports training only for the video interpolation task.

# cpu train
CUDA_VISIBLE_DEVICES=-1 python tools/train.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py

# single-gpu train
python tools/train.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py

# multi-gpu train
./tools/dist_train.sh configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py 8
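
If the three launch modes above need to be scripted, for example from a job scheduler, they can be assembled programmatically. The sketch below only reproduces the commands listed above; the helper names `build_train_command` and `render` are hypothetical and not part of the repository.

```python
import shlex

CONFIG = "configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py"


def build_train_command(config, mode, num_gpus=8):
    """Return (extra_env, argv) for one of the three launch modes (hypothetical helper)."""
    if mode == "cpu":
        # Hiding all CUDA devices forces CPU execution.
        return {"CUDA_VISIBLE_DEVICES": "-1"}, ["python", "tools/train.py", config]
    if mode == "single-gpu":
        return {}, ["python", "tools/train.py", config]
    if mode == "multi-gpu":
        return {}, ["./tools/dist_train.sh", config, str(num_gpus)]
    raise ValueError(f"unknown mode: {mode!r}")


def render(extra_env, argv):
    """Format the command as a single shell line for logging or copy-paste."""
    prefix = [f"{key}={value}" for key, value in extra_env.items()]
    return " ".join(prefix + [shlex.quote(arg) for arg in argv])


for mode in ("cpu", "single-gpu", "multi-gpu"):
    print(render(*build_train_command(CONFIG, mode)))
```

To actually launch a run, the pair could be passed to `subprocess.run(argv, env={**os.environ, **extra_env}, check=True)` from the repository root.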

For more details, refer to the "Train a model" section of train_test.md.

Test

Test Instructions

You can use the following commands to test a model on CPU or on single/multiple GPUs.

TOF supports two tasks for testing.

Task 1: Video Interpolation

# cpu test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/toflow/pretrained_spynet_chair_20220321-4d82e91b.pth

# single-gpu test
python tools/test.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/toflow/pretrained_spynet_chair_20220321-4d82e91b.pth

# multi-gpu test
./tools/dist_test.sh configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/toflow/pretrained_spynet_chair_20220321-4d82e91b.pth 8

Task 2: Video Super-Resolution

# cpu test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/tof/tof_x4_official_vimeo90k.py https://download.openmmlab.com/mmediting/restorers/tof/tof_x4_vimeo90k_official-a569ff50.pth

# single-gpu test
python tools/test.py configs/tof/tof_x4_official_vimeo90k.py https://download.openmmlab.com/mmediting/restorers/tof/tof_x4_vimeo90k_official-a569ff50.pth

# multi-gpu test
./tools/dist_test.sh configs/tof/tof_x4_official_vimeo90k.py https://download.openmmlab.com/mmediting/restorers/tof/tof_x4_vimeo90k_official-a569ff50.pth 8

For more details, refer to the "Test a pre-trained model" section of train_test.md.
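
The test commands above pass the checkpoint as a URL. When testing repeatedly or offline, it may be convenient to keep a local copy instead. The sketch below derives a local path from a checkpoint URL and builds the single-GPU test command around it. It assumes that `tools/test.py` also accepts a local file path as its checkpoint argument, which this README does not state; the `checkpoints` directory and the helper names are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

SR_CONFIG = "configs/tof/tof_x4_official_vimeo90k.py"
SR_CHECKPOINT_URL = (
    "https://download.openmmlab.com/mmediting/restorers/tof/"
    "tof_x4_vimeo90k_official-a569ff50.pth"
)


def checkpoint_filename(url):
    """Extract the .pth file name from a checkpoint URL."""
    name = PurePosixPath(urlparse(url).path).name
    if not name.endswith(".pth"):
        raise ValueError(f"not a .pth checkpoint URL: {url!r}")
    return name


def local_test_argv(config, url, cache_dir="checkpoints"):
    """Build the single-GPU test command using a local checkpoint path.

    Assumption: tools/test.py accepts a local path in place of the URL.
    The file must be downloaded to that path beforehand.
    """
    local_path = str(PurePosixPath(cache_dir) / checkpoint_filename(url))
    return ["python", "tools/test.py", config, local_path]


print(" ".join(local_test_argv(SR_CONFIG, SR_CHECKPOINT_URL)))
```

The download itself is left out of the sketch; any standard tool can fetch the URL into the chosen directory.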

Citation

@article{xue2019video,
  title={Video enhancement with task-oriented flow},
  author={Xue, Tianfan and Chen, Baian and Wu, Jiajun and Wei, Donglai and Freeman, William T},
  journal={International Journal of Computer Vision},
  volume={127},
  number={8},
  pages={1106--1125},
  year={2019},
  publisher={Springer}
}