Perception-Oriented Video Frame Interpolation via Asymmetric Blending
Guangyang Wu, Xin Tao, Changlin Li, Wenyi Wang, Xiaohong Liu, Qingqing Zheng
In CVPR 2024
This repository is the official implementation of the paper "Perception-Oriented Video Frame Interpolation via Asymmetric Blending", also referred to as "PerVFI".
We present PerVFI, a novel paradigm for perception-oriented video frame interpolation.
- Asymmetric synergistic blending scheme: reduces blurring and ghosting artifacts caused by unavoidable motion errors.
- Generative model as decoder: reconstructs results sampled from a distribution to resolve temporal supervision misalignment during training.
- Future work: the network structure can be further optimized to improve efficiency and performance.
2024-9-7: VFIBenchmark is released! Feel free to reproduce the metrics listed in the paper.
2024-6-13: Paper accepted! Released the inference code (this repository).
2024-6-1: Added arXiv version.
- ❗ 🔜 Inference code for customized flow estimator.
- 🔜 Google Colab demo.
- 🔜 Online interactive demo.
- Hugging Face Space (optional).
- Add GIFs in page for better visualization.
We offer several ways to interact with PerVFI:
- Run the demo locally (requires a GPU and Anaconda, see Installation Guide). Local development instructions with this codebase are given below.
- Extended demo on Google Colab (coming soon).
- Online interactive demo (coming soon).
The inference code was tested on:
- Ubuntu 22.04 LTS, Python 3.10.12, CUDA 11.7, GeForce RTX 4090
- macOS 14.2, Python 3.10.12, Apple M1 with 16 GB memory
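The two tested setups imply different compute backends (an NVIDIA GPU versus Apple silicon). The following is a minimal sketch of how a script could detect which backend is available. It assumes PyTorch is the underlying framework, which this section does not state, and `pick_device` is a hypothetical helper, not part of this repository.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what is available."""
    try:
        import torch  # assumption: the environment provides PyTorch
    except ImportError:
        return "cpu"

    # NVIDIA GPU, e.g. the RTX 4090 setup listed above.
    if torch.cuda.is_available():
        return "cuda"

    # Apple silicon, e.g. the M1 setup listed above.
    # Older PyTorch builds have no torch.backends.mps, hence the getattr guard.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print(f"Selected device: {pick_device()}")
```

Whether `infer_video.py` selects its device this way is not documented here; the sketch is only meant as a quick check of what the local machine offers.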
For Windows users, we recommend running the code in WSL2:
- Install WSL following the installation guide.
- Install CUDA support for WSL following the installation guide.
- Find your drives in /mnt/<drive letter>/; check the WSL FAQ for more details. Navigate to the working directory of choice.
Clone the repository (requires git):
git clone https://github.com/mulns/PerVFI.git
cd PerVFI
We provide several ways to install the dependencies.
- Using Conda. Windows users: install the Linux version into WSL. After the installation, create the environment and install dependencies into it:
conda env create -f environment.yaml
conda activate pervfi
- Using pip. Alternatively, create a native Python virtual environment and install dependencies into it:
python -m venv venv/pervfi
source venv/pervfi/bin/activate
pip install -r requirements.txt
Keep the environment activated before running the inference script. Activate the environment again after restarting the terminal session.
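Forgetting to re-activate the environment after a terminal restart is easy to miss. The sketch below shows one way to check, from Python, whether a Conda environment or a venv is currently active. `active_environment` is a hypothetical helper written for illustration; it relies only on the `CONDA_DEFAULT_ENV` variable that Conda exports on activation and on the standard `sys.prefix` / `sys.base_prefix` comparison for venvs.

```python
import os
import sys


def active_environment(environ=None, prefix=None, base_prefix=None):
    """Return a short label for the active Python environment, or None."""
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    # Conda exports CONDA_DEFAULT_ENV when an environment is activated;
    # "base" is treated here as "no project environment".
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env and conda_env != "base":
        return f"conda:{conda_env}"

    # Inside a venv, sys.prefix points at the venv directory and
    # therefore differs from sys.base_prefix.
    if prefix != base_prefix:
        return f"venv:{prefix}"

    return None


if __name__ == "__main__":
    env = active_environment()
    print(env or "No environment detected; activate it before running inference.")
```

The optional arguments exist only so the function can be exercised without changing the real process environment.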
Place your video images in a directory, for example, under input/in-the-wild_example, and run the following inference command.
Download the pre-trained models and place them in the checkpoints folder. This includes checkpoints for various optical flow estimators. You can choose one for simple use or all of them for comparison.
The default checkpoint is trained using only the Vimeo90K dataset.
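Before launching inference, it can help to confirm that the two folders mentioned above are populated. The following pre-flight check is a sketch under stated assumptions: the image extensions, the idea that frames may sit in subdirectories of input, and the `preflight` helper itself are all hypothetical and not taken from this repository.

```python
from pathlib import Path

# Assumption: frames are stored with one of these extensions.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def preflight(data_dir="input", ckpt_dir="checkpoints"):
    """Return a list of problems; an empty list means the layout looks usable."""
    problems = []
    data, ckpt = Path(data_dir), Path(ckpt_dir)

    # The checkpoints folder must exist and contain at least one file.
    if not ckpt.is_dir() or not any(p.is_file() for p in ckpt.rglob("*")):
        problems.append(f"no checkpoint files found under {ckpt}")

    # Interpolation needs at least two frames to work with.
    frames = []
    if data.is_dir():
        frames = [p for p in data.rglob("*") if p.suffix.lower() in IMAGE_SUFFIXES]
    if len(frames) < 2:
        problems.append(f"fewer than two image frames found under {data}")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

The check only inspects the folder layout; it does not verify that a checkpoint matches the chosen optical flow estimator.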
python infer_video.py -m [OFE]+pervfi -data input -fps [OUT_FPS]
NOTE:
- OFE is a placeholder for the optical flow estimator name. This repo supports RAFT, GMA, and GMFlow. Support for your own preferred flow estimator is a planned feature.
- OUT_FPS is a placeholder for the frame rate of the output video (defaults to 10). The output may also be saved as images.
The Vb checkpoint replaces the normalizing-flow generator with a multi-scale decoder for faster inference, at some cost in perceptual quality:
python infer_video.py -m [OFE]+pervfi-vb -data input -fps [OUT_FPS]
You can find all results in output. Enjoy!
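To compare all supported flow estimators, the two command templates above can be generated programmatically. The sketch below only builds the argument lists and prints them; it does not run inference. `build_command` is a hypothetical helper, and the exact spelling that `-m` expects for each estimator name (for example upper or lower case) is an assumption, so adjust `SUPPORTED_OFE` to match what the script accepts.

```python
import shlex

# Estimator names as listed in the note above; the exact spelling
# expected by the -m option is an assumption.
SUPPORTED_OFE = ("RAFT", "GMA", "GMFlow")


def build_command(ofe, out_fps=10, fast=False, data="input"):
    """Build the infer_video.py argument list for one flow estimator."""
    if ofe not in SUPPORTED_OFE:
        raise ValueError(f"unsupported flow estimator: {ofe!r}")
    # The "-vb" suffix selects the faster Vb checkpoint.
    model = f"{ofe}+pervfi-vb" if fast else f"{ofe}+pervfi"
    return ["python", "infer_video.py", "-m", model,
            "-data", data, "-fps", str(out_fps)]


if __name__ == "__main__":
    # Print one command per estimator, for both checkpoint variants.
    for ofe in SUPPORTED_OFE:
        for fast in (False, True):
            print(shlex.join(build_command(ofe, out_fps=24, fast=fast)))
```

Each returned list can be passed to `subprocess.run` from the repository root once the checkpoints are in place.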
Will be included in VFI-Benchmark.
Coming soon.
Please refer to this instruction.
Please cite our paper:
@InProceedings{Wu_2024_CVPR,
author = {Wu, Guangyang and Tao, Xin and Li, Changlin and Wang, Wenyi and Liu, Xiaohong and Zheng, Qingqing},
title = {Perception-Oriented Video Frame Interpolation via Asymmetric Blending},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2024},
pages = {2753-2762}
}
This work is licensed under the Apache License, Version 2.0 (as defined in the LICENSE).
By downloading and using the code and model you agree to the terms in the LICENSE.