UniCtrl

Paper | Project Page | Hugging Face

This repository is the implementation of

[TMLR 2024] UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control

  • Authors: Tian Xia*, Xuweiyi Chen*, Sihan Xu**
  • Affiliations: University of Michigan, University of Virginia, PixAI.art
  • *Equal contribution, **Corresponding author
(Sample result videos: Original vs. UniCtrl)

Updates🔥

  • Our UniCtrl code is released, and you can check out our paper as well!

Overview 📖

(Overall structure of UniCtrl)

We introduce UniCtrl, a novel, plug-and-play method that universally improves the spatiotemporal consistency and motion diversity of videos generated by text-to-video models, without additional training. UniCtrl ensures semantic consistency across frames through cross-frame self-attention control, while enhancing motion quality and spatiotemporal consistency through motion injection and spatiotemporal synchronization.
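To make cross-frame self-attention control concrete, below is a minimal PyTorch sketch (not the repository's actual code) of the idea: every frame's self-attention layer reads keys and values from a shared anchor frame, so all frames attend to the same spatial content. The tensor layout, the anchor_idx argument, and the function name are illustrative assumptions; the motion injection and spatiotemporal synchronization steps mentioned above are not shown.

import torch

def cross_frame_self_attention(q, k, v, anchor_idx=0):
    """q, k, v: (frames, tokens, dim) projections from one self-attention layer."""
    f, n, d = q.shape
    # Every frame attends to the anchor frame's keys/values instead of its own,
    # which keeps the attended spatial content consistent across frames.
    k_anchor = k[anchor_idx].expand(f, n, d)
    v_anchor = v[anchor_idx].expand(f, n, d)
    attn = torch.softmax(q @ k_anchor.transpose(-1, -2) / d ** 0.5, dim=-1)
    return attn @ v_anchor

if __name__ == "__main__":
    frames, tokens, dim = 16, 1024, 64  # toy sizes for illustration
    q, k, v = (torch.randn(frames, tokens, dim) for _ in range(3))
    print(cross_frame_self_attention(q, k, v).shape)  # torch.Size([16, 1024, 64])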

Quick Start🔨

1. Clone Repo

git clone https://github.com/XuweiyiChen/UniCtrl.git
cd UniCtrl
cd examples/AnimateDiff

2. Prepare Environment

conda env create -f environment.yaml
conda activate animatediff_pt2

3. Download Checkpoints

Please refer to the official AnimateDiff repository for the full setup guide; a quickstart is summarized below.

Quickstart guide

git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/

bash download_bashscripts/0-MotionModule.sh
bash download_bashscripts/5-RealisticVision.sh

🤗 Gradio Demo

We provide a Gradio demo to showcase our method through a web UI:

python app.py

Alternatively, you can try the online demo hosted on Hugging Face: [demo link].

Citation 🖋️

If you find our repo useful for your research, please consider citing our paper:

 @misc{chen2024unictrl,
     title={UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control}, 
     author={Xuweiyi Chen and Tian Xia and Sihan Xu},
     year={2024},
     eprint={2403.02332},
     archivePrefix={arXiv},
     primaryClass={cs.CV}
 }

Acknowledgement 🤍

This project is distributed under the MIT License. See LICENSE for more information.

The example code is built upon AnimateDiff and FreeInit. Thanks to their teams for the impressive work!
