UniCtrl

Paper | Project Page | Hugging Face

This repository is the implementation of

[TMLR 2024] UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control

  • Authors: Tian Xia*, Xuweiyi Chen*, Sihan Xu**
  • Affiliations: University of Michigan, University of Virginia, PixAI.art
  • *Equal contribution, **Corresponding author
(Sample result videos: Original vs. UniCtrl)

Updates🔥

  • Our UniCtrl code is released, and you can check out our paper as well!

Overview 📖

(Overall structure of UniCtrl)

We introduce UniCtrl, a novel, plug-and-play method that universally improves the spatiotemporal consistency and motion diversity of videos generated by text-to-video models, without additional training. UniCtrl ensures semantic consistency across frames through cross-frame self-attention control, while enhancing motion quality and spatiotemporal consistency through motion injection and spatiotemporal synchronization.
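To make cross-frame self-attention control concrete, below is a minimal PyTorch sketch (not the repository's actual code) of the idea: every frame's self-attention layer reads keys and values from a shared anchor frame, so all frames attend to the same spatial content. The tensor layout, the anchor_idx argument, and the function name are illustrative assumptions; the motion injection and spatiotemporal synchronization steps mentioned above are not shown.

import torch

def cross_frame_self_attention(q, k, v, anchor_idx=0):
    """q, k, v: (frames, tokens, dim) projections from one self-attention layer."""
    f, n, d = q.shape
    # Every frame attends to the anchor frame's keys/values instead of its own,
    # which keeps the attended spatial content consistent across frames.
    k_anchor = k[anchor_idx].expand(f, n, d)
    v_anchor = v[anchor_idx].expand(f, n, d)
    attn = torch.softmax(q @ k_anchor.transpose(-1, -2) / d ** 0.5, dim=-1)
    return attn @ v_anchor

if __name__ == "__main__":
    frames, tokens, dim = 16, 1024, 64  # toy sizes for illustration
    q, k, v = (torch.randn(frames, tokens, dim) for _ in range(3))
    print(cross_frame_self_attention(q, k, v).shape)  # torch.Size([16, 1024, 64])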

Quick Start🔨

1. Clone Repo

git clone https://github.com/XuweiyiChen/UniCtrl.git
cd UniCtrl
cd examples/AnimateDiff

2. Prepare Environment

conda env create -f environment.yaml
conda activate animatediff_pt2

3. Download Checkpoints

Please refer to the official AnimateDiff repository for the full setup guide; a quickstart is summarized below.

Quickstart guide

git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/

bash download_bashscripts/0-MotionModule.sh
bash download_bashscripts/5-RealisticVision.sh

🤗 Gradio Demo

We provide a Gradio demo to showcase our method through a web UI:

python app.py

Alternatively, you can try the online demo hosted on Hugging Face: [demo link].

Citation 🖋️

If you find our repo useful for your research, please consider citing our paper:

 @misc{chen2024unictrl,
     title={UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control}, 
     author={Xuweiyi Chen and Tian Xia and Sihan Xu},
     year={2024},
     eprint={2403.02332},
     archivePrefix={arXiv},
     primaryClass={cs.CV}
 }

Acknowledgement 🤍

This project is distributed under the MIT License. See LICENSE for more information.

The example code is built upon AnimateDiff and FreeInit. Thanks to their teams for the impressive work!
