[Feature] Support AnimateDiff, a popular text2animation method #1980

Merged Sep 20, 2023 (60 commits from ElliotQi:animatediff into open-mmlab:main)

Commits
15d09e4
first commit for animatediff
ElliotQi Aug 15, 2023
a39aa96
fix lint errors
ElliotQi Aug 15, 2023
955248f
modify readme file and add readme_zh-CN.md
ElliotQi Aug 15, 2023
b46ec12
fix some typos in readme
ElliotQi Aug 15, 2023
016c86e
delete test_animatediff.py
ElliotQi Aug 15, 2023
127c99a
add some docstring
ElliotQi Aug 20, 2023
0aa9655
Merge branch 'open-mmlab:main' into animatediff
ElliotQi Aug 21, 2023
c562050
Merge branch 'open-mmlab:main' into animatediff
ElliotQi Aug 23, 2023
3609ac3
fix cross attention for 512*512 animation quality
ElliotQi Aug 24, 2023
800faae
Merge branch 'open-mmlab:main' into animatediff
ElliotQi Aug 24, 2023
8a092b3
fix some initial setting for cpu load
ElliotQi Sep 2, 2023
6dd1920
add unittest samples
ElliotQi Sep 2, 2023
ec7e191
modify unittest codes
ElliotQi Sep 2, 2023
60cd955
remove duplicated unittest files
ElliotQi Sep 2, 2023
116f42a
modify unittest codes for minimum memory
ElliotQi Sep 2, 2023
e0994cf
modify test_unet3d resolution for minimum memory unittest
ElliotQi Sep 2, 2023
dda8716
modify test_unet_blocks3d input resolution for minimum memory unittest
ElliotQi Sep 2, 2023
87bb203
Merge branch 'open-mmlab:main' into animatediff
ElliotQi Sep 3, 2023
9e0b432
modify animatediff.py for gradio
ElliotQi Sep 3, 2023
55d381c
add gradio app for animatediff
ElliotQi Sep 3, 2023
a10ede2
skip test with large memory
ElliotQi Sep 4, 2023
cde60c6
Merge branch 'main' into animatediff
ElliotQi Sep 4, 2023
276e051
Merge branch 'main' into animatediff
liuwenran Sep 4, 2023
4f54924
fix environment building
ElliotQi Sep 6, 2023
1de46df
Merge branch 'animatediff' of github.com:ElliotQi/mmagic into animate…
ElliotQi Sep 6, 2023
b645f0c
Merge branch 'main' into animatediff
ElliotQi Sep 9, 2023
485fdd2
Merge branch 'main' into animatediff
ElliotQi Sep 11, 2023
76cb637
fix merging conflict
ElliotQi Sep 11, 2023
d67f61a
Merge branch 'open-mmlab:main' into animatediff
ElliotQi Sep 11, 2023
541c2e9
Add different style ckpt
ElliotQi Sep 11, 2023
b476929
Merge branch 'animatediff' of github.com:ElliotQi/mmagic into animate…
ElliotQi Sep 11, 2023
eaeb9a0
Merge branch 'main' into animatediff
liuwenran Sep 11, 2023
3a1de39
fix environment building
ElliotQi Sep 11, 2023
7168ee2
Merge branch 'animatediff' of github.com:ElliotQi/mmagic into animate…
ElliotQi Sep 11, 2023
3502e09
add new motion module
ElliotQi Sep 13, 2023
cf0e50c
Merge branch 'open-mmlab:main' into animatediff
ElliotQi Sep 18, 2023
71102cf
add prompts for all config files in README
ElliotQi Sep 18, 2023
e407e79
add image in README
ElliotQi Sep 18, 2023
ec15c1b
fix sd ckpt auto downloading
ElliotQi Sep 18, 2023
7b51631
remove unused import in test code
ElliotQi Sep 18, 2023
55a2b15
align README_zh and README
ElliotQi Sep 18, 2023
8e33acc
fix building error
ElliotQi Sep 18, 2023
e89b7a4
delete unused comments
ElliotQi Sep 18, 2023
f9b6f2a
fix test memory
ElliotQi Sep 18, 2023
fa16d66
Merge branch 'main' into animatediff
ElliotQi Sep 18, 2023
c7a90e8
fix text_model error for later transformer version
ElliotQi Sep 19, 2023
df00be0
Merge branch 'main' into animatediff
ElliotQi Sep 19, 2023
67c29ca
fix comment copyright
ElliotQi Sep 19, 2023
8cece90
add animatediff gradio README
ElliotQi Sep 19, 2023
db8023d
modify some copyright in motion_module.py
ElliotQi Sep 19, 2023
178a2e8
modify README for better test guidance
ElliotQi Sep 19, 2023
cacfd29
fix inference without xformers and mimsave for higher version of imageio
ElliotQi Sep 19, 2023
28a3437
fix errors in different versions of imageio
ElliotQi Sep 20, 2023
7aed840
Merge branch 'main' into animatediff
ElliotQi Sep 20, 2023
9e30b43
add train tutorial and pretrained models
ElliotQi Sep 20, 2023
e5d611b
fix some comments in README
ElliotQi Sep 20, 2023
08bf694
delete personal information
ElliotQi Sep 20, 2023
cda517c
fix gradio sd selection
ElliotQi Sep 20, 2023
fbf49ec
add some tips for running gradio
ElliotQi Sep 20, 2023
9bad0b9
add pretrained links
ElliotQi Sep 20, 2023
221 changes: 221 additions & 0 deletions configs/animatediff/README.md

Large diffs are not rendered by default.

220 changes: 220 additions & 0 deletions configs/animatediff/README_zh-CN.md

Large diffs are not rendered by default.

61 changes: 61 additions & 0 deletions configs/animatediff/animatediff_Lyriel.py
@@ -0,0 +1,61 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        10917152860782582783, 6399018107401806238, 15875751942533906793,
        6653196880059936551
    ],
    diff_rank_seed=True)

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=False,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=24,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path + 'DreamBooth_LoRA/lyriel_v16.safetensors',
        steps=25,
        guidance_scale=7.5))
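Since all six configs in this PR share this skeleton, a reader can sanity-check one as plain Python before handing it to mmengine. The sketch below is illustrative only — `validate_animatediff_cfg` is a hypothetical helper, not part of mmagic — and assumes the config file evaluates to the nested dicts shown above:

```python
# Minimal stand-in for the config above, as the plain dicts mmengine produces.
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler)


def validate_animatediff_cfg(model_cfg):
    """Hypothetical structural checks on an AnimateDiff model config."""
    assert model_cfg['type'] == 'AnimateDiff'
    # These configs deliberately reuse one scheduler for train and test.
    assert model_cfg['scheduler'] == model_cfg['test_scheduler']
    # DDIM beta range must be increasing.
    sched = model_cfg['scheduler']
    assert sched['beta_start'] < sched['beta_end']
    return True


validate_animatediff_cfg(model)
```

In the real pipeline, mmengine's `Config.fromfile` would load the config and mmagic's registry would build the model from `cfg.model`; the checks above only mirror the invariants visible in the file itself.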
62 changes: 62 additions & 0 deletions configs/animatediff/animatediff_MajicMix.py
@@ -0,0 +1,62 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        1572448948722921032, 1099474677988590681, 6488833139725635347,
        18339859844376517918
    ],
    diff_rank_seed=True)

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=False,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=24,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path +
        'DreamBooth_LoRA/majicmixRealistic_v5Preview.safetensors',
        steps=25,
        guidance_scale=7.5))
61 changes: 61 additions & 0 deletions configs/animatediff/animatediff_RcnzCartoon.py
@@ -0,0 +1,61 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        16931037867122267877, 2094308009433392066, 4292543217695451092,
        15572665120852309890
    ],
    diff_rank_seed=True)

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=False,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=24,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path + 'DreamBooth_LoRA/rcnzCartoon3d_v10.safetensors',
        steps=25,
        guidance_scale=7.5))
62 changes: 62 additions & 0 deletions configs/animatediff/animatediff_RealisticVision.py
@@ -0,0 +1,62 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        5658137986800322009, 12099779162349365895, 10499524853910852697,
        16768009035333711932
    ],
    diff_rank_seed=True)

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=False,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=24,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path +
        'DreamBooth_LoRA/realisticVisionV20_v20.safetensors',
        steps=25,
        guidance_scale=7.5))
64 changes: 64 additions & 0 deletions configs/animatediff/animatediff_RealisticVision_v2.py
@@ -0,0 +1,64 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        13100322578370451493, 14752961627088720670, 9329399085567825781,
        16987697414827649302
    ],
    diff_rank_seed=True)

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        use_inflated_groupnorm=True,
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=True,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=32,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path +
                           'Motion_Module/mm_sd_v15_v2.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path +
        'DreamBooth_LoRA/realisticVisionV20_v20.safetensors',
        steps=25,
        guidance_scale=7.5))
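This `_v2` config differs from the v1-style configs above in a handful of UNet/motion-module switches (inflated GroupNorm, a motion module in the mid block, a longer temporal position encoding) in addition to loading the `mm_sd_v15_v2.ckpt` weights. A self-contained sketch that surfaces exactly which keys change — the `unet_v1`/`unet_v2` dicts are excerpts, and `use_inflated_groupnorm=False` is written out for contrast even though the v1 configs simply omit it:

```python
# Excerpted motion-module settings from the v1-style and v2 configs.
unet_v1 = dict(
    use_inflated_groupnorm=False,  # omitted (defaulted) in the v1 configs
    motion_module_mid_block=False,
    temporal_position_encoding_max_len=24)
unet_v2 = dict(
    use_inflated_groupnorm=True,
    motion_module_mid_block=True,
    temporal_position_encoding_max_len=32)


def changed_keys(old, new):
    """Map each differing key to its (old, new) value pair."""
    return {k: (old[k], new[k]) for k in old if old.get(k) != new.get(k)}


for key, (old, new) in sorted(changed_keys(unet_v1, unet_v2).items()):
    print(f'{key}: {old} -> {new}')
```

The practical consequence of the longer position encoding is that the v2 motion module can attend over more frames per clip; the other two flags must match the checkpoint being loaded.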
80 changes: 80 additions & 0 deletions configs/animatediff/animatediff_ToonYou.py
@@ -0,0 +1,80 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        10788741199826055526, 6520604954829636163, 6519455744612555650,
        16372571278361863751
    ],
    diff_rank_seed=True)

val_prompts = [
    'best quality, masterpiece, 1girl, looking at viewer,\
    blurry background, upper body, contemporary, dress',
    'masterpiece, best quality, 1girl, solo, cherry blossoms,\
    hanami, pink flower, white flower, spring season, wisteria,\
    petals, flower, plum blossoms, outdoors, falling petals,\
    white hair, black eyes,',
    'best quality, masterpiece, 1boy, formal, abstract,\
    looking at viewer, masculine, marble pattern',
    'best quality, masterpiece, 1girl, cloudy sky,\
    dandelion, contrapposto, alternate hairstyle,'
]
val_neg_propmts = [
    '',
    'badhandv4,easynegative,ng_deepnegative_v1_75t,verybadimagenegative_v1.3,\
    bad-artist, bad_prompt_version2-neg, teeth',
    '',
    '',
]
diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=False,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=24,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path + 'DreamBooth_LoRA/toonyou_beta3.safetensors',
        steps=25,
        guidance_scale=7.5))
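Unlike the other configs in this PR, the ToonYou config ships its validation prompts inline, and each prompt pairs positionally with a negative prompt and a seed from `randomness`. A minimal sketch of that pairing — prompt strings are abbreviated here, and the `jobs` list is an illustrative structure, not a mmagic API:

```python
# Abbreviated from the ToonYou config: four prompts, four negative prompts,
# and four seeds, paired by position.
val_prompts = [
    'best quality, masterpiece, 1girl, looking at viewer, ...',
    'masterpiece, best quality, 1girl, solo, cherry blossoms, ...',
    'best quality, masterpiece, 1boy, formal, abstract, ...',
    'best quality, masterpiece, 1girl, cloudy sky, dandelion, ...',
]
val_neg_prompts = ['', 'badhandv4,easynegative, ...', '', '']
seeds = [
    10788741199826055526, 6520604954829636163, 6519455744612555650,
    16372571278361863751
]

# One sampling job per (prompt, negative prompt, seed) triple.
jobs = [
    dict(prompt=p, negative_prompt=n, seed=s)
    for p, n, s in zip(val_prompts, val_neg_prompts, seeds)
]
assert len(jobs) == len(seeds) == 4
```

Note that only the second prompt carries a non-empty negative prompt; with `diff_rank_seed=True`, each rank draws a different seed from the list.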