GitHub - frankchieng/ComfyUI_Aniportrait: Unofficial implementation of AniPortrait custom node in ComfyUI

Updates:

① Implemented frame_interpolation to speed up generation.

② Modified the code to support chaining with the VHS (Video Helper Suite) nodes. I found that the ComfyUI IMAGE type requires the torch float32 datatype, while AniPortrait relies heavily on NumPy images of uint8 datatype, so I switched from my own image/video upload and generation nodes to the widely used VHS image/video upload and video combine nodes. They are WYSIWYG, interact well, and render the result instantly.
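The datatype mismatch described above can be illustrated with a small conversion sketch. This is not the code used in nodes.py; the function names are hypothetical, and it assumes the usual ComfyUI convention that an IMAGE is a float32 tensor shaped (batch, height, width, channels) with values in [0, 1].

```python
import numpy as np


def uint8_to_unit_float(frames: np.ndarray) -> np.ndarray:
    """Convert uint8 frames, (H, W, C) or (B, H, W, C), to float32 in [0, 1]."""
    arr = np.asarray(frames)
    if arr.dtype != np.uint8:
        raise TypeError(f"expected uint8 frames, got {arr.dtype}")
    if arr.ndim == 3:
        # A single frame: add a leading batch axis.
        arr = arr[None, ...]
    return arr.astype(np.float32) / 255.0


def unit_float_to_uint8(frames: np.ndarray) -> np.ndarray:
    """Inverse direction: float frames in [0, 1] back to uint8."""
    arr = np.clip(np.asarray(frames, dtype=np.float32), 0.0, 1.0)
    return np.rint(arr * 255.0).astype(np.uint8)


def to_comfy_image(frames: np.ndarray):
    """Wrap the converted array as a torch tensor (hypothetical helper)."""
    import torch  # imported lazily so the NumPy helpers work without torch

    return torch.from_numpy(uint8_to_unit_float(frames))
```

The NumPy helpers cover both directions, so uint8 frames can be passed to downstream nodes as float32 and converted back before any uint8-based processing.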

✅ [2024/04/09] raw video to pose video with reference image(aka self-driven)
✅ [2024/04/09] audio driven
✅ [2024/04/09] face reenactment
✅ [2024/04/22] Implemented the audio2pose model and pre-trained weight for audio2video. The face reenactment and audio2video workflows have been modified. Inference currently supports a maximum length of 10 seconds; you can experiment with the length hyperparameter.

You can contact me through Twitter, or on Weixin: GalaticKing

audio driven combined with reference image and reference video

audio2video workflow

Aniportrait_00002-audio.mp4

raw video to pose video with reference image

Aniportrait_00004-audio.mp4

face reenactment

video2video workflow

AnimateDiff_00001-audio.mp4

This is an unofficial implementation of AniPortrait as a ComfyUI custom node. Because I have a regular job, I will update this project when I have time.

Aniportrait_pose2video.json

Audio driven

face reenactment

From ComfyUI's custom_nodes directory, run

git clone https://github.com/frankchieng/ComfyUI_Aniportrait.git

then, inside the cloned ComfyUI_Aniportrait directory, run

pip install -r requirements.txt

Download the pretrained models:

StableDiffusion V1.5

sd-vae-ft-mse

image_encoder

wav2vec2-base-960h

Download the weights:

denoising_unet.pth, reference_unet.pth, pose_guider.pth, motion_module.pth, audio2mesh.pt, audio2pose.pt, film_net_fp16.pt

./pretrained_model/
|-- image_encoder
|   |-- config.json
|   `-- pytorch_model.bin
|-- sd-vae-ft-mse
|   |-- config.json
|   |-- diffusion_pytorch_model.bin
|   `-- diffusion_pytorch_model.safetensors
|-- stable-diffusion-v1-5
|   |-- feature_extractor
|   |   `-- preprocessor_config.json
|   |-- model_index.json
|   |-- unet
|   |   |-- config.json
|   |   `-- diffusion_pytorch_model.bin
|   `-- v1-inference.yaml
|-- wav2vec2-base-960h
|   |-- config.json
|   |-- feature_extractor_config.json
|   |-- preprocessor_config.json
|   |-- pytorch_model.bin
|   |-- README.md
|   |-- special_tokens_map.json
|   |-- tokenizer_config.json
|   `-- vocab.json
|-- audio2mesh.pt
|-- audio2pose.pt
|-- denoising_unet.pth
|-- motion_module.pth
|-- pose_guider.pth
|-- reference_unet.pth
|-- film_net_fp16.pt
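
A quick way to confirm the layout above before starting ComfyUI is to check each path. The following is a minimal sketch, not part of this repository: the function name is hypothetical, the paths are copied from the tree, and README.md is skipped on the assumption that it is documentation only.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_FILES = [
    "image_encoder/config.json",
    "image_encoder/pytorch_model.bin",
    "sd-vae-ft-mse/config.json",
    "sd-vae-ft-mse/diffusion_pytorch_model.bin",
    "sd-vae-ft-mse/diffusion_pytorch_model.safetensors",
    "stable-diffusion-v1-5/feature_extractor/preprocessor_config.json",
    "stable-diffusion-v1-5/model_index.json",
    "stable-diffusion-v1-5/unet/config.json",
    "stable-diffusion-v1-5/unet/diffusion_pytorch_model.bin",
    "stable-diffusion-v1-5/v1-inference.yaml",
    "wav2vec2-base-960h/config.json",
    "wav2vec2-base-960h/feature_extractor_config.json",
    "wav2vec2-base-960h/preprocessor_config.json",
    "wav2vec2-base-960h/pytorch_model.bin",
    "wav2vec2-base-960h/special_tokens_map.json",
    "wav2vec2-base-960h/tokenizer_config.json",
    "wav2vec2-base-960h/vocab.json",
    "audio2mesh.pt",
    "audio2pose.pt",
    "denoising_unet.pth",
    "motion_module.pth",
    "pose_guider.pth",
    "reference_unet.pth",
    "film_net_fp16.pt",
]


def missing_model_files(root) -> list:
    """Return the expected relative paths that are not files under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_model_files("./pretrained_model")` returns an empty list when every file is in place; otherwise it lists what still needs to be downloaded.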

Tips: The intermediate audio file is generated and then deleted. The raw-video-to-pose video with audio and the pose2video mp4 file are written to ComfyUI's output directory. The uploaded mp4 video must be square, for example 512x512; otherwise the result will look wrong.
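
If the source video is not square, one option is to crop it before uploading. The nodes do not do this for you; the helper below is a hypothetical sketch that only computes the largest centered square region for a given frame size, which you could then feed to whatever video tool you use.

```python
def center_square_box(width: int, height: int):
    """Return (left, top, right, bottom) of the largest centered square."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side
```

For a 1920x1080 frame this gives `(420, 0, 1500, 1080)`. The tuple is in the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, and the cropped frames can then be resized to 512x512.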

I've updated diffusers from 0.24.x to 0.26.2. In diffusers/models/embeddings.py, the class PositionNet was renamed to GLIGENTextBoundingboxProjection and CaptionProjection was renamed to PixArtAlphaTextProjection. If you have a lower version of diffusers installed, modify the corresponding Python files, such as src/models/transformer_2d.py, accordingly.
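
As an alternative to editing the files by hand, the rename can be handled with a small lookup that tries the new class name first and falls back to the old one. This is a sketch under that assumption, not how this repository resolves the classes; `resolve_renamed` and `load_embedding_classes` are hypothetical names.

```python
import importlib


def resolve_renamed(module, new_name: str, old_name: str):
    """Return module.<new_name> if it exists, otherwise module.<old_name>."""
    for name in (new_name, old_name):
        attr = getattr(module, name, None)
        if attr is not None:
            return attr
    label = getattr(module, "__name__", repr(module))
    raise ImportError(f"neither {new_name} nor {old_name} found in {label}")


def load_embedding_classes():
    """Look up the two renamed classes in whichever diffusers is installed."""
    embeddings = importlib.import_module("diffusers.models.embeddings")
    position_cls = resolve_renamed(
        embeddings, "GLIGENTextBoundingboxProjection", "PositionNet"
    )
    caption_cls = resolve_renamed(
        embeddings, "PixArtAlphaTextProjection", "CaptionProjection"
    )
    return position_cls, caption_cls
```

Code such as src/models/transformer_2d.py could then use the returned classes instead of importing either name directly, so the same file works on both sides of the rename.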
