Demo video: video_loopback_v2.mp4
The original demo video is from Live3D (thanks to Live3D for their work!).
23/2/25: Compatible with multi-ControlNet. If you want to use multi-ControlNet, separate the paths in reference_frames_directory with !!!
For example: /dir/path1/!!!/dir/path2/!!!/dir/path3/
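As a rough illustration (not the extension's actual parsing code), a !!!-separated value maps to one directory per ControlNet unit like this:

```python
# Illustration only: splitting a "!!!"-separated reference_frames_directory value.
# The extension's real parsing may differ in details.
value = "/dir/path1/!!!/dir/path2/!!!/dir/path3/"
paths = value.split("!!!")
print(paths)  # ['/dir/path1/', '/dir/path2/', '/dir/path3/']
```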
23/2/14: This extension is compatible with sd-webui-controlnet. Make sure Settings -> ControlNet -> Allow other script to control this extension is checked; then they can work together.
This is a Stable Diffusion WebUI extension for video img2img. It improves the stability of the video through loopback and temporal blurring, trying to reduce the flicker that is typical of img2img animations. This extension can be seen as a video version of loopback and superimpose, with some enhanced features. It's quite a simple method, but it is sometimes useful.
The design is intended for character animations and usually needs to be used together with a LoRA or DreamBooth model. The requirements on model quality are relatively high, and the stability of the model itself greatly affects how hard the subsequent parameter tuning will be.
The final video is extremely sensitive to the parameter settings, which require careful adjustment. This is a rather mysterious process; if the parameters are not tuned well, you may be better off using Batch img2img directly :)
Hopefully, open-sourcing this will encourage more people to test different parameter configurations and find the best strategy.
This script allows for higher denoising strength (0.4 or even higher), which makes it possible to obtain good results even when the driving video is relatively blurry. Generally speaking, though, a denoising strength between 0.3 and 0.4 covers most needs.
This technique is well suited to optimizing videos generated by Live3D. Live3D: https://github.com/transpchan/Live3D-v2
In addition to the parameters in the script interface, all parameters in the original WebUI interface also affect the generation, so don't forget to adjust those as well.
This extension has not been tested in detail, so there may be bugs.
The currently known issue is that in the img2img or inpainting interface, you need to make sure there is an image loaded in the WebUI interface before you start processing (just drag and drop any image).
Install it like any other WebUI extension.
You also need to install ffmpeg in advance.
In the img2img tab or inpainting tab, find the script named "Video Loopback".
You need to make sure there is an image loaded in the WebUI interface before you start processing (just drag and drop any image).
The input_directory can be filled with the path of a video file or of a folder containing images.
Each time SD generates an image, the generated image is blended (superimposed) with the original image to form the new original image for the next generation. The intensity of the blend is specified by the superimpose_alpha parameter; when it is 0, the original image is fully preserved. loop_n controls how many times the entire blending process is repeated.
Note that if batch_size and batch_count are not 1, the images generated in a batch are first blended into a single image, which is then blended with the original image. This strategy is referred to as batch blend below. Batch blend can effectively improve stability and significantly reduce the dependence on fix_seed. However, it sometimes produces blurry images, which can be eliminated by reasonably configuring denoising_schedule, batch_count_schedule, and image_post_processing_schedule.
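To make the loop concrete, here is a minimal sketch of the superimpose and batch blend steps using PIL; the sd_img2img call and the helper names are hypothetical, and this is not the extension's actual code:

```python
from PIL import Image

def superimpose(original: Image.Image, generated: Image.Image, alpha: float) -> Image.Image:
    """Blend the generated image onto the original; alpha=0 keeps the original unchanged."""
    return Image.blend(original, generated, alpha)

def batch_blend(batch: list) -> Image.Image:
    """Average a batch of generated images into a single image with equal weights."""
    result = batch[0]
    for i, img in enumerate(batch[1:], start=2):
        result = Image.blend(result, img, 1.0 / i)  # running average
    return result

# One of loop_n iterations, roughly:
#   generated = sd_img2img(original)                  # hypothetical SD call
#   merged = batch_blend(generated)                   # only if batch_size/batch_count > 1
#   original = superimpose(original, merged, superimpose_alpha)
```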
By the way, let's talk about fix_seed: it can effectively improve stability, but it may cause texture sticking and is more likely to produce error frames. Moreover, when using batch blend, fix_seed may actually reduce stability. Even so, fix_seed is still worth trying. If you do try it, it is recommended to select Extra and use subseed to add a certain degree of randomness, which reduces the side effects of fix_seed to some extent. Based on my personal experience, subseed_strength can be set fairly aggressively, for example 0.9.
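Summarized as a hypothetical settings snippet (these are just the values suggested above, set through the script and WebUI seed controls, not an actual code API):

```python
# Hypothetical summary of the suggestion above; these options are set in the
# WebUI / script interface, not through code.
seed_settings = {
    "fix_seed": True,         # reuse the same seed for every frame
    "Extra": True,            # enable the extra seed options to expose subseed
    "subseed_strength": 0.9,  # fairly aggressive, to soften fix_seed's side effects
}
```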
Switch to the inpainting tab and check use_mask to use the inpainting function. Mask blur, Mask mode, Mask content, Inpaint area, Only masked padding, and so on will work as usual.
The mask_directory can be filled with the path of a single image or of a folder. If it is a folder, each mask in it must have the same file name as its corresponding input image. If it is left empty, the alpha channel of the input image will be used as the mask.
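The lookup described above could be sketched like this (a rough illustration with a hypothetical resolve_mask helper, not the extension's actual code):

```python
import os
from typing import Optional
from PIL import Image

def resolve_mask(mask_directory: str, frame_path: str) -> Optional[Image.Image]:
    """Rough sketch of the mask lookup described above (hypothetical helper)."""
    if mask_directory and os.path.isfile(mask_directory):
        # A single image: the same mask is used for every frame.
        return Image.open(mask_directory).convert("L")
    if mask_directory and os.path.isdir(mask_directory):
        # A folder: each mask shares its file name with the corresponding frame.
        return Image.open(os.path.join(mask_directory, os.path.basename(frame_path))).convert("L")
    # Empty: fall back to the frame's alpha channel, if it has one.
    frame = Image.open(frame_path)
    return frame.getchannel("A") if "A" in frame.getbands() else None
```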
In simple terms, each frame is first blended with its time-adjacent frames and then sent to SD. This function can theoretically slightly enhance stability, but it may produce artifacts, and it is difficult to find the optimal setting. If necessary, it can be skipped.
This function is enabled by entering a comma-separated list of decimal numbers in temporal_superimpose_alpha_list. Each number represents a weight, and the number of weights must be odd. The middle weight is the weight of the current frame, the first weight to its left is the weight of the previous frame, the first weight to its right is the weight of the next frame, and so on. The weights do not need to add up to 1; they are normalized automatically. If temporal_superimpose_alpha_list is empty or contains only a single number, this function is disabled.
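A rough numpy/PIL illustration of the weighted blend (a sketch of the idea, not the extension's actual implementation):

```python
import numpy as np
from PIL import Image

def temporal_superimpose(frames, weights):
    """Blend an odd-length window of time-adjacent frames with the given weights.

    The weights are normalized internally, so they do not need to sum to 1.
    (Illustrative sketch only.)
    """
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()  # automatic normalization
    stack = np.stack([np.asarray(f, dtype=np.float64) for f in frames])
    blended = np.tensordot(w, stack, axes=1)  # weighted sum over the time axis
    return Image.fromarray(np.clip(blended, 0, 255).astype(np.uint8))

# Example: temporal_superimpose_alpha_list = 1,2,1 blends the previous, current
# and next frame with effective weights 0.25, 0.5 and 0.25.
```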
In fact, I found video_post_process_method to be more useful than temporal blend.
The schedule settings use Python syntax: enter an expression in which the available variables are image_i, the index of the image currently being processed, and loop_i, the current loop iteration. You can use the contents of the math library directly.
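For example, the following are purely illustrative schedule expressions (the values that actually work well depend heavily on your model):

```python
# denoising_schedule: start higher, then reduce the denoising on later loops
0.4 if loop_i < 2 else 0.3

# batch_count_schedule: use a larger batch count on the first loop only
3 if loop_i == 0 else 1

# Schedules can also vary smoothly over the frames via the math library
0.3 + 0.05 * math.sin(image_i / 10)
```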
In image_post_processing_schedule, you can use the PIL.ImageFilter module to effectively reduce the blur of the image. For example, you can set it to:
lambda img: img.filter(ImageFilter.EDGE_ENHANCE).filter(ImageFilter.SMOOTH) if loop_i in {5,8} else img
where the choice of {5,8} depends on the specific model being used.
Use video_post_process_method to further improve the stability of the video. Currently, only FastDVDnet is supported.
Thanks to the authors of FastDVDnet for their work.
Why FastDVDnet? There is no particular reason; it was a fairly arbitrary choice, maybe because the name contains "fast" :)
I will try other models in the future. If you have any recommendations, please let me know.
The key to the entire process is model training, which places high demands on the model. A high-quality, stable model must be trained first for this to work; the script only serves as a helper.