Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make refiner switchover based on model timesteps instead of sampling steps #14978

Merged
merged 4 commits into from
Mar 2, 2024

Conversation

drhead
Copy link
Contributor

@drhead drhead commented Feb 20, 2024

Description

  • This makes the default behavior of refiner switchover aligned with model timesteps instead of sampling timesteps. This is easier to use, since setting refiner switchover to 0.8 (for a refiner trained on the last 200 timesteps like SDXL's) will now always switch to the refiner as soon as we're sampling from timesteps the refiner was trained on, where previously using img2img or different noise schedules could lead to unexpected model behavior (examples below).
  • I changed this to work off of the existing refiner switchover slider. The new default behavior will be to derive timesteps from sigma (or in DDIM's case take them directly) and use that to determine whether it is time to switch over. Old behavior is supported by a compatibility option.
  • Implements [Feature Request]: Refiner switchover should be controlled by (fraction of) training timesteps and not by fraction of sampling #14970
  • edit: converted to draft while I troubleshoot an off-by-one error There is a bug where model alphas_cumprod changes (compatibility casting option or zero terminal snr) are reverted during the timestep the refiner is applied. This will have to be fixed separately.

Screenshots/videos:

Examples of old behavior in txt2img:
xyz_grid-1367-36463525-(best quality, high quality,_1 5) by strange-fox, solo, anthro, male, rat, manly, clothed, jacket, shirt, detailed background, n
The refiner model used here was trained for the last 200 timesteps. The Karras schedule type, especially on zero snr, drastically changes the model timesteps called during this 50 step sampling process, which results in the refiner being switched to too early on what is actually the correct setting on the default noise schedule. The effectively correct setting for Karras samplers is 0.88 for this refiner under the old configuration.

Now for the fixed version:
xyz_grid-1368-36463525-(best quality, high quality,_1 5) by strange-fox, solo, anthro, male, rat, manly, clothed, jacket, shirt, detailed background, n
With this fix, the behavior of the refiner is consistent with the same settings across different schedules, and it no longer triggers too early. 0.8 is reliably a correct setting.

Examples of old behavior in img2img/inpainting (inpainting mask is over the head, adding a hat, 0.75 denoising strength):
xyz_grid-0028-691039214-(best quality, high quality,_1 5) by strange-fox, solo, anthro, male, rat, manly, clothed, jacket, shirt, hat, detailed backgrou
This one is more complicated, and the differences are subtle. The effectively correct settings for the normal schedule is to switch over at 0.75, and for Karras it is correct to switch over at 0.85. Using the expected setting of 0.8 therefore is too late for normal schedules and too early for Karras ones. As denoising strength gets lower, the problem becomes more severe.
xyz_grid-0029-691039214-(best quality, high quality,_1 5) by strange-fox, solo, anthro, male, rat, manly, clothed, jacket, shirt, hat, detailed backgrou
This grid shows the behavior after the fix. Switch at 0.8 is now correct for both.

Checklist:

@drhead drhead marked this pull request as draft February 20, 2024 22:46
@drhead drhead marked this pull request as ready for review February 20, 2024 23:04
@drhead
Copy link
Contributor Author

drhead commented Feb 21, 2024

I will note that there are two remaining cases that are handled poorly by the refiner:

  1. DPM Adaptive -- Will switch to the refiner on the first timestep under the threshold, but can then sample a higher timestep, producing incorrect output. Previously, nothing was able to trigger refiner switchover since DPM Adaptive has no definite step size. Can be mitigated by choosing a slightly higher trigger point or by not using the refiner.
  2. Restart -- Refiner will be switched to the first time timesteps fall under the threshold (around the halfway point of sampling), which will then go over the threshold, then below it again, then above it, then below it once more.

There is no proper way to handle these cases without having the refiner switch on and off multiple times during sampling. Neither particularly worked well with refiners to begin with, and Restart is the only one that could be argued to be a regression. I can look into ways to get Restart to only switch to the refiner on the last "restart", which would have Restart working as well as it could have before, but I don't think that robust handling of both of these cases would have benefits that justify the extra complexity required.

@AUTOMATIC1111 AUTOMATIC1111 merged commit aabedcb into AUTOMATIC1111:dev Mar 2, 2024
3 checks passed
AUTOMATIC1111 added a commit that referenced this pull request Mar 2, 2024
ruchej pushed a commit to ruchej/stable-diffusion-webui that referenced this pull request Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants