Make refiner switchover based on model timesteps instead of sampling steps #14978
Description
edit: converted to draft while I troubleshoot an off-by-one error

There is a bug where model alphas_cumprod changes (compatibility casting option or zero terminal snr) are reverted during the timestep the refiner is applied. This will have to be fixed separately.

Screenshots/videos:
Examples of old behavior in txt2img:
The refiner model used here was trained for the last 200 timesteps. The Karras schedule type, especially on zero snr, drastically changes which model timesteps are called during this 50 step sampling process. As a result, the setting that is actually correct on the default noise schedule switches to the refiner too early. Under the old configuration, the effectively correct setting for Karras samplers with this refiner is 0.88.
Now for the fixed version:
With this fix, the behavior of the refiner is consistent with the same settings across different schedules, and it no longer triggers too early. 0.8 is reliably a correct setting.
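The difference between switching on sampling steps and switching on model timesteps can be sketched roughly as below. This is only an illustration, not the code in this PR: the function names are hypothetical, the beta values are an assumed Stable Diffusion style scaled-linear schedule with 1000 timesteps, and the sigma-to-timestep mapping is a simple nearest match. The exact numbers depend on the model's noise schedule (for example, zero terminal SNR changes the sigmas), so they are not meant to reproduce the grids shown here.

```python
import math

NUM_TIMESTEPS = 1000  # assumed number of model timesteps


def model_sigmas(beta_start=0.00085, beta_end=0.012, n=NUM_TIMESTEPS):
    """Sigma per model timestep from an assumed scaled-linear beta schedule."""
    sigmas, acp = [], 1.0
    for t in range(n):
        sqrt_beta = math.sqrt(beta_start) + (math.sqrt(beta_end) - math.sqrt(beta_start)) * t / (n - 1)
        acp *= 1.0 - sqrt_beta ** 2  # running alphas_cumprod
        sigmas.append(math.sqrt((1.0 - acp) / acp))
    return sigmas


def sigma_to_timestep(sigma, sigmas):
    """Nearest model timestep for a sigma, compared in log space."""
    return min(range(len(sigmas)), key=lambda t: abs(math.log(sigmas[t]) - math.log(sigma)))


def uniform_timesteps(steps, sigmas):
    """Model timestep visited at each sampling step when timesteps are evenly spaced."""
    last = len(sigmas) - 1
    return [round(last * (1 - i / (steps - 1))) for i in range(steps)]


def karras_timesteps(steps, sigmas, rho=7.0):
    """Model timestep visited at each sampling step on a Karras-style sigma ramp."""
    lo, hi = sigmas[0] ** (1 / rho), sigmas[-1] ** (1 / rho)
    return [sigma_to_timestep((hi + i / (steps - 1) * (lo - hi)) ** rho, sigmas) for i in range(steps)]


def switch_step_by_steps(steps, switch_at):
    """Step-based rule: switch once the given fraction of sampling steps is done."""
    return int(switch_at * steps)


def switch_step_by_timestep(timesteps, switch_at, n=NUM_TIMESTEPS):
    """Timestep-based rule: switch at the first step whose model timestep is within the last (1 - switch_at) of n."""
    threshold = round(n * (1 - switch_at))
    for i, t in enumerate(timesteps):
        if t <= threshold:
            return i
    return None


if __name__ == "__main__":
    sigmas = model_sigmas()
    for name, ts in (("uniform", uniform_timesteps(50, sigmas)), ("karras", karras_timesteps(50, sigmas))):
        by_steps = switch_step_by_steps(50, 0.8)
        by_t = switch_step_by_timestep(ts, 0.8)
        print(f"{name}: step rule -> step {by_steps} (timestep {ts[by_steps]}), "
              f"timestep rule -> step {by_t} (timestep {ts[by_t]})")
```

With the step-based rule, the switch always lands on the same sampling step, so the model timestep at which the refiner takes over varies with the schedule. With the timestep-based rule, the sampling step varies but the refiner always takes over at the same point in the model's timestep range.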
Examples of old behavior in img2img/inpainting (inpainting mask is over the head, adding a hat, 0.75 denoising strength):
This one is more complicated, and the differences are subtle. The effectively correct setting is to switch over at 0.75 for the normal schedule and at 0.85 for Karras. Using the expected setting of 0.8 is therefore too late for normal schedules and too early for Karras ones. As denoising strength gets lower, the problem becomes more severe.
This grid shows the behavior after the fix. Switch at 0.8 is now correct for both.
Checklist: