[WIP] Allow DDIMInverseScheduler to use same number of noising and denoising steps #3436
Hmm @clarencechen wdyt here? It essentially reverts the PR here: #2619
@patrickvonplaten I wouldn't say it reverts #2619, since the
i have uploaded
Left some comments inline
each diffusion step uses the value of alphas product at that step and at the previous one. For the final
step there is no previous alpha. When this option is `True` the previous alpha product is fixed to `0`,
otherwise it uses the value of alpha at step `num_train_timesteps - 1`.
set_alpha_to_one (`bool`, default `True`):
Why do we have to do the renaming here? This is backwards breaking IMO
steps_offset: int = 0,
prediction_type: str = "epsilon",
clip_sample_range: float = 1.0,
revert_all_steps: bool = False,
is this ever used?
For the final DDIMScheduler step, we are already at t=0 and are predicting the sample at t=0-num_train_steps/num_inference_steps.
The current DDIMInverseScheduler implementation starts with the original image at t=0. But to match DDIMScheduler, one needs to start at t=0-num_train_steps/num_inference_steps in order to revert all denoising steps.
This can also be seen by the fact that there are num_inference_steps denoising steps but only num_inference_steps-1 noising steps in StableDiffusionPix2PixZeroPipeline and StableDiffusionDiffEditPipeline (those two pipelines are currently using DDIMInverseScheduler).
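The step count can be illustrated with a minimal, library-free sketch. It assumes evenly spaced timesteps with `steps_offset=0` and an integer step size of `num_train_steps // num_inference_steps`; the function names are hypothetical and are not part of diffusers, and the real pipelines may build their timestep lists differently.

```python
def denoising_transitions(num_train_steps: int, num_inference_steps: int) -> list[tuple[int, int]]:
    """(t, t_prev) pairs visited while denoising, from most noisy to least noisy."""
    step = num_train_steps // num_inference_steps
    timesteps = [i * step for i in range(num_inference_steps)][::-1]
    return [(t, t - step) for t in timesteps]


def noising_transitions_from_zero(num_train_steps: int, num_inference_steps: int) -> list[tuple[int, int]]:
    """(t, t_next) pairs when inversion starts from the image at t=0 (assumed current behaviour)."""
    step = num_train_steps // num_inference_steps
    timesteps = [i * step for i in range(num_inference_steps)]
    # The last timestep has no successor inside the schedule, so it yields no transition.
    return [(t, t + step) for t in timesteps[:-1]]


def noising_transitions_all_steps(num_train_steps: int, num_inference_steps: int) -> list[tuple[int, int]]:
    """(t, t_next) pairs when inversion mirrors every denoising transition, starting at t=-step."""
    forward = denoising_transitions(num_train_steps, num_inference_steps)
    return [(t_prev, t) for (t, t_prev) in reversed(forward)]


if __name__ == "__main__":
    denoise = denoising_transitions(1000, 50)
    print(len(denoise), denoise[-1])  # number of denoising steps and the final transition
    print(len(noising_transitions_from_zero(1000, 50)))  # one transition fewer
    print(len(noising_transitions_all_steps(1000, 50)))  # same count as denoising
```

With 1000 training steps and 50 inference steps, the final denoising transition in this sketch goes from t=0 to t=-20, so an inversion that begins at t=0 can mirror only 49 of the 50 transitions, while one that begins at t=-20 mirrors all of them.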
In contrast to the current DDIMInverseScheduler implementation, the inversion in "Null-text Inversion for Editing Real Images using Guided Diffusion Models" starts at t=0-num_train_steps/num_inference_steps and reverses all denoising steps.
The description of DDIMInverseScheduler states that "The implementation is mostly based on the DDIM inversion definition of Null-text Inversion for Editing Real Images using Guided Diffusion Models", which is currently not quite the case.
With revert_all_steps=True, this PR makes it possible to use DDIM inversion as implemented in Null-text Inversion for Editing Real Images using Guided Diffusion Models.
This PR is strongly influenced by https://github.com/google/prompt-to-prompt/#null-text-inversion-for-editing-real-images.