
Separating legacy inpaint pipeline #920

Closed
juno-hwang opened this issue Oct 20, 2022 · 26 comments
Labels
stale Issues that haven't received updates

Comments

@juno-hwang

juno-hwang commented Oct 20, 2022

The new fine-tuned inpaint pipeline in the 0.6.0 release is very good, but its UNet architecture is different from the original pipeline's.
I think the inpaint pipeline from the 0.5.1 release has some advantages, such as sharing the same model with txt2img on the GPU, or inheriting a DreamBooth UNet trained on a txt2img model.
I actually often use DreamBooth models for inpainting, and that's a pretty big advantage.
So, how about adding a legacy inpaint pipeline so that people can still use a fine-tuned UNet to inpaint?

Also, I suggest adding a strength parameter to the new inpaint pipeline.
It could be implemented by fixing the mask input and diffusing only the image channels, just like the img2img pipeline.
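A rough sketch of what I mean (the function name is illustrative, but the arithmetic mirrors how the img2img pipeline maps strength to a starting step):

```python
# Illustrative sketch: how `strength` could pick the starting denoising step,
# the same way the img2img pipeline does it.
def get_start_step(num_inference_steps: int, strength: float) -> int:
    # strength=1.0 -> start from step 0 (almost pure noise, like txt2img);
    # strength=0.2 -> skip 80% of the steps and keep most of the image.
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    return max(num_inference_steps - init_timestep, 0)

print(get_start_step(50, 0.8))  # 10
```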

@patrickvonplaten
Contributor

Hey @juno-hwang,

We still support the old "inpaint" pipeline; it has just been moved to "Legacy" now:

class StableDiffusionInpaintPipelineLegacy(DiffusionPipeline):

Happy to not deprecate it if you want :-)
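Loading it looks roughly like this (a sketch, untested; argument names such as init_image vs image have changed between diffusers releases, so check the version you are on):

```python
# Sketch: legacy inpainting with the original txt2img weights, so a
# DreamBooth-fine-tuned UNet can be reused. Untested; API details vary
# between diffusers versions.
from diffusers import StableDiffusionInpaintPipelineLegacy

pipe = StableDiffusionInpaintPipelineLegacy.from_pretrained(
    "CompVis/stable-diffusion-v1-4"
).to("cuda")

# result = pipe(prompt="a red sofa", init_image=init, mask_image=mask, strength=0.75)
```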

@MartiGrau

Hello @patrickvonplaten ,

I suppose you're referring to using the StableDiffusionInpaintPipelineLegacy class with the old weights (CompVis/stable-diffusion-v1-4), because if you try the new inpaint pipeline (0.6.0 release) with the new weights (runwayml/stable-diffusion-inpainting) you get the following error:

RuntimeError: Given groups=1, weight of size [320, 9, 3, 3], expected input[2, 4, 64, 64] to have 9 channels, but got 4 channels instead

Meaning that the architecture has changed... So in the end, as @juno-hwang suggested, it would be great to add a strength parameter to the new inpaint pipeline with the new weights (runwayml/stable-diffusion-inpainting).
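To make the mismatch concrete, here is a shapes-only sketch (numpy stand-in) of what the new 9-channel inpainting UNet expects versus the 4 channels a txt2img-style pipeline feeds it:

```python
import numpy as np

# The runwayml inpainting UNet concatenates three tensors on the channel axis:
latents = np.zeros((2, 4, 64, 64))               # noisy image latents (4 ch)
mask = np.zeros((2, 1, 64, 64))                  # downsampled inpaint mask (1 ch)
masked_image_latents = np.zeros((2, 4, 64, 64))  # VAE-encoded masked image (4 ch)

unet_input = np.concatenate([latents, mask, masked_image_latents], axis=1)
print(unet_input.shape)  # (2, 9, 64, 64) -- a 4-channel input triggers the error above
```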

@patrickvonplaten
Contributor

Hey @MartiGrau,

Currently we don't have enough time to do the PR ourselves, but if you think it makes sense to add a strength parameter, I'm more than happy to review a PR :-)

@ulmewennberg

ulmewennberg commented Oct 26, 2022

Hey @patrickvonplaten 👋

Are there currently any plans to also support Dreambooth training (https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth.py) with the new inpainting pipeline (https://huggingface.co/runwayml/stable-diffusion-inpainting)?

If not, I would be happy to try to implement this if this makes sense.

@patrickvonplaten
Contributor

Think this could definitely make sense @patil-suraj what do you think? :-)

@patil-suraj
Contributor

Hey @ulmewennberg, that would be awesome! Feel free to work on the PR and let us know if you have any questions, happy to help :)

@MartiGrau, the strength argument is not really needed here. It's used in the legacy inpainting pipeline because that pipeline is based on the image2image example, where we provide an initial image and the strength argument decides how much noise to add to it (which also controls the number of inference timesteps). That's not the case here: we don't use the image as an init_image. The image is converted into latents and passed directly to the model along with the mask.

@juno-hwang
Author

Hey @patil-suraj, as you said, strength is not strictly necessary, but I think introducing it could allow us to re-paint the given region while keeping some consistency, just like the legacy inpainting pipeline does.
[image: comparison of txt2img and img2img outputs]
Also, I found an interesting phenomenon in the img2img pipeline: theoretically, strength=1 should be equivalent to txt2img, because it means the original image is fully diffused into a pure Gaussian distribution $x_T$; but for some reason it is different and keeps some of the original image's information. The figure above shows the difference between txt2img and img2img with strength=1, started from a plain white image. I guess it's due to an improper index shift in the scheduler.
Furthermore, the legacy inpainting pipeline is not good at seamless inpainting at high strength. For this reason, a new inpainting pipeline with a proper strength implementation would be very useful for image editing.
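One possible contributor to the strength=1 discrepancy, sketched here under an assumed linear beta schedule (the actual scheduler's values will differ): even at the final timestep, the forward process keeps a nonzero weight on the original image.

```python
import numpy as np

# q(x_T | x_0) = N(sqrt(a_T) * x_0, (1 - a_T) I), with a_T = prod(1 - beta_t).
# Even at t = T, a_T is small but nonzero, so x_T retains a trace of x_0.
betas = np.linspace(1e-4, 0.02, 1000)     # assumed linear DDPM-style schedule
alphas_cumprod = np.cumprod(1.0 - betas)
x0_weight = np.sqrt(alphas_cumprod[-1])   # weight of the original image at t = T
print(x0_weight > 0)  # True: img2img at strength=1 is not exactly txt2img
```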

@patil-suraj
Contributor

The figure above shows the difference between txt2img and img2img with strength=1, started from a plain white image. I guess it's due to an improper index shift in the scheduler.

I see, thanks for reporting, will take a look at this.

Also, as I said above, we don't use an init_image for inpainting; we don't add any noise to the image and mask we pass in, so it's not possible to add strength here.

However, to keep consistency, what I could suggest is:
get the inpainted image from the pipeline and then replace the non-masked parts of the image with the original image. That should achieve what you're looking for.
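In rough code, that post-processing step could look like this (numpy sketch; the mask convention assumed here is 1.0 = region the pipeline painted):

```python
import numpy as np

def paste_unmasked(original, inpainted, mask):
    # Keep the model's output only inside the mask; restore original pixels elsewhere.
    # mask: float array in [0, 1], 1.0 where the pipeline painted, 0.0 to keep original.
    return mask * inpainted + (1.0 - mask) * original

original = np.full((2, 2), 10.0)
inpainted = np.full((2, 2), 99.0)
mask = np.array([[1.0, 0.0], [0.0, 0.0]])
print(paste_unmasked(original, inpainted, mask))
# [[99. 10.]
#  [10. 10.]]
```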

@ulmewennberg

Awesome, @patil-suraj and @patrickvonplaten! Some questions so far on the implementation:

"For inpainting, the UNet has 5 additional input channels (4 for the encoded masked-image and 1 for the mask itself)"

  1. Should I add noise to all 9 input channels? Or perhaps not to the mask?
  2. The output has only 4 channels. How do I compute the loss between a 4-channel noise_pred and a 9-channel input?
    Should I blend the two four-channel tensors using the mask in some way, or for example just take the first 4 channels?

@thedarkzeno
Contributor

@ulmewennberg I'm also working on this myself; from the inpainting pipeline, it seems noise is only applied to the image latents.

@thedarkzeno
Contributor

I made a pull request here, but it's not passing the tests. Can someone guide me on fixing it?
It's my first contribution, so I'm not sure how to do it right.
@ulmewennberg @patil-suraj

@patil-suraj
Contributor

Hey @ulmewennberg !

Should I add noise to all the 9 input channels? Or perhaps not add it to the mask?
The noise is added to the target image that we want the model to generate.

The output has only 4 channels. How do I compute the loss between a 4-dimensional noise_pred and a 9-dimensional input?

We add noise to the target encoded image, which has 4 channels, and the loss is simply computed against that noise.

Hope this makes it clear.
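In rough code, one training step then looks like this (shapes-only numpy stand-in, with the scheduler's noise weighting omitted; the UNet call is mocked):

```python
import numpy as np

rng = np.random.default_rng(0)

latents = rng.normal(size=(1, 4, 64, 64))               # VAE-encoded target image
noise = rng.normal(size=latents.shape)                  # noise the UNet must predict
noisy_latents = latents + noise                         # (scheduler weighting omitted)

mask = np.ones((1, 1, 64, 64))                          # inpaint mask, 1 ch
masked_image_latents = rng.normal(size=(1, 4, 64, 64))  # encoded masked image, 4 ch

# 9-channel UNet input: only the target latents are noised
unet_input = np.concatenate([noisy_latents, mask, masked_image_latents], axis=1)

noise_pred = rng.normal(size=(1, 4, 64, 64))            # stand-in for unet(unet_input, t)
loss = np.mean((noise_pred - noise) ** 2)               # MSE against the 4-channel noise
print(unet_input.shape)  # (1, 9, 64, 64)
```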

@patil-suraj
Contributor

Thanks for the PR @thedarkzeno ! will take a look.

@patil-suraj
Contributor

Hey @thedarkzeno and @ulmewennberg, actually I'm not sure I understand this. What's the goal of doing DreamBooth with inpainting? How would it work?

@ulmewennberg

@patil-suraj the goal is to be able to, for example, perform image inpainting using custom concepts.

@patil-suraj
Contributor

patil-suraj commented Nov 2, 2022

I see, so you mean inpainting the custom concepts into the masked part?

@ulmewennberg

@patil-suraj yep, this is correct

@Mystfit
Contributor

Mystfit commented Nov 7, 2022

Hello @patrickvonplaten ,

I suppose you're referring to using the StableDiffusionInpaintPipelineLegacy class with the old weights (CompVis/stable-diffusion-v1-4), because if you try the new inpaint pipeline (0.6.0 release) with the new weights (runwayml/stable-diffusion-inpainting) you get the following error:

RuntimeError: Given groups=1, weight of size [320, 9, 3, 3], expected input[2, 4, 64, 64] to have 9 channels, but got 4 channels instead

Meaning that the architecture has changed... So in the end, as @juno-hwang suggested, it would be great to add a strength parameter to the new inpaint pipeline with the new weights (runwayml/stable-diffusion-inpainting).

I'm also receiving the exact same error using the latest 0.7.2 diffusers library, StableDiffusionInpaintPipeline, and the latest runwayml/stable-diffusion-inpainting model. Is there anything I have to change on my end to fix this?

@patil-suraj
Contributor

@Mystfit

The StableDiffusionInpaintPipeline should work with the runwayml/stable-diffusion-inpainting model. Could you please post a code snippet so we can take a look?

@Mystfit
Contributor

Mystfit commented Nov 8, 2022

@Mystfit

The StableDiffusionInpaintPipeline should work with the runwayml/stable-diffusion-inpainting model. Could you please post a code snippet so we can take a look?

I think I've tracked down the cause of the error in my case. I'm using the StableDiffusionLongPromptWeightingPipeline (lpw_stable_diffusion) custom pipeline to add prompt-weighting support, and it doesn't seem to play well with the inpainting model or pipeline. When I remove the custom pipeline, the error goes away.

@alicranck

@thedarkzeno I'm currently experimenting with your code a bit. Did you manage to get good results with it? Any chance you can share the flags you used?

@opetliak

@thedarkzeno what did you use for pretrained_model_name_or_path ("runwayml/stable-diffusion-inpainting"?).
I'm trying to launch the code, but it looks like it fails in with_prior_preservation mode.

@thedarkzeno
Contributor

thedarkzeno commented Nov 18, 2022

Hi everyone, I haven't tested it extensively, but I got some interesting results, like this one:
[image: inpainted toy cat result]
It's still not perfect, but I'll see what I can do to improve it.
@opetliak Yes, I used runwayml/stable-diffusion-inpainting. Prior preservation is broken, but I'll try to fix it.

@patrickvonplaten
Contributor

Very cool!

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Dec 15, 2022
PhaneeshB pushed a commit to nod-ai/diffusers that referenced this issue Mar 1, 2023
@balasuh

balasuh commented Aug 18, 2023

Hey @juno-hwang,

We still support the old "inpaint" pipeline; it has just been moved to "Legacy" now:

class StableDiffusionInpaintPipelineLegacy(DiffusionPipeline):

Happy to not deprecate it if you want :-)

Hi @patrickvonplaten,

I see that the legacy pipeline is being deprecated. I work regularly with older models and find it very useful. Would it be possible to NOT deprecate this pipeline?

If you absolutely have to, what are my options to keep using this pipeline?
