
Higher resolution cc12m_1_cfg model #10

Open
njbbaer opened this issue Jan 10, 2022 · 5 comments
njbbaer commented Jan 10, 2022

First off, I'd like to say the new cc12m_1_cfg model is amazing and thank you for the work you're doing.

Are there any plans to release a 512x512 version of it? I know it's possible to output images at any size, but it's clear they look best at the native 256x256 resolution. While the results are sometimes very beautiful in their own way, higher resolutions tend to repeat patterns, and multiple generations of the same prompt don't look as distinct from one another.

crowsonkb (Owner) commented

I would need to heavily filter the dataset to exclude images that are smaller than 512px on the short edge, so probably not. However I am thinking about trying for 512x512 or larger with a LAION model later, because the dataset is so huge that filtering will still leave me with a sufficiently large dataset.


njbbaer commented Jan 10, 2022

Good to know. I hope you do!

I wonder if it would be possible to start the model at 256x256 and run the output through a second pass at a higher resolution.


njbbaer commented Jan 14, 2022

@crowsonkb I'm getting good results with this. I run the model once, then upscale the output and feed it into reverse_sample to run backwards, and finally take that result and run forwards again with the same prompt. The image comes out looking slightly different, but higher quality, and it preserves the major features of the original.
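
(For reference, a minimal sketch of that pipeline, assuming the sample()/reverse_sample() helpers and module layout of v-diffusion-pytorch; exact signatures may differ between versions, and model_fn, clip_embed, and steps are placeholders for the CFG-wrapped cc12m_1_cfg model, the prompt's CLIP text embedding, and the timestep schedule.)

```python
import torch
import torch.nn.functional as F
from diffusion import sampling  # assumed v-diffusion-pytorch layout

def upscale_by_resampling(model_fn, clip_embed, steps, device='cuda'):
    """Sketch of the sample -> upscale -> reverse_sample -> sample loop."""
    extra_args = {'clip_embed': clip_embed}

    # 1. First pass: sample a 256x256 image (the model's native resolution).
    x = torch.randn([1, 3, 256, 256], device=device)
    out_256 = sampling.sample(model_fn, x, steps, eta=0., extra_args=extra_args)

    # 2. Upscale to 512x512; bilinear here, but the interpolation mode is a knob.
    up = F.interpolate(out_256, scale_factor=2, mode='bilinear', align_corners=False)

    # 3. Run sampling backwards to recover a noise latent for the upscaled image.
    #    (Depending on the version, the schedule may need to be flipped here.)
    noise = sampling.reverse_sample(model_fn, up, steps, extra_args=extra_args)

    # 4. Run forwards again with the same prompt to get the final 512x512 image.
    return sampling.sample(model_fn, noise, steps, eta=0., extra_args=extra_args)
```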

crowsonkb (Owner) commented

Ohh. I have been experimenting with scaling up then re-noising the image and doing forward sampling starting from there (i.e. using it as an init image) and that has been working for me. I'm surprised reverse then forward sampling isn't preserving the upscale blur/artifacts though... are you doing unconditional reverse sampling then forward using a text condition, or some such?
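
(A rough sketch of that re-noising/init-image idea, assuming v-diffusion-pytorch's utils.t_to_alpha_sigma() and sampling.sample(); the names and signatures are assumptions, not the exact code used here.)

```python
import torch
import torch.nn.functional as F
from diffusion import sampling, utils  # assumed v-diffusion-pytorch layout

def upscale_by_renoising(model_fn, clip_embed, out_256, steps, start_t=0.8, device='cuda'):
    """Sketch: upscale, re-noise to an intermediate timestep, sample forward."""
    # Upscale the 256x256 output to the target resolution.
    init = F.interpolate(out_256, scale_factor=2, mode='bilinear', align_corners=False)

    # Keep only the part of the schedule at or below the chosen starting timestep.
    steps = steps[steps <= start_t]

    # Re-noise the init image at the starting timestep:
    # x_t = alpha(t) * init + sigma(t) * noise.
    alpha, sigma = utils.t_to_alpha_sigma(steps[0])
    x = alpha * init + sigma * torch.randn_like(init)

    # Sample forward from start_t down to 0 with the same text conditioning.
    return sampling.sample(model_fn, x, steps, eta=0., extra_args={'clip_embed': clip_embed})
```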


njbbaer commented Jan 15, 2022

> Ohh. I have been experimenting with scaling up then re-noising the image and doing forward sampling starting from there (i.e. using it as an init image) and that has been working for me.

Does that work? I tried something like it at first, but the images were either too blurry or too dissimilar from the original. I might have done something wrong though. Can you share how you're re-noising the image?

> I'm surprised reverse then forward sampling isn't preserving the upscale blur/artifacts though... are you doing unconditional reverse sampling then forward using a text condition, or some such?

Yeah, that's exactly what I've been doing. max_timesteps balances blurriness against losing detail from the original image. The example below was done with max_timesteps=0.8. It doesn't always work, though, and some images come out looking worse.

[Image: bilinear interpolation (original)]

[Image: diffusion upscaled, max_timesteps=0.8]
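
(Reading the two sketches above together, max_timesteps amounts to truncating the schedule before the reverse and forward passes; how it is actually wired into njbbaer's scripts is an assumption. The snippet below reuses the names from the sketch after the Jan 14 comment.)

```python
# Hypothetical: with max_timesteps = 0.8, only timesteps t <= 0.8 are used, so
# the upscaled image is never fully re-noised. Lower values preserve more of
# the upscale (including its blur); higher values let the model re-generate
# more detail but drift further from the original composition.
max_timesteps = 0.8
short_steps = steps[steps <= max_timesteps]
noise = sampling.reverse_sample(model_fn, up, short_steps, extra_args=extra_args)
out_512 = sampling.sample(model_fn, noise, short_steps, eta=0., extra_args=extra_args)
```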