
Is there anyone who can't generate images correctly? #122

Open
3 of 4 tasks
Cyberhan123 opened this issue Dec 23, 2023 · 20 comments
Comments

@Cyberhan123
Contributor

Cyberhan123 commented Dec 23, 2023

These are existing problems I have seen. Some have been solved and some are strange. If you encounter one, please be patient: click through and check the comments. If you run into something similar, please leave a message below.

Related issues:

@FSSRepo
Contributor

FSSRepo commented Dec 28, 2023

@leejet Is it my impression, or is the CUDA backend experiencing synchronization issues, even in the CLIP model? It tends to happen sometimes.

build\bin\Release\sd -m models/kotosmix_v10-f16.gguf -p "beautiful anime girl, white hair, blue eyes, realistic, masterpiece, azur lane, 4k, high quality" -n "bad quality, ugly, face malformed, bad anatomy" --sampling-method dpm++2m --steps 20 -s 424354

With the CPU backend (and sometimes the CUDA backend):

output

Incorrect image, possibly because an incorrect (incomplete) embedding is generated; I don't really know. The negative embedding seems invalid.

output

Investigating this synchronization issue is very challenging; it occurs sporadically and isn't easy to reproduce. I tried printing the output tensor of CLIP, and after 10 repetitions I identified a change in the embedding values.
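One way to pin down this kind of nondeterminism (a Python sketch, not part of sd.cpp; it assumes you can dump the raw CLIP output tensor to bytes after each run) is to hash each dump and compare the hashes across runs:

```python
import hashlib

def tensor_digest(raw: bytes) -> str:
    """Hash the raw bytes of a dumped tensor."""
    return hashlib.sha256(raw).hexdigest()

def check_determinism(dumps) -> bool:
    """Return True if every dumped tensor is byte-identical to the first."""
    digests = {tensor_digest(d) for d in dumps}
    return len(digests) == 1

# Simulated runs: ten identical dumps are deterministic...
runs = [b"\x00\x01\x02\x03"] * 10
print(check_determinism(runs))  # True
# ...while a single flipped byte (a racy run) is caught immediately.
runs[7] = b"\x00\x01\x02\x07"
print(check_determinism(runs))  # False
```

Hashing sidesteps manual tensor inspection: any divergence, however small, changes the digest.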

@diimdeep

diimdeep commented Dec 28, 2023

On a Google Colab T4 (CUDA), in img2img mode the VAE without --vae-tiling always produces a solid-color image.
Update: also seeing this for txt2img on the T4 with CUDA.

!git clone https://github.com/leejet/stable-diffusion.cpp
%cd stable-diffusion.cpp
!git submodule update --init
!cmake -B build -DSD_CUBLAS=ON && cmake --build build --config Release
!mkdir output models

# https://civitai.com/models/133005?modelVersionId=240840
!wget "https://civitai.com/api/download/models/240840?type=Model&format=SafeTensor&size=full&fp=fp16" -O models/juggernautXL_v7Rundiffusion.safetensors
!wget "https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/resolve/main/sdxl_vae.safetensors?download=true" -O models/sdxl_vae-fp16-fix.safetensors
!wget "https://huggingface.co/madebyollin/taesdxl/resolve/main/diffusion_pytorch_model.safetensors?download=true" -O "models/diffusion_pytorch_model.safetensors"

!clear

!wget "https://upload.wikimedia.org/wikipedia/commons/thumb/c/ce/Improbable_neon_-_--_-_-generative_-code_-processing_-geometry_-algorithmicart_-xuxoe_-procedural_-everyday_-computerart_-3d_-daily_-improbable_-streamofconsciousness_-bnw_-surreal_-abstract_-colors_-blackandwhite_%2826697665157%29.jpg/640px-thumbnail.jpg" -O "output/blob.png"

!./build/bin/sd -m models/juggernautXL_v7Rundiffusion.safetensors \
--vae models/sdxl_vae-fp16-fix.safetensors \
-p "((a lovely cat)), Gorgeous, Magnetic, a palette of warm and vivid colors, Cozy Pastels" \
-n "3d render, 3dcg, abhorrent, abominable, anatomical nonsense, asymmetrical, awful, awkward, b&w" \
-s 77 --sampling-method euler_a --cfg-scale 6.1 --steps 16 \
-o output/tiling.png -M img2img -i output/blob.png --strength 0.77 --vae-tiling 


!./build/bin/sd -m models/juggernautXL_v7Rundiffusion.safetensors \
--vae models/sdxl_vae-fp16-fix.safetensors \
-p "((a lovely cat)), Gorgeous, Magnetic, a palette of warm and vivid colors, Cozy Pastels" \
-n "3d render, 3dcg, abhorrent, abominable, anatomical nonsense, asymmetrical, awful, awkward, b&w" \
-s 77 --sampling-method euler_a --cfg-scale 6.1 --steps 16 \
-o output/vae.png -M img2img -i output/blob.png --strength 0.77


!./build/bin/sd -m models/juggernautXL_v7Rundiffusion.safetensors \
--taesd models/diffusion_pytorch_model.safetensors \
-p "((a lovely cat)), Gorgeous, Magnetic, a palette of warm and vivid colors, Cozy Pastels" \
-n "3d render, 3dcg, abhorrent, abominable, anatomical nonsense, asymmetrical, awful, awkward, b&w" \
-s 77 --sampling-method euler_a --cfg-scale 6.1 --steps 16 \
-o output/taesd.png -M img2img -i output/blob.png --strength 0.77

from IPython.display import Image, display
display(Image(filename='output/tiling.png'))
display(Image(filename='output/vae.png'))
display(Image(filename='output/taesd.png'))

image
image
image

@Cyberhan123
Contributor Author

Cyberhan123 commented Dec 30, 2023

@leejet It's my impression or it seems that the CUDA backend is experiencing synchronization issues even from the CLIP model; it tends to happen sometimes. […]

@FSSRepo Please try Colab.
I think I found the real reason. The earlier problem occurred because there is a read/write race in soft_max_f32. For details, see the racecheck tool documentation: https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html#racecheck-tool

========= COMPUTE-SANITIZER
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: Tesla T4, compute capability 7.5
[INFO]  stable-diffusion.cpp:5386 - loading model from 'v1-5-pruned-emaonly.safetensors'
[INFO]  model.cpp:638  - load v1-5-pruned-emaonly.safetensors using safetensors format
[INFO]  stable-diffusion.cpp:5412 - Stable Diffusion 1.x 
[INFO]  stable-diffusion.cpp:5418 - Stable Diffusion weight type: f32
[INFO]  stable-diffusion.cpp:5573 - total memory buffer size = 2731.37MB (clip 470.66MB, unet 2165.24MB, vae 95.47MB)
[INFO]  stable-diffusion.cpp:5579 - loading model from 'v1-5-pruned-emaonly.safetensors' completed, taking 2.45s
[INFO]  stable-diffusion.cpp:5593 - running in eps-prediction mode
[INFO]  stable-diffusion.cpp:6486 - apply_loras completed, taking 0.00s
========= Error: Race reported between Read access at 0xbd0 in soft_max_f32(const float *, const float *, float *, int, int, float)
=========     and Write access at 0x1d60 in soft_max_f32(const float *, const float *, float *, int, int, float) [80384 hazards]
========= 
========= (The same read/write race in soft_max_f32 is reported 11 more times, with between 75,264 and 81,408 hazards each.)
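For context, the hazard racecheck flags is the classic pattern in a block-level reduction: one thread reads a shared-memory slot before the thread responsible for it has finished writing. A minimal Python sketch of a tree reduction, with threads standing in for CUDA threads and threading.Barrier standing in for __syncthreads() (illustrative only, not sd.cpp code):

```python
import threading

def block_reduce_sum(data):
    """Tree reduction over a shared buffer, one thread per element,
    mimicking a CUDA block reduction in shared memory."""
    buf = list(data)
    n = len(buf)                    # must be a power of two here
    barrier = threading.Barrier(n)  # plays the role of __syncthreads()

    def worker(tid):
        stride = n // 2
        while stride > 0:
            # Without this barrier, a thread could read buf[tid + stride]
            # before its partner finished writing it at the previous level,
            # which is exactly the read/write hazard racecheck reports.
            barrier.wait()
            if tid < stride:
                buf[tid] += buf[tid + stride]
            stride //= 2

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buf[0]

print(block_reduce_sum([1, 2, 3, 4, 5, 6, 7, 8]))  # 36
```

Every thread performs the same number of barrier waits, so all writes at one reduction level are visible before any reads at the next.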

@FSSRepo
Contributor

FSSRepo commented Dec 31, 2023

@Cyberhan123 Could you send me the CLI commands to perform this test? Your link is not allowing me to access Colab.

@Cyberhan123
Contributor Author

Cyberhan123 commented Jan 1, 2024

@Cyberhan123 Could you send me the CLI commands to perform this test? Your link is not allowing me to access Colab.

I fixed the link, and the commands are as follows:

!rm -r stable-diffusion.cpp
!git clone --recursive https://github.com/leejet/stable-diffusion.cpp.git
!mkdir stable-diffusion.cpp/build
!echo "target_compile_options(ggml PRIVATE \$<\$<COMPILE_LANGUAGE:CUDA>:-lineinfo>)" >> stable-diffusion.cpp/CMakeLists.txt
!cmake -S stable-diffusion.cpp -B stable-diffusion.cpp/build -DSD_CUBLAS=ON
!cmake --build stable-diffusion.cpp/build --config Release
!curl -L -O https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors
# !curl -L -O https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth
# !mv 223670 model.safetensors
# !stable-diffusion.cpp/build/bin/sd -m v1-5-pruned-emaonly.safetensors -p "a lovely cat" --upscale-model RealESRGAN_x4plus_anime_6B.pth
!compute-sanitizer --tool racecheck stable-diffusion.cpp/build/bin/sd -m v1-5-pruned-emaonly.safetensors -p "a lovely cat"

@FSSRepo
Contributor

FSSRepo commented Jan 3, 2024

@leejet To fix the race condition in the CUDA softmax, comment out line 6499; this may also fix the artifacts when using VAE tiling:

while (nth < ncols_x && nth < CUDA_SOFT_MAX_BLOCK_SIZE) nth *= 2; // comment this line
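For reference, that loop grows the thread count to the smallest power of two covering ncols_x, capped at the block-size limit. A Python sketch of its effect (the starting value of 32, i.e. one warp, and the 1024 cap are assumptions for illustration):

```python
CUDA_SOFT_MAX_BLOCK_SIZE = 1024  # cap assumed for illustration

def soft_max_block_threads(ncols_x: int) -> int:
    """Replicates the nth-doubling loop: the smallest power of two
    that is >= ncols_x, capped at the block-size limit."""
    nth = 32  # assumed warp-size starting value
    while nth < ncols_x and nth < CUDA_SOFT_MAX_BLOCK_SIZE:
        nth *= 2
    return nth

print(soft_max_block_threads(77))    # 128
print(soft_max_block_threads(4096))  # 1024 (capped)
```

Commenting the line out pins nth at its initial value (a single warp), which sidesteps the cross-warp synchronization the larger block sizes require; that is presumably why it masks the race.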

@YAY-3M-TA3

@leejet I've been testing the SDXL rendering. I did find some issues:

  1. For 1024x1024 pictures (or anything above 512), we can see some undecoded latent at the bottom of the image.
    output copy 15

  2. There seems to be a problem with the prompting for SDXL: under the same conditions and seed, the image should be deterministic, but I get variations in stable-diffusion.cpp that I do not get in other SD apps like InvokeAI (using the same conditions). For example, this is an image we should be able to reproduce in SD.cpp.

Screenshot 2024-01-06 at 5 40 58 PM

However, when I use the same metadata in SD.cpp, I get this instead...
output

SDXL does have two text encoders - I'm not sure if this is dealt with in SD.cpp....

(Note: as a test of deterministic image generation, I tried SD.cpp with SD1.5.) Here is the SD1.5 example:
Screenshot 2024-01-06 at 6 01 51 PM

And this was reproduced in SD.cpp using the same metadata...
output copy 2

@JohnClaw

JohnClaw commented Jan 6, 2024

However, when I use the same meta data in SD.app, I get this instead...

I'm getting the same horrible results while using SD-Turbo and SDXL-Turbo.

@leejet
Owner

leejet commented Jan 8, 2024

@YAY-3M-TA3

  1. It seems to be an issue with ggml's CUDA backend synchronization. Do you still encounter this problem when using the latest code?

  2. The 'Model: Stable Diffusion XL 1.0 (1024)' model doesn't seem to accommodate 512x512 images well. It's better to set the image generation size to 1024x1024. I think there might be an issue with the parameters for generating the displayed images on the webpage. I've generated normal images using the following parameters, consistent with sd-webui.

.\bin\Release\sd.exe -m ..\..\stable-diffusion-webui\models\Stable-diffusion\sd_xl_base_1.0.safetensors --vae ..\..\stable-diffusion-webui\models\VAE\sdxl_vae-fp16-fix.safetensors -p "Marilyn Monroe in the 21st century A stylish woman with a fashionable outfit, pretty makeup, facial closeup" -v --steps 25 -H 1024 -W 1024

output

@Cyberhan123
Contributor Author

Cyberhan123 commented Jan 8, 2024

@leejet I think the parameters @YAY-3M-TA3 set are wrong. He may have set the CFG scale to 7.0.

@leejet
Owner

leejet commented Jan 9, 2024

@leejet I think the parameters @YAY-3M-TA3 set are wrong. He may have set the CFG scale to 7.0.

For the SDXL base model, setting the CFG scale to 7 should be fine. In my example above, the CFG scale is also 7 (the default value).

@phudtran
Contributor

@leejet I've been testing the SDXL rendering. I did find some issues: […]

I'm also seeing this. Created #187.

@KintCark

KintCark commented Jul 29, 2024

Hey, I'm getting artifacts in my images, like green and blue dots everywhere. I think it's a VAE problem: SD1.5 works fine, and when I set the resolution to 512x1024 in SDXL I get no artifacts, which is weird. I need an option for no external VAE, for models that already have a VAE baked in; I think it's using both the baked-in VAE and the downloaded VAE, which is causing the artifacts. BTW, this works great on my Android phone. I can finally use SDXL on my phone; all the other web UIs I've tried crash on SDXL load or after the first generation. Loras work, but not the 2GB lora: I run out of RAM trying to use Midjourney Mimic 1.2.

@KintCark

I can't figure out the right settings; my images are not generated correctly. I'm trying the Hyper-SDXL lora, but only the LCM lora works. I'm trying to use as few steps as possible.

@arenekosreal

I am using the Vulkan backend on Arch Linux. The NVIDIA driver version is 565.57.01-1, and my GPU is a GTX 1050 Ti Mobile.
Although it is old, thanks to stable-diffusion.cpp it can run SD1.x and SD2.x. But it has some problems when using SD2.x models.
Here are some examples:

Model: https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/blob/main/sd-v1-4.ckpt
Command: sd -m models/sd-v1-4.ckpt -p "The quick brown fox jumps over a lazy dog"
Picture:
output

Model: https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.safetensors
Command: sd -m models/v1-5-pruned-emaonly.safetensors -p "The quick brown fox jumps over a lazy dog"
Picture:
output

Model: https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/768-v-ema.safetensors
Command: sd -m ./models/768-v-ema.safetensors -p "The quick brown fox jumps over a lazy dog"
Picture:
output

Model: https://huggingface.co/stabilityai/stable-diffusion-2-1/blob/main/v2-1_768-ema-pruned.safetensors
Command: sd -m models/v2-1_768-ema-pruned.safetensors -p "The quick brown fox jumps over a lazy dog"
Picture:
output

@Green-Sky
Contributor

@arenekosreal Keep in mind that each model has an image dimension it is optimized for and might not produce good images at lower or higher resolutions (e.g., you are using 512 for the 768 variant).

@arenekosreal

@Green-Sky Oh, I hadn't noticed that the 768 in the file name means it is suited to generating 768x768 pictures. But even when I set width and height to 768, the problem still exists. For example, if I run sd -m models/v2-1_768-ema-pruned.safetensors -p "The quick brown fox jumps over a lazy dog" --width 768 --height 768, the generated picture is this:
output

@arenekosreal

I found that my model size is 4.9GB while it is 5.21GB on huggingface. I will try to re-download model and check if things change.

@Green-Sky
Contributor

I found that my model size is 4.9GB while it is 5.21GB on huggingface. I will try to re-download model and check if things change.

HF will give you a hash you can also compute locally and compare.
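If you'd rather script the comparison, here is a small sketch that streams the file so a multi-gigabyte model never has to fit in memory; the expected hash comes from the model's Hugging Face page and is left as a placeholder:

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute a file's sha256 by streaming it in 1 MiB chunks,
    so multi-GB checkpoints don't have to be loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# expected = "<sha256 from the model's Hugging Face pointer-file page>"
# assert sha256_of_file("models/v2-1_768-ema-pruned.safetensors") == expected
```

This matches what `sha256sum` prints on the command line, just without shelling out.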

@arenekosreal

I found that my model size is 4.9GB while it is 5.21GB on huggingface. I will try to re-download model and check if things change.

HF will give you a hash you can also compute locally and compare.

$ sha256sum sd2.1/v2-1_768-ema-pruned.safetensors 
dcd690123cfc64383981a31d955694f6acf2072a80537fdb612c8e58ec87a8ac  sd2.1/v2-1_768-ema-pruned.safetensors

Its sha256 matches what the pointer file shows. Hugging Face reports size in GB rather than GiB, so its 5.21GB is the same file as my 4.9GiB. The model is correct, then; I'll have to dig further to find out why it cannot run correctly on my old buddy.
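The unit discrepancy can be sanity-checked directly (a small sketch; the 5.21 figure is the size Hugging Face displays):

```python
def gb_to_gib(size_gb: float) -> float:
    """Convert decimal gigabytes (10**9 bytes) to gibibytes (2**30 bytes)."""
    return size_gb * 10**9 / 2**30

# Hugging Face's 5.21 GB is about 4.85 GiB, which a tool rounding to
# one decimal place reports as the "4.9GB" seen locally.
print(round(gb_to_gib(5.21), 2))  # 4.85
```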
