Extension: Stable Diffusion Api integration #309
Conversation
Lets the bot answer you with a picture!
I couldn't test the extension so far, probably because I don't have the 'NeverEndingDream' model installed. I will try again later.
:D
It's gonna use the last model something was generated with by default; you don't even need to have any particular one present.
I haven't really implemented the ability to specify a model yet because it's a separate API call.
Yeass!! I was able to get this to work, but I had to remove the "modules.py" file and the "modules-1.0.0.dist-info" folder from my textgen environment for it to work. I'm running on Windows without WSL.
Yeah, modules is listed in the extension's requirements, but it conflicts with the modules/ folder in the textgen webui directory. Please consider removing it in a commit.
This sounds great! We just need a little bit more information to avoid guessing at how to get them to communicate. Here's my Stable Diffusion launch line: […] And here's my Ooba text gen launch: […] I don't think this will make them talk. Both programs are running on the same machine in the same browser in two different tabs. How should those lines read to allow textgen to utilize Stable Diffusion? And if I need to know my local machine's IP, how do I do that? If you can answer those questions, maybe we could put the answers in the wiki so people don't bug you about it.
Yeah, VRAM probably is the problem; you can't really host two VRAM eaters on the same consumer machine. That's another reason for moving the chatting AI to CPU (supported by AVX2) like the llama.cpp (https://github.com/ggerganov/llama.cpp) / alpaca.cpp (https://github.com/antimatter15/alpaca.cpp) projects, so we consume RAM instead of VRAM. But text-generation-webui doesn't seem to support that yet, and some people are working on the integration: #447 If that's done, I guess this extension would be more usable on average consumer machines.
You don't; if you're running them on the same machine you can use a special address (127.0.0.1, a.k.a. localhost).
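Concretely, the simplest working combination looks something like this (a sketch; the port is the Auto1111 default and must match whatever you actually launch with):

```
# webui-user.bat (Automatic1111) -- add the API flag; port stays at the default 7860
set COMMANDLINE_ARGS=--api

# Extension's "Stable Diffusion host address" field in text-generation-webui:
127.0.0.1:7860
```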
OK, so my problem seems to be with Auto having SSL, as I am getting a "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate" error. Any suggestions?
@JohnWJarrett
My TGWUI params are (when trying to use SD alongside):
and my WUI params are:
And yes, I use the SSL addon for WUI, so yeah, through HTTPS.
Thanks! I haven't yet tested whether the API works correctly when used over HTTPS, and that is probably the root cause of the issue. I'll try to look for a fix for HTTPS in the meantime.
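For anyone else hitting the certificate error before a proper fix lands, a minimal sketch of the usual stopgap, assuming the extension talks to SD through the requests library (the address is illustrative):

```python
import requests

# WARNING: verify=False disables TLS certificate checking. That is tolerable
# for a self-signed certificate on your own machine, but unsafe for any
# address reachable from an untrusted network.
response = requests.post(
    'https://127.0.0.1:7861/sdapi/v1/txt2img',  # illustrative HTTPS address
    json={'prompt': 'test'},
    verify=False,
)
print(response.status_code)
```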
Are we able to use this when we're not in cai mode?
I am currently working on integrating some multimodal models like MM-CoT or NVIDIA's Prismer. Maybe it would be possible to have a common interface for picture handling? Both receiving and sending.
@Brawlence, yeah, it gets past the cert error if I disable the SSL, but then I got a different error, one that I actually have a solution for... So, seeing as I am using a different port than WUI's default, I just copied and pasted the new URL (http://127.0.0.1:8880/) into the settings on TGW, which, I am guessing, you might see the issue with, or you might not; I didn't for about an hour until I was looking into the log and tried, on a whim, to do "localhost:8880", which gave me this error
which is when I noticed the "8880//sdapi", so I think you should truncate the trailing "/" in the IP if the user accidentally leaves it there. It was a thing I overlooked and I'm sure I won't be the only one; it's a stupid user error, sure, but I'd imagine it'd be an easy fix? I don't know, I hate Python with a passion so I never bothered learning it that much. But other than that, yeah, it works fine, even on my 8 GB GFX. I am not gonna try to push it for anything over 256 images, but then again, I don't really need to; it's more just for the extra fun than anything. EDIT: Also, while playing around, and this is just some general info for anyone who was wondering, you can put a LoRA into the "Prompt Prefix" and it will work, which would be good for getting a very consistent character.
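For reference, the normalization that fixes the "8880//sdapi" case is tiny. A sketch (function and variable names are illustrative, not the extension's actual ones):

```python
def normalize_address(address: str) -> str:
    """Tolerate common user input: trailing slashes and a missing scheme."""
    address = address.strip().rstrip('/')          # '127.0.0.1:8880/' -> '127.0.0.1:8880'
    if not address.startswith(('http://', 'https://')):
        address = 'http://' + address              # 'localhost:8880' -> 'http://localhost:8880'
    return address

# normalize_address('http://127.0.0.1:8880/') + '/sdapi/v1/txt2img'
# -> 'http://127.0.0.1:8880/sdapi/v1/txt2img'  (no more '//sdapi')
```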
@Brawlence I've made some updates that I'd be happy to share! Now one can optionally use 'subject' and 'pronoun' placeholders that will replace […]. Also added a […]. What I'd like to do next is actually read this information out of the […]. I'm also able to get SD models working, but unfortunately I can't find where in the SD API it allows you to set the model.
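Regarding setting the model: the Auto1111 API does expose it, just indirectly, through the shared options endpoint rather than a dedicated call. A sketch (the address is illustrative; the checkpoint title must match what /sdapi/v1/sd-models reports):

```python
import requests

host = 'http://127.0.0.1:7860'  # illustrative

# Each entry in the list has a 'title' like 'sd-v1-4.ckpt [fe4efff1e1]'
models = requests.get(f'{host}/sdapi/v1/sd-models').json()

# Posting sd_model_checkpoint to the options endpoint makes the server
# load that checkpoint before the call returns
requests.post(f'{host}/sdapi/v1/options',
              json={'sd_model_checkpoint': models[0]['title']})
```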
Ooh yes please, I'd like to try your updates out 😁
Thanks for your feedback! Here's a preview for the upcoming change: it's gonna strip the trailing "/" from the address.
But yes, one can! I have already tested the memory juggling feature (see #471 and AUTOMATIC1111/stable-diffusion-webui/pull/8780), and if both of those patches are accepted then it would be possible to […], all at the cost of ~20 additional seconds spent on shuffling models around. I've already tested it on my machine and it works. Demo
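For the curious, the juggling boils down to something like the sketch below. It assumes the unload/reload checkpoint endpoints proposed in the linked Auto1111 PR and a model object that can be moved between devices; none of these names are the extension's actual ones:

```python
import requests
import torch

host = 'http://127.0.0.1:7860'  # illustrative

def generate_with_juggling(llm, payload):
    # 1. Evict the language model from VRAM (a real implementation also
    #    has to move its caches before this pays off)
    llm.to('cpu')
    torch.cuda.empty_cache()

    # 2. Let SD load its checkpoint into the freed VRAM and generate
    requests.post(f'{host}/sdapi/v1/reload-checkpoint')
    images = requests.post(f'{host}/sdapi/v1/txt2img', json=payload).json()['images']

    # 3. Evict the SD checkpoint and bring the language model back
    requests.post(f'{host}/sdapi/v1/unload-checkpoint')
    llm.to('cuda')
    return images
```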
Of course, I'd be more than happy to have llama.cpp implemented as well; more options are always better.
My current opinion: llama.cpp uses the CPU and RAM, while SD uses the GPU and VRAM, so the two will not conflict with each other. For now, llama cares more about the amount of RAM/VRAM than about GPU acceleration, and in most PCs RAM is much larger than VRAM.
Hi, I am having some trouble getting this extension to work. I always get the same error: File "C:\Users\user\text-generation-webui\extensions\sd_api_pictures\script.py", line 85, in get_SD_pictures It seems that the key "images" in the dictionary r does not exist. How can I fix this? (I am new to GitHub, sorry if I posted this in the wrong place; I didn't find the same issue under Issues.) Thank you for your answer.
@Andy-Goodheart it looks like the SD API is not responding to your requests. Make sure that the IP and port under "Stable Diffusion host address" are correct and that SD is started with the --api flag.
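For context, the crash happens because an error response carries no 'images' key. A guarded version of the call would surface the real problem instead of a KeyError; a sketch with an illustrative address:

```python
import requests

response = requests.post('http://127.0.0.1:7860/sdapi/v1/txt2img',
                         json={'prompt': 'test'})
response.raise_for_status()  # turn an HTTP 404/500 into a readable exception
r = response.json()
if 'images' not in r:
    raise RuntimeError(f'SD API returned no images: {r}')
```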
@oobabooga Thanks a lot! =) That solved it for me. I didn't have the --api argument in the webui-user.bat file.
@DerAlo Hmmmmm. What SD model do you use? Try generating the description verbatim in Auto1111's interface; what do you get there? For me, such pictures are usually generated either when the model tries to do something it was not trained on OR when CFG_scale is set too high.
It's strange: in 1111's interface everything is fine. The model is 'SD_model': 'sd-v1-4' and CFG is at 7... I really don't get it ^^ But thanks for your reply :)
Could anyone write a tutorial for this extension? I can't start this without errors (RTX 3070) :(
@francoisatt what's the error, what are your launch parameters, what models do you use, and how much VRAM have you got?
I have this extension running and it seems like it is working as intended. However, the output .PNG images do not have any Stable Diffusion metadata, which is very unfortunate.
@ItsOkayItsOfficial
Try reordering the Google translation extension after the sd-api one or, if that does not help, removing it (even for a while, to test). I have a hunch that it's messing with the output text which is needed for the SD API to work.
Could you open this as an issue? I'll look into whether it's possible to get the metadata sent via the API too.
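One possible client-side route while the API side is investigated: the txt2img response carries an 'info' field with the generation parameters, and those can be written back into the PNG the same way Auto1111 does. A sketch (assumes the usual base64 decoding the extension already performs; the 'infotexts' layout is what current Auto1111 builds return):

```python
import base64
import io
import json

import requests
from PIL import Image
from PIL.PngImagePlugin import PngInfo

r = requests.post('http://127.0.0.1:7860/sdapi/v1/txt2img',
                  json={'prompt': 'test'}).json()

image = Image.open(io.BytesIO(base64.b64decode(r['images'][0])))

# r['info'] is a JSON string; its 'infotexts' entry holds the same
# parameters block Auto1111 writes into its own PNGs
metadata = PngInfo()
metadata.add_text('parameters', json.loads(r['info'])['infotexts'][0])
image.save('output.png', pnginfo=metadata)
```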
@Brawlence I would like to understand a little better how the prompting works, though. I can't seem to get the extension to output images that correspond to the context/theme of the conversation. E.g.: I assume it just sent “Alright. Let's see if you are impressed.” as a prompt to SD, without using any of the prompt prefixes or negative prompts that I configured in the settings below the chat interface. Is there any way I can actually see the full prompts that are being communicated? Is it logged somewhere (either in SD or in the text-generation-webui) where I can see what it does (and does not do) in order to understand what it's trying to do and how to get a better result?
@tandpastatester the easiest way right now is to open Auto1111's WebUI (even with the model currently unloaded) and click the 'show previous generation parameters' button (the leftmost one under 'Generate/Stop' on the txt2img tab). I'm currently figuring out how to solve #920, which would shed light on this issue as well. In the meantime, try changing the prefix to include more tags for the character, OR force the generation on something you predict to be very descriptive; it usually does way better on long outputs than on short ones.
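If you don't mind touching the code, another option is a temporary debug print in the extension right before the request goes out (variable names here are illustrative; the real ones in script.py may differ):

```python
# somewhere in get_SD_pictures(), just before the API call:
print(f"Prompt sent to SD: {payload['prompt']}")
print(f"Negative prompt:   {payload['negative_prompt']}")
```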
@tandpastatester That looks like you are not using the correct VAE in Stable Diffusion. You can see what is being sent to SD underneath the picture that is returned in oobabooga. It is a combination of: […]
Unfortunately, you can't really get detailed generations like the ones you're looking for. I was trying to work on something that could make them a little more customizable but, like @Brawlence, I cannot seem to figure out how to get the name of the currently used bot from anywhere within this project. EDIT: Wow... I just checked ffd102e and everything I was doing will have to be reworked. I can't keep up with this stuff.
Here's a trick: you can use a LoRA of your character, if you have one, and put it in the prompt. That will give you a stable character when you ask it to send photos of itself. You can also use the PNG information to get the prompt which generated the character, so you'll have a good start on what your character should look like.
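For example, a Prompt Prefix along these lines (the LoRA name and weight are placeholders; the <lora:...> syntax requires the LoRA to be installed in Auto1111):

```
photo of mychar, <lora:mychar_v1:0.8>, detailed face, best quality
```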
This is awesome... wish I could use it though :-( I'm using a 7.16 GB model on an 8 GB GPU.
That's literally already implemented
It is? Great news!
Is anyone aware of an equivalent that uses ComfyUI for the backend?
If you do this, don't let your computer access the internet; I don't think this is very secure. But I like to use the webui on my mobile devices on my network, and I need to run the site as https://ip:port so I can use things like the microphone on the mobile device. The best web browser I've found to work is Opera. You need to use the --ssl-keyfile and --ssl-certfile flags in the CMD_FLAGS.txt file (https://github.com/oobabooga/text-generation-webui?tab=readme-ov-file#gradio). I used this website, https://regery.com/en/security/ssl-tools/self-signed-certificate-generator, to create the keys and certs, downloaded them, and put the location of the files after the appropriate flags. These keys and certs are self-signed; they normally come from a trusted external source, so when you try to access via a web browser you will get a warning that the certs are not recognized.
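For reference, the flags end up looking something like this (the file names are whatever you saved the downloaded key and cert as):

```
# CMD_FLAGS.txt
--ssl-keyfile private.key --ssl-certfile certificate.crt
```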
Hi. Did you end up figuring out how to fix this issue?
Description:
Lets the bot answer you with a picture!
Load it in the --cai-chat mode with --extension sd_api_pictures alongside send_pictures (it's not really required, but completes the picture). If enabled, the image generation is triggered either […] or when 'send | mail | me' are detected simultaneously with 'image | pic | picture | photo'.
One needs an available instance of Automatic1111's webui running with the --api flag. It ain't tested with a notebook / cloud-hosted one, but it should be possible. I'm running it locally, in parallel, on the same machine as the textgen webui. One also needs to specify a custom --listen-port if they're gonna run everything locally. For the record, 12 GB VRAM is barely enough to run NeverEndingDream 512×512 fp16 and LLaMA-7b in 4-bit precision.
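As a concrete example, a local pair of launch configurations could look like this (ports and exact flag spellings are illustrative and vary between versions; the point is that the two UIs must sit on different ports and the extension must be pointed at SD's one):

```
# Automatic1111 (webui-user.bat), default port 7860, API enabled:
set COMMANDLINE_ARGS=--api

# text-generation-webui, moved off 7860 so the two don't collide:
python server.py --cai-chat --extension sd_api_pictures send_pictures --listen-port 7861
```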
TODO: We should really think about a way to juggle models around RAM and VRAM for this project to work on lower VRAM cards.
Extension interface
Don't mind the Windranger Arcana key in the Prompt Prefix; that's just the name of an embedding I trained beforehand.
Demonstrations:
Conversation 1
Conversation 2