Negative / Unconditioned Prompts #661

Merged 1 commit on Sep 18, 2022

Conversation

blessedcoolant
Collaborator

@blessedcoolant blessedcoolant commented Sep 18, 2022

This is an improved and optimized version of @rabidcopy's PR #637 that implements negative prompting.

Usage

Stable Diffusion's model will try to ignore any words placed between a pair of square brackets when generating images.

this is a test prompt [not really] to make you understand [cool] how this works.

In the above statement, the words 'not really cool' will be ignored by Stable Diffusion.

Here's a prompt that depicts what it does.

original prompt: "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180

000001 1654590180

This one gave me a woman. I don't want a woman in this picture. So I just type ---

new prompt: "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180

000002 1654590180

Awesome. But I don't want the image to be blue.

new prompt: "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180

000003 1654590180

Perfect. But I don't want that stupid saddle. I want the horse to be free.

new prompt: "A fantastical translucent poney made of water and foam, ethereal, radiant, hyperalism, scottish folklore, digital painting, artstation, concept art, smooth, 8 k frostbite 3 engine, ultra detailed, art by artgerm and greg rutkowski and magali villeneuve [woman blue saddle]" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 1654590180

000004 1654590180

Perfect.

As you can see, there is a ton of potential with this and now it is implemented in the simplest way possible.

Notes

  • The only requirement for words to be ignored is that they are in between a pair of square brackets.
  • You can provide multiple words within the same bracket.
  • You can provide multiple brackets with multiple words in different places of your prompt. That works just fine.
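The bracket rules in these notes can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the helper name `split_prompt` and the regular expression are assumptions made for the example, and nested brackets are deliberately not handled.

```python
import re
from typing import Tuple

# Hypothetical helper for illustration only; not the code merged in this PR.
# Matches one pair of square brackets that contains no nested brackets.
BRACKETED = re.compile(r"\[([^\[\]]*)\]")


def split_prompt(prompt: str) -> Tuple[str, str]:
    """Split a prompt into (positive text, words to be ignored)."""
    # Collect the contents of every bracket pair, wherever it appears.
    ignored_parts = [part.strip() for part in BRACKETED.findall(prompt) if part.strip()]
    # Remove the bracketed spans from the prompt and tidy up the whitespace.
    positive = " ".join(BRACKETED.sub(" ", prompt).split())
    return positive, " ".join(ignored_parts)


positive, ignored = split_prompt(
    "this is a test prompt [not really] to make you understand [cool] how this works."
)
print(positive)  # this is a test prompt to make you understand how this works.
print(ignored)   # not really cool
```

Several words in one bracket and several brackets in one prompt both end up in the same ignored string, which matches the behaviour described in the notes.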

I've done quite a bit of testing and it works great as long as the words you are omitting are actual difference makers to your prompt.

@lstein This should be good to go. Simple code that breaks almost nothing. And seems to be working quite awesome. Tagging you coz you said you might do it yourself.


Co-Authored-By: @rabidcopy [email protected]

Contributor

@tildebyte tildebyte left a comment


Very nice. Good work, @rabidcopy, @blessedcoolant 🌮

@rabidcopy
Contributor

Wow, that's great. Very clean. Also very nice examples showing how well it works.

@BrentOzar
Contributor

This PR also works really well to cut down anatomy problems.

"doctor holding a pill" -s 50 -W 512 -H 512 -C 7.5 -A k_lms -S 931495956

000008 931495956

But add in some exclusions, and things get better (although still not perfect, but a lot better):

"doctor holding a pill [bad anatomy, extra legs, extra arms, extra fingers, poorly drawn hands, poorly drawn feet, disfigured, out of frame, tiling, bad art, deformed, mutated]" -s 50 -W 512 -H 512 -C 7.5 -A k_lms -S 931495956

000010 931495956

@blessedcoolant
Collaborator Author

Here's another example to show you guys the power of this thing.

Original Prompt: "Portrait of palpatine from star wars blue eyes, detailed face coherent face highly detailed digital painting artstation concept art smooth sharp focus illustration art by artgerm and greg rutkowski and alphonse mucha" -s 20 -W 512 -H 768 -C 7.5 -A k_euler_a -S 4148260801

000003 4148260801

Too blue for my liking. Gimme some [blue]

000004 4148260801

Nice. Get rid of your cloak evil man. [blue, cloak]

000005 4148260801

Cool. But what is wrong with your head? [blue, cloak, big head]

000006 4148260801

Nice. Get rid of that coat too. [blue, cloak, big head, coat]

000007 4148260801

No one can rock collars like that. [blue, cloak, big head, coat, collar]

000008 4148260801


As you can see, it's pretty fcukin fantastic.

@psychedelicious
Collaborator

This is too cool for school. Holy smokes.

Is this different from https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#attention ? If so, is it compatible?

Maybe if we are totally negating a particular token and the other attention control thing is also compatible, we can use a unique syntax like -[this gets negated fully] [[[this just has a lot less attention]]] (this has a bit more attention).

@blessedcoolant
Collaborator Author

Is this different from https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#attention ? If so, is it compatible?

From reading the basic functionality, it is different. This one completely asks SD to overlook the given words if it can help it. While the one in the link seems to suggest it to produce more of that word.

Maybe if we are totally negating a particular token and the other attention control thing is also compatible, we can use a unique syntax like -[this gets negated fully] [[[this just has a lot less attention]]] (this has a bit more attention).

I'm not a fan of using multiple brackets. It's just not good or easy syntax to write and edit. I'll have a look at this and see what the feature is. If it's something we can implement, maybe we can do it with another syntax -- a different bracket or another format.
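To make the distinction above a bit more concrete: the PR title ("Negative / Unconditioned Prompts") suggests that the bracketed words take the place of the empty, unconditioned prompt in classifier-free guidance. The sketch below assumes exactly that and uses tiny made-up vectors in place of real noise predictions; `guided_noise` is a hypothetical name, not a function from this repository.

```python
from typing import List


def guided_noise(cond: List[float], uncond: List[float], scale: float) -> List[float]:
    """Classifier-free guidance: start at `uncond` and move towards `cond`.

    Assumption for this sketch: with negative prompting, `uncond` is the
    prediction for the bracketed words instead of for an empty prompt.
    """
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# Toy 2-D "predictions": axis 0 stands for the wanted content,
# axis 1 for the unwanted content (e.g. "blue").
cond = [1.0, 0.0]
empty_uncond = [0.0, 0.0]      # ordinary unconditioned prompt
negative_uncond = [0.0, 1.0]   # prediction for the bracketed words

print(guided_noise(cond, empty_uncond, 7.5))     # [7.5, 0.0]
print(guided_noise(cond, negative_uncond, 7.5))  # [7.5, -6.5]
```

In the second call the result is pushed away from the unwanted axis rather than merely given less weight, which is the difference from attention-style emphasis being discussed here.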

@krummrey
Contributor

Awesome, tried it out and it works on my MBP M1 16GB.

Collaborator

@lstein lstein left a comment


Works like a charm!

@lstein lstein merged commit 0a43970 into invoke-ai:development Sep 18, 2022
@blessedcoolant blessedcoolant deleted the negative-prompt branch September 18, 2022 13:11
@smoke2007

does this also apply to img2img ?

@blessedcoolant
Collaborator Author

does this also apply to img2img ?

Yes. It works with anything that takes a prompt.

lstein referenced this pull request Sep 18, 2022
Documentation for pull lstein#661, and splits prompt docs into a separate file.
@rabidcopy
Contributor

rabidcopy commented Sep 18, 2022

Is this different from https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#attention ? If so, is it compatible?

From reading the basic functionality, it is different. This one completely asks SD to overlook the given words if it can help it. While the one in the link seems to suggest it to produce more of that word.

Maybe if we are totally negating a particular token and the other attention control thing is also compatible, we can use a unique syntax like -[this gets negated fully] [[[this just has a lot less attention]]] (this has a bit more attention).

I'm not a fan of using multiple brackets. It's just not good or easy syntax to write and edit. I'll have a look at this and see what the feature is. If it's something we can implement, maybe we can do it with another syntax -- a different bracket or another format.

To add on a bit: in AUTOMATIC1111's fork, any word put into () is emphasized more (multiple parentheses work and increase the strength of that word even more), and any word put into brackets is de-emphasized but not negated. As implemented, this is combined with their implementation of negative prompts. That allows an absurd amount of fine-tuning of a prompt through the strength and weakness of each word, including the strength and weakness of the negative prompt's words.

prompt: a detailed 3d render of an apple
negative prompt: simple
1

prompt: a (detailed) 3d render of an apple
negative prompt: (simple)
2
Not much changes...

prompt: a ((((((detailed)))))) 3d render of an apple
negative prompt: ((((((simple))))))
3

prompt: a ((((((detailed)))))) 3d render of an [[[[[[[[[[apple]]]]]]]]]]
negative prompt: ((((((simple))))))
4

prompt: a ((((((detailed)))))) 3d render of an [[[[[[[[[[apple]]]]]]]]]]
negative prompt: ((((((simple)))))), ((((((((((red))))))))))
5
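For a rough feel of how stacking works in the examples above, here is a small sketch. It assumes each layer of () multiplies a word's attention by 1.1 and each layer of [] divides it by 1.1, as AUTOMATIC1111's wiki describes; the function name `emphasis_weight` is made up for this example and is not part of either project.

```python
from typing import Tuple


def emphasis_weight(token: str, factor: float = 1.1) -> Tuple[str, float]:
    """Strip matching outer () or [] layers and return (word, weight).

    Assumption: each () layer multiplies attention by `factor`,
    each [] layer divides it by `factor`.
    """
    depth = 0
    while len(token) >= 2 and token[0] + token[-1] in ("()", "[]"):
        depth += 1 if token[0] == "(" else -1
        token = token[1:-1]
    return token, factor ** depth


for raw in ["detailed", "(detailed)", "((((((detailed))))))", "[[[[[[[[[[apple]]]]]]]]]]"]:
    word, weight = emphasis_weight(raw)
    print(f"{word}: {weight:.3f}")
```

Under that assumption, six layers of parentheses come out at roughly 1.77 and ten layers of brackets at roughly 0.39, so the multipliers compound quickly as layers are added.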

You can drive vectors very dramatically with this feature. With higher CFG values it becomes very incoherent with more stacking of () and []. I'm not sure what would be a good implementation here if it was implemented. Plus or minus before the word?
(Edit: scratch that it probably can't be a dash unless the prompt input is surrounded in quotes? Eh. Probably not the way to go.)

a ++++++detailed 3d render of an ----------apple [++++++simple, ++++++++++red]
Edit2: a detailed~6 3d render of an apple~0.1 [simple~6, red~10]?
Edit3: This may not be sound either? Could conflict with embeddings called with tilde? Unless that isn't one of the characters that can be used as a placeholder.

Still not entirely confident if it's the same or different from how prompt weighting works already here but from my understanding prompt weighting is a percentage to be split between words that adds up to be 100%. While this allows individual words to be pushed beyond that.
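If the understanding above is right, the existing prompt weighting behaves roughly like the sketch below, where the weights are rescaled so that they always sum to 1. This illustrates the idea only; `normalize_weights` is a hypothetical helper and not the project's actual code.

```python
from typing import List, Tuple


def normalize_weights(weighted: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Rescale the weights so they sum to 1 (i.e. shares of 100%)."""
    total = sum(weight for _, weight in weighted)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return [(text, weight / total) for text, weight in weighted]


print(normalize_weights([("tabby cat", 1.0), ("white duck", 3.0)]))
# [('tabby cat', 0.25), ('white duck', 0.75)]
```

With this kind of normalization, raising one weight only changes its share relative to the others, whereas stacked () and [] scale each word independently and so can push a word beyond its share.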

@tildebyte
Contributor

Is this different from AUTOMATIC1111/stable-diffusion-webui/wiki/Features#attention ? If so, is it compatible?

any word put into () is emphasized more (multiple parenthesis work and increase the strength of that word even more), and any word put into brackets is de-emphasized but not negated

We already have numerical non-negating weights.

We need to be careful about having an "integrate everything" mindset, as we're likely to end up with duplicate features. Numeric non-negating weights (implemented), Negative prompting (implemented in this PR), and cross-attention (wishlisted) all do slightly different things in different ways.

@rabidcopy
Contributor

rabidcopy commented Sep 18, 2022

Is this different from AUTOMATIC1111/stable-diffusion-webui/wiki/Features#attention ? If so, is it compatible?

any word put into () is emphasized more (multiple parenthesis work and increase the strength of that word even more), and any word put into brackets is de-emphasized but not negated

We already have numerical non-negating weights.

We need to be careful about having an "integrate everything" mindset, as we're likely to end up with duplicate features. Numeric non-negating weights (implemented), Negative prompting (implemented in this PR), and cross-attention (wishlisted) all do slightly different things in different ways.

Ah, sorry, I just wasn't sure if that was effectively the same thing. Apologies.

@tildebyte
Contributor

@rabidcopy

Ah, sorry, I just wasn't sure if that was effectively the same thing. Apologies.

No worries! It's very confusing, and the only reason I have it straight in my head is because I'm really interested in "extended" prompting techniques from a user's perspective.

I really wish I understood the cross-attention thing well enough to be able to rebase Doggettx' work onto 'dev'...

afiaka87 pushed a commit to afiaka87/lstein-stable-diffusion that referenced this pull request Sep 19, 2022
afiaka87 referenced this pull request in afiaka87/lstein-stable-diffusion Sep 19, 2022
Documentation for pull lstein#661, and splits prompt docs into a separate file.
afiaka87 pushed a commit to afiaka87/lstein-stable-diffusion that referenced this pull request Sep 21, 2022
austinbrown34 pushed a commit to cognidesign/InvokeAI that referenced this pull request Dec 30, 2022
Co-Authored-By: rabidcopy <[email protected]>
8 participants