Alternate Subtraction Method, Faster #14
Replies: 4 comments
-
Ha! Whoops, I was so focused on trying to do something involving the tendency for CLIP to label an image with a face as "a photo of a human face" with a higher score than "a photo of a face" that I done went and did 2*enc1 - enc2, shit. Back to the drawing board.
-
regarding preliminary text subtraction
-
Just to confirm: I've tried the direct subtraction method on a few meaningful sentences, and it predictably wandered completely away from the main topic. And to make it clear, encoded embeddings are NOT losses; summing or subtracting them has a different effect.
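A quick numeric sketch of that point (pure NumPy, with random vectors standing in for real CLIP embeddings, so the specific vectors and the 512 dimensionality are assumptions): because of the L2 normalization inside cosine similarity, scoring against a *difference of embeddings* is not the same as taking a *difference of two similarity scores*, which is roughly what a two-loss negative-prompt setup computes.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(v):
    # CLIP-style embeddings are compared after L2 normalization
    return v / np.linalg.norm(v)

def cos_sim(a, b):
    return float(normalize(a) @ normalize(b))

# Stand-ins for an image embedding and two text embeddings
img = rng.standard_normal(512)
enc_a = rng.standard_normal(512)  # e.g. "a photo of a human face"
enc_b = rng.standard_normal(512)  # e.g. "a photo of a face"

# Subtracting the embeddings first, then scoring once...
sim_of_diff = cos_sim(img, enc_a - enc_b)

# ...is not the same as subtracting two separate similarity scores
# (one per prompt), because cosine similarity is not linear in its inputs.
diff_of_sims = cos_sim(img, enc_a) - cos_sim(img, enc_b)

print(sim_of_diff, diff_of_sims)
```

So swapping one formulation for the other genuinely changes the loss landscape, not just the bookkeeping.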
-
I was trying out ways of manipulating the encoded text, and one thing I tried was subtracting one encoded text prompt from another. I ran four renders for each method and they look about the same, except the renders that modified the encoded text showed less of the subtract prompt, which suggests to me that it's more effective at subtracting a prompt. It also ends up using just the one txt_enc rather than two, and just the one cosine similarity.
Prompt: "a photo of a human face" and Negative: "a photo of a face"
Subtracting the negative's txt_enc0 from txt_enc resulted in these
The existing negative method, which uses the cosine similarity between the image and the negative prompt as a loss, resulted in these
And for fun, using subtraction to increase the difference between the two via txt_enc + (txt_enc - txt_enc0) resulted in these
The encoded text and images seem to be explorable like a latent space.
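The variants above can be sketched as plain vector arithmetic on the text embeddings. Note the stand-ins here: `encode_text` is a hypothetical deterministic dummy encoder (the real one would be CLIP's text encoder), and the image embedding is random; only the arithmetic on `txt_enc` / `txt_enc0` mirrors what the post describes.

```python
import hashlib
import numpy as np

DIM = 512  # assumption: CLIP ViT-B/32-style embedding width

def encode_text(prompt: str) -> np.ndarray:
    # Hypothetical stand-in for a CLIP text encoder: a deterministic
    # random vector seeded from the prompt text.
    seed = int.from_bytes(hashlib.sha256(prompt.encode()).digest()[:8], "big")
    return np.random.default_rng(seed).standard_normal(DIM)

txt_enc = encode_text("a photo of a human face")  # main prompt
txt_enc0 = encode_text("a photo of a face")       # prompt to subtract

# 1) Direct subtraction: one combined embedding, so guidance needs
#    only one cosine-similarity term instead of two.
guided = txt_enc - txt_enc0

# 2) Pushing further along the same direction, as in the post:
#    txt_enc + (txt_enc - txt_enc0), which simplifies to 2*txt_enc - txt_enc0.
amplified = txt_enc + (txt_enc - txt_enc0)

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# During guidance you'd score the image embedding against `guided`
# (or `amplified`) rather than keeping two separate similarity losses.
img_enc = np.random.default_rng(42).standard_normal(DIM)  # dummy image embedding
print(cos_sim(img_enc, guided))
```

The simplification in step 2 also makes the earlier `2*enc1 - enc2` slip easy to see: it is the amplified form, not the plain subtraction.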