I noticed descriptions become a lot better when I include 4 zoomed sections of the image and combine the descriptions, yet this is very slow because the language model has to run multiple times.
To improve the detailed description, how about cropping a separate section for each identified object at the highest resolution, so one crop per object, and feeding each zoomed crop to CLIP for captioning? (I presume CLIP currently only sees the image in its full form.) That way you could run CLIP several times, potentially making much better use of its resolution (rough sketch below).
For a photo of a person with a car, it could look at the upper body, lower body, face, tires, etc. all in detail. That should make it much easier to recognise emotions, for example.
Is this possible? Or is this how CLIP works already? Sorry, I'm not sure about the mechanics here :-)
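Roughly what I have in mind, just a sketch and not how the tool actually works: I'm assuming an off-the-shelf detector (torchvision's Faster R-CNN here) to find objects, and the openai `clip` package to rank candidate phrases per crop. The candidate phrases below are placeholders; I don't know how the real phrase lists are built.

```python
# Sketch: crop each detected object at full resolution and let CLIP rank
# candidate phrases per crop instead of only scoring the whole image.
import torch
import clip  # openai/CLIP
from PIL import Image
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, preprocess = clip.load("ViT-B/32", device=device)
detector = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

# Placeholder phrases; the real tool would use its own phrase lists.
candidates = ["a smiling face", "a sad face", "a car tire", "a blue jacket"]
text_tokens = clip.tokenize(candidates).to(device)

image = Image.open("photo.jpg").convert("RGB")

with torch.no_grad():
    detections = detector([to_tensor(image)])[0]
    text_feat = clip_model.encode_text(text_tokens)
    text_feat /= text_feat.norm(dim=-1, keepdim=True)

    for box, score in zip(detections["boxes"], detections["scores"]):
        if score < 0.8:  # keep only confident detections
            continue
        x0, y0, x1, y1 = [int(v) for v in box.tolist()]
        # Zoomed region cropped from the original, full-resolution image.
        crop = image.crop((x0, y0, x1, y1))
        img_feat = clip_model.encode_image(preprocess(crop).unsqueeze(0).to(device))
        img_feat /= img_feat.norm(dim=-1, keepdim=True)
        best = (img_feat @ text_feat.T).squeeze(0).argmax().item()
        print((x0, y0, x1, y1), "->", candidates[best])
```

The per-crop results could then be merged into one description, similar to what I did manually with the 4 zoomed sections, but driven by detected objects instead of a fixed grid.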