CLIP semantic image search #1058
Comments
@flozi00 nice suggestion. I also wanted to suggest the same. It would be nice to support image documents, which would suit VQA, image search, and other use cases. A few concerns I have are -
Overall, in my view it would be nice to have this in Haystack, but adding it will require a good design discussion and proper long-term planning. Frequent breaking changes would not be good. Also, I see that deepset already has its hands full and would need active support from the community. I see a lot of good suggestions from the community, so how about having an experimental feature stream as a playground for these features, and graduating mature features to mainline?
It's pretty clear to me that we will eventually add other data types to Haystack. The vision here is really to build natural language interfaces to all kinds of data. This includes texts, images, tables, databases, logs ... However, we want to nail the text case first and optimize it really end-to-end instead of allowing 5 formats with "50% solutions". TableQA is probably one of the bigger next additions and we are actively working on it right now. So, long story short, VQA is nothing that we will work on in the next weeks for sure, but it's on the long-term roadmap. @lalitpagaria what do you mean by experimental stream? A separate branch here in the repo?
@tholor I am aligned with the vision. My only concern is prioritization, hence my suggestion to have a process around it. In my view, these are the two most time-consuming and, of course, critical steps: design discussion and code review. Now able to come up with a solution to resolve it. Regarding the experimental stream, I mean having a separate module.
Can you please share a reference link for the one you've tried? I'd like to see the results as well. Thanks,
Is your feature request related to a problem? Please describe.
No, it would just be cool
Describe the solution you'd like
Indexing and searching for images by text
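A minimal sketch of what "indexing and searching for images by text" could look like, assuming each image and each text query has already been turned into an embedding vector by a CLIP model (for example via `CLIPModel.get_image_features` and `get_text_features` in Hugging Face transformers). The `ImageIndex` class, the file names, and the toy 3-dimensional vectors are all hypothetical and only illustrate the ranking step; this is not an existing Haystack API.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ImageIndex:
    """Hypothetical in-memory index mapping image ids to embeddings."""

    def __init__(self) -> None:
        self._embeddings: Dict[str, List[float]] = {}

    def add(self, image_id: str, embedding: Vector) -> None:
        # In a real setup the embedding would come from a CLIP image encoder.
        self._embeddings[image_id] = list(embedding)

    def search(self, query_embedding: Vector, top_k: int = 3) -> List[Tuple[str, float]]:
        # Score every indexed image against the text query embedding
        # and return the top_k best matches, highest score first.
        scored = [
            (image_id, cosine_similarity(query_embedding, embedding))
            for image_id, embedding in self._embeddings.items()
        ]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:top_k]


# Toy vectors standing in for real CLIP embeddings (assumption for illustration).
index = ImageIndex()
index.add("beach.jpg", [0.9, 0.1, 0.0])
index.add("forest.jpg", [0.1, 0.9, 0.1])
index.add("city.jpg", [0.0, 0.2, 0.9])

query = [0.8, 0.2, 0.1]  # pretend embedding of the text "a sunny beach"
results = index.search(query, top_k=2)
for image_id, score in results:
    print(f"{image_id}: {score:.3f}")
```

Because CLIP maps text and images into the same vector space, the same similarity ranking works for free-text descriptions as well as keywords. A brute-force scan like this is only reasonable for small collections; a larger index would likely sit behind a vector store.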
Describe alternatives you've considered
Jina already does this, but since CLIP is in the latest Hugging Face release, it would be cool to have it here too
Additional context
I did some runs locally with my own photos and the results were amazing.
Describing images instead of just using keywords improves the performance massively; even special queries work fine.
But the biggest question I have is whether you want to have vision data in this framework or not?