Add support for images #2418

ZanSara · 2022-04-13T12:57:26Z

Problem
So far Haystack has focused strongly on text-only search. However, the same architecture is likely to be effective on other media, such as images.

This epic tracks the implementation of support for image indexing and retrieval in Haystack.

Process

Research on the topic
- Candidate models for image retrieval identified: CLIP and Data2Vec
- Data2VecVision has been chosen because sibling models exist for text and audio, which would allow us to have comparable embeddings across different document types later on (see ImageRetriever #2445).
Investigate and, if needed, adapt how Haystack loads models from HF.
- Simplify language_modeling.py and tokenization.py #2703
Create a primitive MultiModalRetriever that works in isolation (NOT in a pipeline) with at least one single document store, and add tests.
Make MultiModalRetriever work on all document stores, if possible and not too time-consuming.
- Verify support for image retrieval by all document stores #2866
Extend primitives (Document, Answer, etc.) to account for the new data type, and add tests. (~~WARNING: Big task!~~ Changes might not be as big as initially assumed)
- Generalize primitives #2867
Make the MultiModalRetriever work in a query pipeline, and add tests.
Make the MultiModalRetriever work in an indexing pipeline, and add tests.
Create a tutorial and some documentation
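
To make the "comparable embeddings across different document types" idea from the research step more concrete, here is a minimal sketch of retrieval over a mixed list of text and image documents. It is not Haystack code: `ToyDocument`, `cosine` and `retrieve` are hypothetical names, and the embeddings are hard-coded placeholders standing in for what a model such as Data2Vec or CLIP would produce.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyDocument:
    # Hypothetical stand-in for a document primitive that records its content type.
    content: str            # the text itself, or a file path for an image
    content_type: str       # "text" or "image"
    embedding: List[float]  # placeholder vector; a real one would come from a model


def cosine(a: List[float], b: List[float]) -> float:
    # Cosine similarity between two vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_embedding: List[float], documents: List[ToyDocument], top_k: int = 2
) -> List[Tuple[float, ToyDocument]]:
    # Score every document against the query, whatever its content type,
    # and return the top_k (score, document) pairs, best first.
    scored = [(cosine(query_embedding, d.embedding), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Made-up vectors: the first two are assumed to be "about cats", the third is not.
docs = [
    ToyDocument("photo_of_cat.jpg", "image", [0.9, 0.1, 0.0]),
    ToyDocument("A text about cats", "text", [0.8, 0.2, 0.1]),
    ToyDocument("photo_of_car.jpg", "image", [0.0, 0.1, 0.9]),
]
query = [1.0, 0.0, 0.0]  # assumed embedding of a query such as "cat"

results = retrieve(query, docs, top_k=2)
for score, doc in results:
    print(f"{score:.3f} {doc.content_type}: {doc.content}")
```

Because all vectors are assumed to live in the same space, one similarity function can rank images and text together, which is the property that motivated picking a model family with sibling models per modality.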

Later steps (not in order of importance, not blocking each other):

Implement training and evaluation on image retrieval task
- Training and evaluation for MultiModalRetriever #3410
Make MultiModalRetriever work on the REST API and/or make a separate demo for it (a mixed media one would be super-cool tho)
- Create a multi-modal demo #3623
Image-to-text conversion, mainly designed for indexing. Note: this is independent of the rest of the changes and could even be picked up now by some brave external contributor.
- ImageToText & AnswerToImage #2444


ZanSara added epic topic:images labels Apr 13, 2022

ZanSara self-assigned this Apr 13, 2022

This was referenced Apr 21, 2022

ImageToText & AnswerToImage #2444

Closed

ImageRetriever #2445

Closed

ZanSara added the type:feature New feature or request label Apr 21, 2022

julian-risch mentioned this issue May 5, 2022

Is it possible to add a code for Visual QA on Bar Charts, Pie Charts and Visualizations ? #2504

Closed

masci changed the title ~~Support for images~~ Add support for images Jul 14, 2022

masci assigned vblagoje Jul 20, 2022

This was referenced Jul 21, 2022

Verify compatibility between Data2VecVision models and existing retrievers #2865

Closed

Verify support for image retrieval by all document stores #2866

Closed

Generalize primitives #2867

Closed

feat: MultiModalRetriever #2891

Merged

masci removed the type:feature New feature or request label Jul 27, 2022

masci moved this to Q3 2022 in Haystack Public Roadmap Jul 27, 2022

masci added this to Haystack Public Roadmap Jul 27, 2022

masci added the epic:idle (Epic not yet started) and epic:in-progress (Epic is in progress) labels and removed the epic:idle label Jul 28, 2022

masci moved this from Q3 2022 to Q4 2022 in Haystack Public Roadmap Oct 3, 2022

masci moved this from Q4 2022 to Q3 2022 in Haystack Public Roadmap Oct 13, 2022

ZanSara closed this as completed Nov 24, 2022

masci removed the epic:in-progress Epic is in progress label Dec 19, 2022

anakin87 mentioned this issue Jan 11, 2023

CLIP semantic image search #1058

Closed
