Training and evaluation for `MultiModalRetriever` #3410

ZanSara · 2022-10-18T16:03:23Z

Currently there are no `train()` or `eval()` methods for `MultiModalRetriever`.

We should add them, taking inspiration from EmbeddingRetriever.

anakin87 · 2022-11-24T09:55:06Z

@ZanSara this proposal sounds very interesting and challenging!

If you would like to provide some resources/pointers/examples to get started with the implementation, I think that they would be very helpful for me and the other contributors 😃.

Do you think that this dataset could be useful to test the implementation or do you have better proposals?

ZanSara · 2022-11-24T10:44:00Z

Hey @anakin87! Nice one you've picked! I should start with a caveat: I'm not the most knowledgeable in the team about evaluation and training of models, so take my input with a grain of salt 😄

That said:

- The dataset looks fine to me! I can't say whether it's the best one, but it's a good start, so you can get going with it. There might be standard datasets to evaluate/train against: maybe @mayankjobanputra, @julian-risch or @vblagoje know where to look?
- MultiModalRetriever relies heavily on the sentence-transformers library. I believe that at least training should be supported by it. Have a look here:
- Similarly, sentence-transformers should be able to help you with the evaluation part.
  - https://www.sbert.net/docs/package_reference/evaluation.html
- Actually, BaseRetriever implements eval() already, so there's a possibility that the evaluation already sort of works. It needs proper testing though.
  - https://github.com/deepset-ai/haystack/blob/main/haystack/nodes/retriever/base.py#L118
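For reference, retriever evaluation of this kind usually comes down to ranking metrics such as recall@k and mean reciprocal rank. The sketch below is only an illustration of those two metrics in plain Python. It is not the code behind `BaseRetriever.eval()`, and the function names, query ids and document ids are all hypothetical.

```python
from typing import Dict, List, Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Return 1.0 if any relevant id appears in the top-k results, else 0.0."""
    return 1.0 if any(doc_id in relevant for doc_id in retrieved[:k]) else 0.0


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Return 1/rank of the first relevant id in the top-k results, or 0.0."""
    for rank, doc_id in enumerate(retrieved[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(run: Dict[str, List[str]], gold: Dict[str, Set[str]], k: int = 10) -> Dict[str, float]:
    """Average both metrics over all queries in `run`."""
    n = len(run)
    recall = sum(recall_at_k(run[q], gold[q], k) for q in run) / n
    mrr = sum(reciprocal_rank(run[q], gold[q], k) for q in run) / n
    return {"recall": recall, "mrr": mrr}


# Hypothetical text queries with ranked image ids returned by a retriever.
run = {"q1": ["img_3", "img_1", "img_7"], "q2": ["img_9", "img_4", "img_2"]}
gold = {"q1": {"img_1"}, "q2": {"img_5"}}
print(evaluate(run, gold, k=3))  # {'recall': 0.5, 'mrr': 0.25}
```

Because the metrics only look at ranked ids, the same check would apply whether the documents are text, tables or images.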

Have fun with this one 😁 Also, please open separate PRs for the different features in order to keep them small.

anakin87 · 2022-12-01T15:03:24Z

I was thinking about CLIP training/fine-tuning...

At the moment, sentence-transformers doesn't document how to approach this task, although a tutorial is a frequently requested feature (UKPLab/sentence-transformers#840 https://github.com/UKPLab/sentence-transformers/issues?q=is%3Aissue+is%3Aopen+clip).

I'm sure that even if not available out-of-the-box, the training/fine-tuning can be done using Transformers or other approaches (openai/CLIP#150).
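To make the task concrete: CLIP-style fine-tuning optimizes a symmetric contrastive loss over a batch of matching text/image pairs. The following is a minimal sketch of that loss value in plain Python, assuming a fixed temperature of 0.07 (CLIP learns this parameter) and toy 2-dimensional embeddings. The function names are hypothetical, and real training would need a differentiable framework rather than this.

```python
import math
from typing import List

Vector = List[float]


def normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-softmax of the target entry, computed in a stable way."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]


def clip_loss(text_embs: List[Vector], image_embs: List[Vector], temperature: float = 0.07) -> float:
    """Symmetric contrastive loss; pair i of texts matches pair i of images."""
    texts = [normalize(t) for t in text_embs]
    images = [normalize(img) for img in image_embs]
    # logits[r][c] = cosine similarity of text r and image c, divided by temperature
    logits = [[sum(a * b for a, b in zip(t, img)) / temperature for img in images] for t in texts]
    n = len(texts)
    # Text-to-image direction: each row should peak on its diagonal entry.
    text_to_image = sum(cross_entropy(logits[r], r) for r in range(n)) / n
    # Image-to-text direction: each column should peak on its diagonal entry.
    image_to_text = sum(cross_entropy([logits[r][c] for r in range(n)], c) for c in range(n)) / n
    return (text_to_image + image_to_text) / 2


aligned = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)  # True: matching pairs give a lower loss
```

The loss only requires that text and image embeddings live in the same vector space, which is why it is indifferent to the encoders that produced them.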

...but there is much more in this issue

Having studied MultiModalRetriever a bit, I see that it can, in principle, accept several input types, each with its own encoder. The constraint is that the sizes of the encoded vectors must match.

So I wonder: is there a unified and effective way to perform the training in such a heterogeneous set of situations?

To give a more focused scope to this issue, I'm curious to hear your opinions: @ZanSara @bogdankostic @julian-risch @vblagoje @mayankjobanputra...

vblagoje · 2022-12-02T08:20:54Z

@anakin87 my hunch, without proper investigation, is that training/fine-tuning such a model should be outside of Haystack's scope. These models are fine-tuned using accelerate/hf library setup, and I am hard-pressed to see a reason for adding such support in Haystack.

ZanSara · 2022-12-05T10:37:05Z

@vblagoje I originally added this issue for consistency with other Retrievers: if you see this as unfeasible, let's skip 👍

However, I believe evaluation is still valuable and should be implemented. WDYT?

vblagoje · 2022-12-05T13:36:11Z

@ZanSara I am not sure, tbh. I would say it is prudent to keep the eval interface we already have, as it seems to be more consistent than the training APIs. It's something we can talk about internally first as well.

anakin87 · 2024-02-06T15:42:01Z

Very broad topic; training is not a focus currently.
Closing as won't fix.

ZanSara added type:feature New feature or request Contributions wanted! Looking for external contributions topic:retriever labels Oct 18, 2022

ZanSara mentioned this issue Nov 24, 2022

Add support for images #2418

Closed

8 tasks

masci removed the Contributions wanted! Looking for external contributions label Dec 13, 2023

masci added the wontfix This will not be worked on label Mar 12, 2024

masci closed this as completed Mar 12, 2024
