Generalize primitives #2867

Closed
ZanSara opened this issue Jul 21, 2022 · 3 comments

Comments

@ZanSara
Contributor

ZanSara commented Jul 21, 2022

Context

  • Part of Add support for images #2418
  • Haystack primitives (data classes like Document, Answer, etc.) are among the basic building blocks of the library. They carry data through the pipeline from one node to the next, which keeps the pipeline modular.
  • These data classes, however, are heavily based on the assumption that the main "content" that they carry is text. This is especially visible in the interaction with document stores in indexing pipelines.
  • However, with support for different data types closing in (see Add support for images #2418, AnswerToSpeech #2584, feat: SpeechToDocument #2676), this assumption will soon become obsolete.
  • In fact, this brick wall was already hit in AnswerToSpeech #2584 (comment) and had to be worked around.

Goals

  • Implement fully "data type agnostic" primitives and a hierarchy of implementations for each supported data type
  • Adapt document stores to support first the text subclass only, then all of the subclasses
  • Test the new primitives throughout
  • Document the change
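
As a rough illustration of the first two goals, here is a minimal sketch of what a "data type agnostic" base primitive with per-type subclasses could look like, plus a store that accepts only the text subclass as a first migration step. All names here (BaseDocument, TextDocument, ImageDocument, TextOnlyStore) are hypothetical and are not Haystack's actual API; this only shows the shape of the idea.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Generic, List, TypeVar

ContentT = TypeVar("ContentT")


@dataclass
class BaseDocument(Generic[ContentT]):
    """Hypothetical base primitive: makes no assumption about the content type."""
    content: ContentT
    meta: Dict[str, Any] = field(default_factory=dict)
    content_type: str = "unknown"


@dataclass
class TextDocument(BaseDocument[str]):
    """Hypothetical text subclass: content is a plain string."""
    content_type: str = "text"


@dataclass
class ImageDocument(BaseDocument[Path]):
    """Hypothetical image subclass: content is a path to the image file."""
    content_type: str = "image"


class TextOnlyStore:
    """Hypothetical store for the first step: supports the text subclass only."""

    def __init__(self) -> None:
        self.docs: List[TextDocument] = []

    def write(self, doc: BaseDocument[Any]) -> None:
        # Reject every subclass the store has not been adapted for yet.
        if not isinstance(doc, TextDocument):
            raise TypeError(f"unsupported content type: {doc.content_type}")
        self.docs.append(doc)


store = TextOnlyStore()
store.write(TextDocument(content="hello", meta={"source": "example"}))
```

Nodes written against the base class would keep working for every subclass, while each store can opt in to new content types one at a time.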

Note

  • This will probably result in breaking changes
  • It's still unclear whether this big task can be split into smaller blocks that can be merged to master one at a time.
@baregawi
Contributor

Hello @ZanSara! I was wondering if I could be of any help in this. In particular, can I help implement and test some subtypes?

@ZanSara
Contributor Author

ZanSara commented Jul 22, 2022

Hello @baregawi! Thank you for volunteering, but this is a very delicate change which needs to be split into subtasks. I need to do some tests myself before planning further.

However, if I identify some tasks that are a good fit for an external contribution, I'll tag you in the next few days so you can contribute! Most likely some subtypes will indeed be a good fit, but again, it's a bit early to tell. Stay tuned 😊

@ZanSara
Contributor Author

ZanSara commented Nov 24, 2022

Closing, turned out not to be necessary.

ZanSara closed this as completed Nov 24, 2022