New Retriever API #619

Merged
merged 72 commits into from
Dec 13, 2023


AyushExel (Collaborator) commented Nov 10, 2023

I communicated these changes over Discord.

Why are these changes needed?

This PR creates a new Retriever API that makes it simpler to plug in different retrievers. It also removes the hardcoding of particular vector DBs and adds a centralized module that deals with optional vector DB dependencies.
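For illustration only, a centralized module for optional dependencies could follow the pattern below. This is a minimal sketch, not the code in this PR: the helper name `import_optional` and its `extra` argument are hypothetical, and only the standard-library `importlib` is used.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency lazily.

    Hypothetical helper for this sketch: `module_name` is the package to
    import and `extra` is a label for the install option that provides it.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a message that tells the user what is missing,
        # instead of failing at import time of the whole package.
        raise ImportError(
            f"'{module_name}' is required for this retriever. "
            f"Install it with the '{extra}' extra."
        ) from exc
```

With a helper like this, each retriever would import its vector DB client only when it is actually used, so users who never touch that backend do not need it installed.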

Retrieval functions were previously hardcoded. This PR introduces a new abstract Retriever interface with just three methods:
Retriever.init_db() - initializes the connection or client (does nothing in a serverless setting)
Retriever.ingest_data() - upserts data into the DB
Retriever.query() - runs queries
Any DB provider can implement these methods and register as a supported retriever for AutoGen.
Note: This PR will not break the existing examples/notebooks.
This is still a WIP, but I tested the current chromadb implementation and it works well with the existing examples.
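The three-method interface described above could be sketched as follows. The method names come from the description; the argument lists, return types, and the toy `InMemoryRetriever` are assumptions made for this example and are not the signatures used in the PR.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class Retriever(ABC):
    """Abstract interface; parameter names and types are assumed."""

    @abstractmethod
    def init_db(self) -> None:
        """Set up a connection/client, or do nothing if serverless."""

    @abstractmethod
    def ingest_data(self, documents: List[str]) -> None:
        """Upsert documents into the store."""

    @abstractmethod
    def query(self, text: str, top_k: int = 1) -> List[str]:
        """Return the top_k best-matching documents."""


class InMemoryRetriever(Retriever):
    """Toy backend (hypothetical) that ranks documents by word overlap."""

    def __init__(self) -> None:
        self._docs: Optional[List[str]] = None

    def init_db(self) -> None:
        self._docs = []

    def ingest_data(self, documents: List[str]) -> None:
        if self._docs is None:
            raise RuntimeError("call init_db() before ingest_data()")
        self._docs.extend(documents)

    def query(self, text: str, top_k: int = 1) -> List[str]:
        if self._docs is None:
            raise RuntimeError("call init_db() before query()")
        words = set(text.lower().split())
        # Rank by the number of words shared with the query text.
        ranked = sorted(
            self._docs,
            key=lambda doc: len(words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]
```

A real backend would replace the word-overlap ranking with an embedding search against its own store, while callers keep using the same three methods.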

Related issue number

Closes #586, #416, and a few others

Checks

thinkall (Collaborator) commented

Thank you, @AyushExel. I agree with the design you proposed. Users will be able to switch between vector DBs simply by passing a different key.
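One way to picture selecting a backend by key is a small registry, sketched below. The names `RETRIEVERS`, `register_retriever`, `get_retriever`, and `DummyRetriever` are hypothetical and chosen for this example; the PR may register retrievers differently.

```python
from typing import Callable, Dict, List

# Maps a user-facing key to a factory that builds a retriever (assumed design).
RETRIEVERS: Dict[str, Callable[[], object]] = {}


def register_retriever(key: str):
    """Class decorator that records a retriever under the given key."""
    def decorator(cls):
        RETRIEVERS[key] = cls
        return cls
    return decorator


def get_retriever(key: str):
    """Instantiate the retriever registered under `key`."""
    try:
        factory = RETRIEVERS[key]
    except KeyError:
        raise ValueError(
            f"Unknown retriever {key!r}; available: {sorted(RETRIEVERS)}"
        ) from None
    return factory()


@register_retriever("dummy")
class DummyRetriever:
    """Placeholder backend used only to demonstrate the lookup."""

    def query(self, text: str) -> List[str]:
        return []
```

Switching backends then comes down to changing the string passed to `get_retriever`.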

Should we put the "retriever" folder in "contrib" as well? We could even move retrieve_utils.py into the retriever folder.

What do you think? @sonichi

@thinkall thinkall changed the base branch from main to refactor_rag December 13, 2023 06:31
@thinkall thinkall merged commit a2f4461 into microsoft:refactor_rag Dec 13, 2023
72 of 84 checks passed
@thinkall thinkall mentioned this pull request Dec 13, 2023
3 tasks
jackgerrits pushed a commit that referenced this pull request Oct 2, 2024
* Fix outstanding_tasks and cancellation bugs

* formatting

* Use separate except blocks

---------

Co-authored-by: Eric Zhu
Labels
rag retrieve-augmented generative agents
Development

Successfully merging this pull request may close these issues.

Math Chat using LanceDb Integration
7 participants