[Roadmap] RAG #1657
Comments
Hello, I was wondering if there is support coming for pgvector; if not, I would be happy to contribute.
Hi @Knucklessg1 , contributions are welcome, thank you for your interest!
Thank you @WaelKarkoub , interesting idea! Would adding … mean RAG is already performed?
@thinkall we could define what that tag means by adding attributes (e.g. …)
Great initiative @thinkall.
Thank you @ChristianWeyer , I hope to support as many established features in the OSS as possible, so we're carefully thinking about the new design of the RAG feature in AutoGen. Would you like to share your thoughts?
One thing is that there are so many connector & retriever implementations in LangChain that it would not make sense to reinvent the wheel and try to keep up. The same goes for embedding support.
Agree! Would you like to have a quick chat on this? It would be great to hear more from you!
@thinkall Will the upcoming RAG update still require using …?
Sure. I am cethewe in AG Discord.
Hi @Knucklessg1 , are you in our Discord channel? Could we have a quick chat? Thanks.
Hi @dsalas-crogl , I'd like to remove the usage of … Are you in our Discord channel?
Yes absolutely. I reached out on Discord.
@thinkall is there any flow diagram regarding the RAG?
Interesting roadmap, and I'm very happy with chromadb; looking forward to an in-memory vector store too. If anyone is interested, it could be a good opportunity to collaborate and break down complex tasks. I'll also consider creating and sharing an "advanced upsert" agent, which enriches the text chunks to improve retrieval performance (a rough sketch of that idea is below).
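To make the "advanced upsert" idea a bit more concrete, here is a rough, hypothetical sketch of enriching chunks before upserting them into chromadb. The enrich_chunk and summarize helpers, the collection name, and the file name are all placeholders for illustration, not anything that exists in AutoGen today.

```python
# Hypothetical sketch: enrich chunks with extra context before upserting.
# `summarize` is a placeholder for any LLM call that returns a short summary.
import chromadb

def summarize(text: str) -> str:
    # Placeholder: call your preferred LLM here.
    return text[:200]

def enrich_chunk(chunk: str, source: str) -> str:
    # Prepend a generated summary and the source so the embedding carries
    # more retrieval signal than the raw chunk alone.
    return f"Source: {source}\nSummary: {summarize(chunk)}\n\n{chunk}"

client = chromadb.Client()  # in-memory instance
collection = client.get_or_create_collection("enriched_docs")

chunks = ["...chunk one...", "...chunk two..."]  # produced by your splitter
collection.upsert(
    ids=[f"doc-{i}" for i in range(len(chunks))],
    documents=[enrich_chunk(c, source="manual.md") for c in chunks],
    metadatas=[{"source": "manual.md"} for _ in chunks],
)
```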
Are RAG applications limited to document processing, or do they extend to code-related tasks as well? For instance:
Use case:
RAG can help if the documents are well organized. For pure code, currently, you can use third-party code chunking methods to help load the code into the vector DBs (a rough sketch is shown below).
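As one possible illustration (not AutoGen-specific), here is a small sketch of chunking source code with a language-aware splitter from LangChain and loading the chunks into a chromadb collection. The file path, collection name, and parameter values are illustrative only.

```python
# Sketch: chunk Python source with a code-aware third-party splitter,
# then load the chunks into a vector DB (chromadb here).
import chromadb
from langchain_text_splitters import Language, RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON, chunk_size=512, chunk_overlap=64
)

with open("my_module.py") as f:  # illustrative path
    source = f.read()

chunks = splitter.split_text(source)

client = chromadb.Client()
collection = client.get_or_create_collection("codebase")
collection.add(
    ids=[f"my_module-{i}" for i in range(len(chunks))],
    documents=chunks,
    metadatas=[{"path": "my_module.py"} for _ in chunks],
)
```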
Let's also add a documentation task to the roadmap? We should have a RAG category under https://microsoft.github.io/autogen/docs/topics
Do we also have a task on the roadmap for using custom embeddings? This is a vital requirement for successful RAG.
Custom embeddings are already supported and will also be supported in the new version. Re-ranking may also be supported, but we may not implement the algorithms ourselves; instead, we could support plugging in different re-ranking models (see the sketch below).
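For reference, a minimal sketch of plugging a custom embedding function into the current RetrieveChat agent via retrieve_config might look like the following. Exact config keys and import paths can differ between AutoGen versions, and the model name and docs_path are placeholders.

```python
# Sketch: pass a custom embedding function to RetrieveUserProxyAgent via
# retrieve_config. The SentenceTransformer model name is just an example.
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent
from chromadb.utils import embedding_functions

custom_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="multi-qa-mpnet-base-dot-v1"
)

ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    retrieve_config={
        "task": "qa",
        "docs_path": "./docs",            # placeholder: your documents
        "embedding_function": custom_ef,  # custom embeddings instead of the default
    },
)
```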
@thinkall Is it possible for you to provide an example of a third-party code chunking method? I'm very interested in extending the knowledge of an agent to my whole codebase :)
I've got some thoughts on how we could use agents to automate code generation.
+1 for Maxime, it would be really helpful to include a section on code chunking with examples to illustrate how it works.
Now, when it comes to generating code, agents shouldn't just spit out code; they should mimic the way engineers think and work in real life.
Here's what I mean:
- Start by clearly defining the requirements: think OKRs that cascade from key results down to epics and stories.
- Set up milestones and break down tasks among the agents involved.
- Have each agent carry out their tasks and check in regularly to ensure everything's on track, with room for human oversight when needed.
- Provide visibility of OKRs, milestones, and tasks in one place. Make the planner central for agent and human collaboration and progress tracking.
- Keep each agent's execution isolated (in a separate process), perhaps as a distributed workflow where each node can host one or more agents, orchestrated through a central plan contributed to by agents and humans.
For the code workflow, we could see something like this:
- Set up a new GitHub project with a clean, well-structured setup and isolated code (for new projects).
- Handle resource creation, both on-premises and in the cloud.
- Back it all up with a CI/CD system tailored for both on-premises and cloud environments.
- Support incremental code commits through PRs with CI/CD.
These steps would be helpful both for rolling out new features or fixes to existing projects and for starting fresh ones. I managed to get a mini reference implementation of a distributed key-value store (80%) using ChatGPT (GPT-4) and was able to build, test, and run the services locally (screenshot attached). I was experimenting with AutoGen to reproduce the steps that I followed and see if I can achieve a decent level of autonomy (I am sure it will take many iterations :) ). I am still learning and experimenting. I will share my findings as I make progress.
Thanks for all the great work and support.
Regards
lnr
A solution for this has just been released.
Hi @maximedupre , please check out an example of using a third-party chunking method here: https://microsoft.github.io/autogen/blog/2023/10/18/RetrieveChat/#customizing-text-split-function (a minimal sketch of the same idea is shown below).
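For readers who don't want to click through, the linked example boils down to passing your own split function through retrieve_config. The sketch below assumes a LangChain splitter; docs_path and the separator choices are placeholders, and the config key may change in the refactored RAG design discussed in this issue.

```python
# Sketch: wire a third-party splitter into RetrieveChat via
# `custom_text_split_function`, as described in the linked blog post.
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent
from langchain_text_splitters import RecursiveCharacterTextSplitter

recur_splitter = RecursiveCharacterTextSplitter(separators=["\n", "\r", "\t"])

ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    retrieve_config={
        "task": "qa",
        "docs_path": "./docs",  # placeholder: your documents
        "custom_text_split_function": recur_splitter.split_text,
    },
)
```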
Any plan to integrate with GraphRAG?
Some of my team members recently started using AutoGen with RAG and are interested in contributing, but it is unclear what you're actively working on and what tasks you need help with. It'd be great to collaborate if there's good alignment between this work stream and Holistic Intelligence for Global Good.
We're working on it.
Thank you very much, @wammar, for your feedback! The task list contains RAG-related issues and PRs we're working on. You're very welcome to raise PRs for resolving existing issues, propose new features such as new vector DBs or new retrieve util functions (file parsing, chunking, etc.), or review PRs. Any thoughts, suggestions, or comments are very welcome.
Is there any progress on it?
How will the RAG pattern change with the new v0.4 architecture?
Why RAG
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of LLMs by incorporating a retrieval mechanism into the generative process. This approach allows the model to leverage a vast amount of relevant information from a pre-existing knowledge base, which can significantly improve the quality and accuracy of its generated responses. Thus, for agent chats, incorporating a RAG agent offers several compelling advantages that can enhance the performance and utility of your agent system.
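To make the idea concrete, here is a minimal, library-agnostic sketch of the retrieve-then-generate loop. chromadb is used only as an example store, and generate_answer stands in for whatever LLM call you use; none of this is AutoGen-specific.

```python
# Minimal RAG sketch: index documents, retrieve the top-k relevant chunks
# for a query, and feed them to an LLM as grounding context.
import chromadb

client = chromadb.Client()
collection = client.get_or_create_collection("knowledge_base")
collection.add(
    ids=["doc1", "doc2"],
    documents=[
        "AutoGen is a framework for building multi-agent LLM applications.",
        "RetrieveChat augments agent conversations with retrieved context.",
    ],
)

def generate_answer(prompt: str) -> str:
    # Placeholder for an LLM call (OpenAI, local model, etc.).
    return "..."

query = "What is RetrieveChat?"
results = collection.query(query_texts=[query], n_results=2)
context = "\n".join(results["documents"][0])

prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {query}"
)
answer = generate_answer(prompt)
```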
RAG in AutoGen
AutoGen has provided RetrieveUserProxyAgent and RetrieveAssistantAgent for performing RetrieveChat since August 2023, and announced them in a blog post in October 2023. Given a set of documents, the Retrieval-Augmented User Proxy first automatically processes the documents: it splits, chunks, and stores them in a vector database. Then, for a given user input, it retrieves relevant chunks as context and sends them to the Retrieval-Augmented Assistant, which uses an LLM to generate code or text to answer the question. The agents converse until they find a satisfactory answer.
As both AutoGen and RAG are evolving very fast, we find that many users are asking for support for customized vector databases, incremental document ingestion, customized retrieval/re-ranking algorithms, customized RAG patterns/workflows, etc. We've addressed some of these issues and feature requests; for example, we've added QdrantRetrieveUserProxyAgent for using Qdrant as the vector DB, and we've integrated Unstructured to support many unstructured document formats. However, there is much more to do.
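A minimal usage sketch of the current RetrieveChat pair, roughly following the blog post, is shown below. Import paths, retrieve_config keys, and the initiate_chat signature vary across AutoGen versions, and config_list and docs_path are placeholders.

```python
# Sketch: RetrieveChat with a Retrieval-Augmented User Proxy and Assistant.
from autogen.agentchat.contrib.retrieve_assistant_agent import RetrieveAssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

config_list = [{"model": "gpt-4", "api_key": "YOUR_KEY"}]  # placeholder

assistant = RetrieveAssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config={"config_list": config_list},
)

ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    retrieve_config={
        "task": "qa",
        "docs_path": "./docs",  # placeholder: your documents
    },
)

assistant.reset()
# The user proxy retrieves relevant chunks and sends them as context.
ragproxyagent.initiate_chat(assistant, problem="What is AutoGen?")
```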
Our Plan
In order to better support RAG in AutoGen, we plan to refactor the existing RetrieveChat agents. The goals include:
Primary goals
Optional goals
Tasks
- overlap parameter in the split_text_to_chunks not used. #1844
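For context on what the overlap parameter is meant to do, here is a generic sketch of overlap-aware chunking. This is not AutoGen's implementation; token counting is simplified to whitespace splitting, and the function name and defaults are illustrative.

```python
# Generic sketch of chunking with overlap: consecutive chunks share
# `overlap` tokens so context is not cut off at chunk boundaries.
def split_with_overlap(text: str, max_tokens: int = 128, overlap: int = 16) -> list[str]:
    tokens = text.split()  # simplified "tokenizer"
    chunks = []
    step = max(max_tokens - overlap, 1)
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start : start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break
    return chunks

chunks = split_with_overlap("some long document " * 100, max_tokens=32, overlap=8)
```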