Refactorization of ConversableAgent to unify async and sync code and better extensibility #1240
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## main #1240 +/- ##
===========================================
+ Coverage 32.48% 49.48% +17.00%
===========================================
Files 41 49 +8
Lines 4907 5252 +345
Branches 1120 1238 +118
===========================================
+ Hits 1594 2599 +1005
+ Misses 3187 2485 -702
- Partials 126 168 +42
I'm going to be honest, even as a mid-level developer, I can't understand a word of the example code. Aren't we making AutoGen easy for everyone to use, regardless of their skill level? The basic AutoGen code is fairly simple: set up a model, instantiate an agent, initiate that agent chat. Where do these middlewares fit into that process?
Middleware is meant for framework developers, not application developers. It does not change the interface of AutoGen; rather, it changes the backend and how the "under-the-hood" code is written. You can read the PR description for the motivation.
Currently in AutoGen, each incoming message is handled by a pipeline of registered reply functions. Each reply function is guarded by a trigger function. If a reply function is triggered and signals that its reply is final, the pipeline stops and that reply is returned. This design pattern is described in the AutoGen paper (https://arxiv.org/pdf/2308.08155.pdf, Section 2). This PR is a refactor: we convert each reply function into a middleware class, so it can better handle state such as code executors, message history, etc.
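As a rough illustration of the reply-function pipeline described above, here is a minimal sketch. The names ReplyPipeline, register_reply, command_reply and echo_reply are hypothetical and the signatures are simplified; this is not AutoGen's actual code, only the general shape of "trigger, reply, stop when final".

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical, simplified types for illustration only.
Message = dict
Trigger = Callable[[Message], bool]
ReplyFunc = Callable[[Message], Tuple[bool, Optional[str]]]


class ReplyPipeline:
    """Runs registered reply functions in order until one gives a final reply."""

    def __init__(self) -> None:
        self._reply_funcs: List[Tuple[Trigger, ReplyFunc]] = []

    def register_reply(self, trigger: Trigger, reply_func: ReplyFunc) -> None:
        self._reply_funcs.append((trigger, reply_func))

    def generate_reply(self, message: Message) -> Optional[str]:
        for trigger, reply_func in self._reply_funcs:
            if not trigger(message):
                continue  # this reply function does not apply to the message
            final, reply = reply_func(message)
            if final:
                return reply  # a final reply stops the pipeline
        return None


def command_reply(message: Message) -> Tuple[bool, Optional[str]]:
    # Only "/help" is handled; anything else falls through to later functions.
    if message["content"] == "/help":
        return True, "commands: /help"
    return False, None


def echo_reply(message: Message) -> Tuple[bool, Optional[str]]:
    return True, f"echo: {message['content']}"


pipeline = ReplyPipeline()
pipeline.register_reply(lambda m: m["content"].startswith("/"), command_reply)
pipeline.register_reply(lambda m: True, echo_reply)
```

Because each reply function here is a bare function, any state it needs (a code executor, a message history) has to live somewhere else, which is the limitation the middleware classes are meant to address.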
@Tylersuard Apologies if this PR confused you. Shouldn't have routed you here 😄 . If you are interested in contributing new code, don't worry about this PR.
Closing this PR. We have learned a lot working on this branch. Time to use what we have learned toward a series of PRs that will incrementally move the code base toward extensibility.
@Tylersuard as @ekzhu said, the idea was to refactor the code using a well-known Middleware pattern instead of a currently used ad-hoc mechanism with reply functions. The Middleware pattern was explored here because of its popularity in frameworks such as Starlette (https://www.starlette.io/middleware/) and FastAPI (https://fastapi.tiangolo.com/tutorial/middleware/) where it is the standard way of extending the framework by non-experts. The main idea is to have a self-contained Middleware class that can be independently tested and then integrated with other middleware functions to implement a required combination of functionalities. E.g. if you wish to add observability to your agents using OpenTelemetry, you would only need to implement a Middleware subclass and attach it to an Agent. As I said, that is a well-known and battle-proven mechanism in the most popular frameworks used by millions of junior and mid-level developers. Implementation of the mechanism is not the simplest one, but using it is easier than using the mechanism with reply functions available today.
@davorrunje Thank you for your response, I understand, and I am glad you are interested in simplifying the experience for developers. Could you give some simple example code for a real-world application for the middleware pattern you are proposing? For example, how would I make an agent save its output to a text document instead of printing it to the terminal? How would I modify the below code to incorporate the text-saving middleware? Thank you.
Note: this is a refactorization study and will change a lot as we explore different design patterns and how they fit together in the framework. Feedback and suggestions are more than welcome.
Why are these changes needed?
The current state.
Several issues and PRs involve how to extend the ConversableAgent class. Efforts like Teachability for every agent (#534) and the concept of modularized agent capability are a big step toward solving this from a higher level. There are still low-level extension issues such as:
Some patches have been made to address those issues, but not the root cause: ConversableAgent was mostly designed to work in a console environment rather than as a server-side library, and its async and sync public methods duplicate code.
What is this PR for.
The main goal of this PR is to address the above and at the same time introduce a design pattern that would make it much easier to add low-level functionality like logging, content filtering, RAG-style context expansion, custom termination mechanisms, and many others. We also want to demonstrate that high-level capabilities like Teachability can be composed of reusable low-level components.
What is this PR NOT for:
This PR is not for breaking the existing ConversableAgent methods. The ConversableAgent has many great methods like register_for_llm and initiate_chat that are loved by users.
What can you do to help:
This is not a small change, so we want to have as much feedback AND HELP as possible. We will post some work items on this page as we move forward, and you are welcome to contribute!
Update 01/16/2024
@ekzhu and I came up with a new scheme which introduces a single new function/decorator, hookable, and implements the Middleware pattern. An example of how to use it is as follows:
There can be more than one hookable method in each class. We can use this to implement reply and hook functions and probably many other things.
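To make the idea concrete, here is a minimal sketch of how a hookable-style decorator could work. Everything in it is an assumption for illustration: the add_middleware attribute, the call(..., next) signature, and the Greeter, ShoutMiddleware and TrimMiddleware classes are hypothetical and are not taken from this branch. For simplicity the sketch registers middleware per method, shared by all instances.

```python
import functools
from typing import Any, Callable, List


def hookable(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical sketch: let middleware objects wrap calls to a method."""
    middlewares: List[Any] = []

    @functools.wraps(func)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        # The innermost callable is the original method bound to this instance.
        call_next: Callable[..., Any] = functools.partial(func, self)
        # Wrap from the last registered middleware to the first, so the
        # first registered middleware runs outermost.
        for mw in reversed(middlewares):
            call_next = functools.partial(mw.call, next=call_next)
        return call_next(*args, **kwargs)

    # Simplification: one middleware list per method, shared by all instances.
    wrapper.add_middleware = middlewares.append  # type: ignore[attr-defined]
    return wrapper


class Greeter:
    @hookable
    def greet(self, name: str) -> str:
        return f"Hello, {name}!"


class ShoutMiddleware:
    def call(self, name: str, next: Callable[[str], str]) -> str:
        # Post-process the result produced further down the chain.
        return next(name).upper()


class TrimMiddleware:
    def call(self, name: str, next: Callable[[str], str]) -> str:
        # Pre-process the argument before passing it on.
        return next(name.strip())


Greeter.greet.add_middleware(ShoutMiddleware())
Greeter.greet.add_middleware(TrimMiddleware())
result = Greeter().greet("  Ada ")
```

Each middleware decides whether to change the arguments, change the result, or skip next entirely, which is what makes the same mechanism usable for both reply functions and hooks.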
Update 01/17/2024
@tyler-suard-parker @joshkyh @bitnom @jackgerrits @rickyloynd-microsoft
You are welcome to try out this branch. We are currently working on replacing register_hook; then we will move on to refactoring the existing generate_***_reply functions in the ConversableAgent class into middleware and upgrading the generate_reply method to use the middleware, so the current functionality stays the same.
But we need someone to think about how to implement some of these new features using middleware, and add them to the generate_reply to enable new functionalities.
Here is a simple example of logging middleware that logs incoming and outgoing messages.
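A minimal sketch of what such a logging middleware could look like, assuming a hypothetical call(message, next) signature in which next produces the reply. The class name, the logger name, and the echo_agent stand-in are illustrative and not the code from this branch.

```python
import logging
from typing import Callable

logger = logging.getLogger("middleware.sketch")


class LoggingMiddleware:
    """Logs the incoming message and the reply produced further down the chain."""

    def call(self, message: str, next: Callable[[str], str]) -> str:
        logger.info("incoming: %s", message)
        reply = next(message)
        logger.info("outgoing: %s", reply)
        return reply


def echo_agent(message: str) -> str:
    # Stand-in for the agent's real reply generation.
    return f"echo: {message}"


logging.basicConfig(level=logging.INFO)
reply = LoggingMiddleware().call("hi", next=echo_agent)
```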
Here is another for filtering out OpenAI API keys.
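A minimal sketch of a key-filtering middleware, under the same hypothetical call(message, next) signature. The regular expression is an assumption: it treats any "sk-" prefix followed by at least 20 letters, digits, underscores or hyphens as a key, and real key formats may differ.

```python
import re
from typing import Callable

# Assumed key shape for illustration; adjust to the formats you need to catch.
API_KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_-]{20,}")
REDACTED = "[REDACTED]"


class RedactKeysMiddleware:
    """Removes key-like strings from both the incoming message and the reply."""

    def call(self, message: str, next: Callable[[str], str]) -> str:
        cleaned = API_KEY_PATTERN.sub(REDACTED, message)
        reply = next(cleaned)
        return API_KEY_PATTERN.sub(REDACTED, reply)


def echo_agent(message: str) -> str:
    # Stand-in for the agent's real reply generation.
    return f"echo: {message}"


fake_key = "sk-" + "a" * 24
reply = RedactKeysMiddleware().call(f"my key is {fake_key}", next=echo_agent)
```

Filtering on both sides of next means a key cannot reach the model and cannot leak back out in a reply.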
Another one for simple RAG-style context expansion:
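A minimal sketch of context expansion as middleware, again assuming a hypothetical call(message, next) signature. Retrieval here is a naive word-overlap ranking over an in-memory list, chosen only to keep the example self-contained; a real setup would more likely query a vector store.

```python
from typing import Callable, List


class ContextExpansionMiddleware:
    """Prepends the best-matching documents to the message before passing it on."""

    def __init__(self, documents: List[str], top_k: int = 1) -> None:
        self._documents = documents
        self._top_k = top_k

    def _retrieve(self, query: str) -> List[str]:
        query_words = set(query.lower().split())
        # Score each document by the number of words it shares with the query.
        scored = [
            (len(query_words & set(doc.lower().split())), doc)
            for doc in self._documents
        ]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [doc for score, doc in ranked[: self._top_k] if score > 0]

    def call(self, message: str, next: Callable[[str], str]) -> str:
        context = self._retrieve(message)
        if context:
            message = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + message
        return next(message)


docs = [
    "AutoGen agents exchange messages.",
    "Paris is the capital of France.",
]
expander = ContextExpansionMiddleware(docs)
# Using the identity function as next shows exactly what the agent would receive.
expanded = expander.call("What is the capital of France?", next=lambda m: m)
```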
We are also adding a decorator that converts a function into a middleware, saving users the effort of writing a class.
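A minimal sketch of how such a decorator could be written. The decorator name middleware, the generated FunctionMiddleware class, and the keyword-only next parameter are assumptions for illustration, not the API in this branch.

```python
from typing import Any, Callable


def middleware(func: Callable[..., Any]) -> type:
    """Hypothetical sketch: turn func(..., next=...) into a middleware class."""

    class FunctionMiddleware:
        def call(self, *args: Any, next: Callable[..., Any], **kwargs: Any) -> Any:
            # Delegate straight to the decorated function.
            return func(*args, next=next, **kwargs)

    # Keep the function's name so the generated class is easy to identify.
    FunctionMiddleware.__name__ = func.__name__
    return FunctionMiddleware


@middleware
def add_signature(message: str, next: Callable[[str], str]) -> str:
    return next(message) + "\n-- sent by agent"


mw = add_signature()  # the decorated name is now a class; instantiate it
reply = mw.call("hi", next=lambda m: f"echo: {m}")
```

The function form suits stateless middleware; anything that needs to keep state between calls is still easier to write as a class.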
More updates 01/17/2024
Teachability is refactored using the Middleware pattern instead of hooks. This is the actual implementation right now:
Whenever ConversableAgent.process_last_message_user_text is called, ProcessLastMessageMiddleware.call is invoked and a wrapper to the original ConversableAgent.process_last_message_user_text is passed as the next parameter. All typing hints here are optional; they are present only to help understand what the expected parameters are.
There is some cleanup and error handling remaining, but this is basically it. As Erik mentioned above, it is easy to write a set of standard Middleware that covers the most common use cases.
Update 01/18/2024
Middleware registration methods add_middleware and set_middleware are refactored to be attached to bound methods as suggested by @LittleLittleCloud and @ekzhu.
Update 01/19/2024
The code was internally refactored so it accurately uses the signature of a decorated function in call() methods of a Middleware class. Another change is adding an a_call method to Middleware classes and removing the trigger method. Having both call() and a_call allows for the most efficient implementation. Decorators for automatically generating call from a_call and vice versa will be added shortly, so we'll still be able to mix sync/async styles if we are willing to pay the price in reduced performance. Examples of Middleware classes were added to the ConversableAgent. Here is a simple one performing logging:
All the tests are passing and we are ready to start refactoring the ConversableAgent class.
Update 01/20/2024
Created the following middleware:
ToolUseMiddleware
LLMMiddleware
CodeExecutionMiddleware
TerminationAndHumanReplyMiddleware
MessageStoreMiddleware
TeachabilityMiddleware
See autogen/middleware and contrib/capability/teachability.py
Refactored ConversableAgent by composing it using the middleware above. All public methods are backward-compatible.
Fixed some tests. The failing tests should be easy to fix.
Next step:
call(...) method signatures.
Update 01/26/2024
Async/sync mixing works now in all cases
async/sync calls.
a_initialize_chat now.
Quality improvements
All tests are passing now. Code coverage was significantly improved with the goal of having over 90% code covered by tests. Type annotations are fixed and mypy reports no errors in the autogen/agentchat/middleware and test/agentchat/middleware folders.
Next steps:
call(...) method signatures.
ConversableAgent using middleware instead of subclassing.
Related issue number
Checks