Refactorization of ConversableAgent to unify async and sync code and better extensibility #1240

davorrunje · 2024-01-14T02:37:55Z

Note: this is a refactorization study and will change a lot as we explore different design patterns and how they fit together in the framework. Feedback and suggestions are more than welcome.

Why are these changes needed?

The current state.

Several issues and PRs involve how to extend the ConversableAgent class. Effort like Teachability for every agent (#534) and the concept of modularized agent capability is a big step toward solving this from a higher level. There are still low-level extension issues such as:

Logging Logging #1146
Streaming message to frontend use sockets rather than print messages to terminal #394 Using sockets rather than print #1290 How can I stream to a frontend using SSE? #1313
Several bugs caused by async and sync code not "synced" in code changes like run sync and async in async #1242 and issues related to unexpected behaviors of using async public methods ([Bug]: when i give agent function,there is the errorline 898, in _request raise self._make_status_error_from_response(err.response) from None openai.NotFoundError: Error code: 404 - {'error': {'message': 'Unrecognized request argument supplied: functions (request id: 20231218185632700815098gdho4iSo)', 'type': 'invalid_request_error', 'param': '', 'code': None}} #1012)

While some patches have been made to address those issues but not the root cause, which is that the ConversableAgent was mostly designed to work in a console environment but not yet a server-side library, and the code duplication in async and sync public methods.

What is this PR for.

The main goal of this PR is to address the above and at the same time introducing a design pattern that would make it much easier to add low-level functionalities like logging, content filtering, RAG-style context expansion, and custom termination mechanism and many others. We also want to demonstrate that high-level capabilities like Teachability can be composed of re-usable low-level components.

What is this PR NOT for:

This PR is not for breaking the existing ConversableAgent methods. The ConversableAgent has many great methods like the register_for_llm and initiate_chat that are loved by users.

What can you do to help:

This is not a small change, so we want to have as many feedback AND HELP as possible. We will post some work items on this page as we move forward, and you are welcome to contribute!

Update 01/16/2024

@ekzhu and I came up with the new scheme which introduces a single new function/decorator hookable and implements Middleware pattern. An example of how to use it is as follows:

class A:
    def __init__(self, name: str) -> None:
        self.name = name

    @hookable
    def go(self, *args: Any, **kwargs: Any) -> str:
        return f"{self.name}.{format_function(self.go, *args, **kwargs)}"

class MyMiddleware:
    def __init__(self, name: str) -> None:
        self.name = name

    def call(self, *args: Any, next: Callable[..., Any], **kwargs: Any) -> str:
        retval = next(*args, **kwargs)
        return f"{self.name}.{format_function(self.call, retval)}"

    def trigger(self, *args: Any, **kwargs: Any) -> bool:
        return not ("skip_middleware" in kwargs and kwargs["skip_middleware"])

a = A("a")
add_middleware(A.go, MyMiddleware("mw"))

assert a.go(1, 2, 3, a=4, b=5) == "mw.call(a.go(1, 2, 3, a=4, b=5))"
assert a.go(1, 2, 3, a=4, b=5, skip_middleware=False) == "mw.call(a.go(1, 2, 3, a=4, b=5, skip_middleware=False))"
assert a.go(1, 2, 3, a=4, b=5, skip_middleware=True) == "a.go(1, 2, 3, a=4, b=5, skip_middleware=True)"

add_middleware(A.go, MyMiddleware("MW"))
assert a.go(1, 2, 3, a=4, b=5) == "mw.call(MW.call(a.go(1, 2, 3, a=4, b=5)))"

There can be more than one hookable method in each class. We can use this to implement reply and hook functions and probably many other things.

Update 01/17/2024

@tyler-suard-parker @joshkyh @bitnom @jackgerrits @rickyloynd-microsoft

You are welcome to try out this branch. We are currently working on replacing register_hook for now then we will move on to refactor existing generate_***_reply functions in the ConversableAgent class into a middleware and upgrade the generate_reply method to use the middleware -- so the current functionalities stay the same.

But we need someone to think about how to implement some of these new features using middleware, and add them to the generate_reply to enable new functionalities.

Incoming and outgoing message streaming to web socket
Incoming and outgoing message logging
RAG-style message context expansion, e.g., retrieve relevant context from a vector database and expand the incoming message's content.
Human middleware to short-circuit the rest of the middleware pipeline -- think a static file middleware in a web framework.
Out-going message filtering. e.g., filtering api keys, passwords, etc.

Here is a simple example of logging middleware that logs incoming and outgoing messages.

class LoggingMiddleware:
  def call(self, message: Dict, next: Callable[[Dict], Dict]) -> Dict:
    logging.info(f"Incoming: {retval}")
    retval = next(message)
    logging.info(f"Outgoing: {retval}")
    return retval

Here is another for filtering out OpenAI API keys.

class FilterAPIKeyMiddleware:
  def call(self, message: Dict, next: Callable[[Dict], Dict]) -> Dict:
    retval = next(message)
    if retval.get("content", False):
      re.sub(r'(sk-\w{4})\w+', r'\1***', retval["content"])
    return retval

Another one for simple RAG-style context expansion:

class RAGMiddleware:
  def call(self, message: Dict, next: Callable[[Dict], Dict]) -> Dict:
    if message.get("content", False):
      # Expand the message content with some text retrieved from vector db.
      expansion = vectordb.search(message["content"], k=1)[0]
      message["content"] += expansion
    retval = next(message)
    return retval

We are also adding a decorator that would convert a function into a middleware, saving user the effort to write a class.

More updates 01/17/2024

Teachability is refactored using the Middleware pattern instead of hooks. This is the actual implementation right now:

def add_to_agent(self, agent: ConversableAgent):
    """Adds teachability to the given agent."""
    self.teachable_agent = agent

    # Register a middleware for processing the last message.
    class ProcessLastMessageMiddleware:
        def __init__(self, *, agent: ConversableAgent, teachability: Teachability):
            self.teachability = teachability
            self.agent = agent

        def call(self, agent: ConversableAgent, user_text: str, *, next: Callable[[str], str]):
            user_text = next(agent, user_text)
            return self.teachability.process_last_message(user_text)

        def trigger(self, agent: ConversableAgent, user_text: str):
            return self.agent == agent

    add_middleware(
        ConversableAgent.process_last_message_user_text,
        ProcessLastMessageMiddleware(agent=agent, teachability=self),
    )

Whenever ConversableAgent.process_last_message_user_text is called, the ProcessLastMessageMiddleware.call is invoked and a wrapper to the original ConversableAgent.process_last_message_user_text is passed as the next parameter. All typing hints here are optional, they are present only to help understand what the expected parameters are.

There is some cleanup and error handling remaining, but this is basically it. As Erik mentioned above, it is easy to write a set of standard MIddleware that covers the most common use cases.

Update 01/18/2024

Middleware registration methods add_middleware and set_middleware are refactored to be attached to bounded methods as suggested by @LittleLittleCloud and @ekzhu.

class A:
    def __init__(self, name: str) -> None:
        self.name = name

    @register_for_middleware
    def process_message(self, msg: str, skip_middleware: Optional[bool] = None) -> str:
        return f"{self.name}.process_message({msg=})"

class MyMiddleware:
    def __init__(self, name: str) -> None:
        self.name = name

    def call(self, *args: Any, next: Callable[..., Any], **kwargs: Any) -> str:
        retval = next(*args, **kwargs)
        return f"{self.name}.{format_function(self.call, retval)}"

    def trigger(self, *args: Any, **kwargs: Any) -> bool:
        return not ("skip_middleware" in kwargs and kwargs["skip_middleware"])

a = A("a")
mw = MyMiddleware("mw")

# middleware attached to a bounded method a.process_message
add_middleware(a.process_message, mw)

assert a.process_message("hello") == "mw.call(a.process_message(msg='hello'))"

mw2 = MyMiddleware("mw2")
add_middleware(a.process_message, mw2)
assert a.process_message("hello") == "mw.call(mw2.call(a.process_message(msg='hello')))"

b = A("b")
with pytest.raises(ValueError):
    # mw is already added to a.process_message
    add_middleware(b.process_message, mw)

mwb = MyMiddleware("mwb")
add_middleware(b.process_message, mwb)

# only mwb middleware is called on calling b.process_message
assert b.process_message("hello") ==  "mwb.call(b.process_message(msg='hello'))"

# only mw and mw2 are called on a.process_message
assert a.process_message("hello") ==  "mw.call(mw2.call(a.process_message(msg='hello')))"

Update 01/19/2024

The code was internally refactored so it accurately uses the signature of a decorated function in call() methods of a MIddleware class. Another change is adding a_call method to Middleware classes and removing of trigger method. Having both call() and a_call allows for the most efficient implementation. Decorators for automatically generating call from a_call and vice versa will be added shortly so we'll be still able to mix sync/async styles if we are willing to pay the price in reduced performance. Examples of Middleware classes were added to the ConversableAgent. Here is a simple one performing logging:

# notice that the `call`` signature must match the function decorated with `register_for_middleware`:
# passing arguments to call() functions must the the same as passing arguments
# to generate_reply() apart from next being passed as a keyword argument
# default values must also be the same
class _PrintReplyMiddleware:
    def __init__(self, agent: Agent):
        self._agent = agent

    def call(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
        # next will be passed as a keyword argument
        next: Optional[Callable[..., Any]] = None,
    ) -> Tuple[bool, Optional[str]]:
        print(f"generate_reply() called: {sender} sending {messages[-1] if messages else messages}'")
        retval = next(messages, sender)
        return retval
    
    async def a_call(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
        next: Optional[Callable[..., Any]] = None,
    ) -> Tuple[bool, Optional[str]]:
        print(f"a_generate_reply() called: {sender} sending {messages[-1] if messages else messages}'")
        retval = await next(messages, sender)
        return retval

class ConversableAgent(Agent):
    def __init__(self, *args, **kwargs):
        ...
        # attaching middleware to a registered method 
        add_middleware(self.generate_reply, _PrintReplyMiddleware(self))
        add_middleware(self.a_generate_reply, _PrintReplyMiddleware(self))

    @register_for_middleware
    def generate_reply(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
        exclude: Optional[List[Callable]] = None,
    ) -> Union[str, Dict, None]:
        ...

    @register_for_middleware
    async def a_generate_reply(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
        exclude: Optional[List[Callable]] = None,
    ) -> Union[str, Dict, None]:
        ...

All the tests are passing and we are ready to start refactoring the ConversableAgent class.

Update 01/20/2024

Created the following middleware:

ToolUseMiddleware
LLMMiddleware
CodeExecutionMiddleware
TerminationAndHumanReplyMiddleware
MessageStoreMiddleware
TeachabilityMiddleware

See autogen/middleware and contrib/capability/teachability.py

Refactored ConversableAgent by composing it using the middleware above. All public methods are backward-compatible.

Fixed some tests. The failing tests should be easy to fix.

Next step:

Use wrapper to unify sync and async code path.
Utilities for building middleware chain and validating call(...) method signatures.
Fix all tests.
Update code-level documentation, remove recommendation for subclassing.

Update 01/26/2024

Async/sync mixing works now in all cases

Function and tool calling is working in all combinations of async/sync calls.
Code execution works in a_initialize_chat now.

Quality improvements

All tests are passing now. Code coverage was significantly improved with the goal of having over 90% code covered by tests. Type annotations are fixed and mypy reports no errors in autogen/agentchat/middleware and test/agentchat/middleware folders.

autogen/agentchat/middleware/base.py                                6      0      0      0   100%
autogen/agentchat/middleware/code_execution.py                    108      0     36      0   100%
autogen/agentchat/middleware/llm.py                               143      4     62      4    95%
autogen/agentchat/middleware/message_store.py                     113      5     60      9    92%
autogen/agentchat/middleware/termination.py                       127     22     66     14    79%
autogen/agentchat/middleware/tool_use.py                          143      0     54      0   100%

Next steps:

Utilities for building middleware chain and validating call(...) method signatures.
Update code-level documentation, remove recommendation for subclassing.
Write a tutorial on extending ConversibleAgent using middleware instead of subclassing.

Related issue number

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

codecov-commenter · 2024-01-14T02:38:49Z

Codecov Report

Attention: 209 lines in your changes are missing coverage. Please review.

Comparison is base (1ab2354) 32.48% compared to head (bbdc8dd) 49.48%.

Files	Patch %	Lines
...gen/agentchat/contrib/capabilities/teachability.py	5.08%	56 Missing ⚠️
autogen/agentchat/conversable_agent.py	71.79%	45 Missing and 10 partials ⚠️
autogen/agentchat/middleware/termination.py	71.65%	22 Missing and 14 partials ⚠️
autogen/agentchat/middleware/llm.py	79.72%	24 Missing and 5 partials ⚠️
autogen/agentchat/middleware/message_store.py	79.64%	15 Missing and 8 partials ⚠️
autogen/agentchat/contrib/gpt_assistant_agent.py	0.00%	2 Missing ⚠️
autogen/agentchat/contrib/compressible_agent.py	50.00%	1 Missing ⚠️
autogen/agentchat/contrib/llava_agent.py	50.00%	1 Missing ⚠️
autogen/agentchat/contrib/math_user_proxy_agent.py	75.00%	1 Missing ⚠️
...ogen/agentchat/contrib/retrieve_assistant_agent.py	0.00%	1 Missing ⚠️
... and 4 more

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1240       +/-   ##
===========================================
+ Coverage   32.48%   49.48%   +17.00%     
===========================================
  Files          41       49        +8     
  Lines        4907     5252      +345     
  Branches     1120     1238      +118     
===========================================
+ Hits         1594     2599     +1005     
+ Misses       3187     2485      -702     
- Partials      126      168       +42

Flag	Coverage Δ
unittests	`49.42% <78.25%> (+16.98%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Tylersuard · 2024-01-28T05:44:04Z

I'm going to be honest, even as a mid-level developer, I can't understand a word of the example code. Aren't we making AutoGen easy for everyone to use, regardless of their skill level? The basic Autogen code is fairly simple, set up a model, instantiate an agent, initiate that agent chat. Where do these middlewares fit into that process?

Tylersuard · 2024-01-28T05:56:33Z

Thank you for the valuable feedback! We need to recognize the fact that most autogen users who want to extend agent capabilities will not have strong python backgrounds. This should not prevent us from adopting clean and powerful extension mechanisms, but the machinery needs to be readable, carefully documented, and easy for almost anyone to use, as well as easy to maintain after all of us have moved on. If we require users to become versed in arcane libraries and abstruse programming patterns, those barriers will severely limit autogen's adoption.

ekzhu · 2024-01-28T06:12:54Z

I'm going to be honest, even as a mid-level developer, I can't understand a word of the example code. Aren't we making AutoGen easy for everyone to use, regardless of their skill level? The basic Autogen code is fairly simple, set up a model, instantiate an agent, initiate that agent chat. Where do these middlewares fit into that process?

Middleware is meant for the framework developer, not for application developer. It is not changing the interface of AutoGen, rather it is changing the backend and how the "under-the-hood" stuff is written.

You can read the PR description about the motivation.

ekzhu · 2024-01-28T06:21:36Z

Currently in AutoGen, each incoming message is handled by a pipeline of registered reply functions. Each reply function is triggered by a trigger function. If a reply function is triggered and signaled it is a final reply, it short-circuits the pipeline and returns the generated reply back to the sender.

This design pattern is described in the AutoGen paper (https://arxiv.org/pdf/2308.08155.pdf, Section 2). This PR is a refactor, we convert reply function into middleware class, so it can better handle states like code executors, message history, etc.

ekzhu · 2024-01-28T06:49:42Z

@Tylersuard Apologies if this PR confused you. Shouldn't have routed you here 😄 . If you are interested in contribute new code, don't worry about this PR.

ekzhu · 2024-01-28T07:26:33Z

Closing this PR. We have learned a lot working on this branch. Time to use what we have learned toward a series of PRs that will incrementally move the code base toward extensibility.

davorrunje · 2024-01-28T11:28:44Z

@Tylersuard as @ekzhu said, the idea was to refactor the code using a well-known Middleware pattern instead of a currently used ad-hoc mechanism with reply functions. The Middleware pattern was explored here because of its popularity in frameworks such as Starlette (https://www.starlette.io/middleware/) and FastAPI (https://fastapi.tiangolo.com/tutorial/middleware/) where it is the standard way of extending the framework by non-experts. The main idea is to have a self-contained Middleware class that can be independently tested and then integrated with other middleware functions to implement a required combination of functionalities. E.g. if you wish to add observability to your agents using OpenTelemetry, you would only need to implement a Middleware subclass and attach it to an Agent. As I said, that is a well-known and battle-proven mechanism in the most popular frameworks used by millions of junior and mid-level developers. Implementation of the mechanism is not the simplest one, but using it is easier than using the mechanism with reply functions available today.

tyler-suard-parker · 2024-01-29T17:51:14Z

@davorrunje Thank you for your response, I understand, and I am glad you are interested in simplifying the experience for developers. Could you give some simple example code for a real-world application for the middleware pattern you are proposing? For example, how would I make an agent save its output to a text document instead of printing it to the terminal? How would I modify the below code to incorporate the text-saving middleware? Thank you.


config_list = ***

assistant = autogen.AssistantAgent(***)
user_proxy = autogen.UserProxyAgent(***)

user_proxy.initiate_chat(
    assistant,
    message="""What date is today? Compare the year-to-date gain for META and TESLA.""",
) ```

refactorization of hooks

60250eb

davorrunje had a problem deploying to openai1 January 14, 2024 02:38 — with GitHub Actions Failure

davorrunje marked this pull request as draft January 14, 2024 02:38

davorrunje had a problem deploying to openai1 January 14, 2024 02:38 — with GitHub Actions Failure

davorrunje requested a review from rickyloynd-microsoft January 14, 2024 02:38

davorrunje had a problem deploying to openai1 January 14, 2024 02:38 — with GitHub Actions Failure

davorrunje requested review from ekzhu, sonichi and rickyloynd-microsoft and removed request for rickyloynd-microsoft January 14, 2024 02:38

Merge branch 'main' into hooks

a81402d

davorrunje had a problem deploying to openai1 January 14, 2024 02:38 — with GitHub Actions Failure

renaming

b38b7bd

davorrunje had a problem deploying to openai1 January 14, 2024 02:47 — with GitHub Actions Failure

davorrunje had a problem deploying to openai1 January 27, 2024 06:47 — with GitHub Actions Failure

websurfer fixes

bbdc8dd

davorrunje had a problem deploying to openai1 January 27, 2024 07:04 — with GitHub Actions Failure

ekzhu closed this Jan 28, 2024

ekzhu deleted the hooks branch January 28, 2024 07:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactorization of ConversableAgent to unify async and sync code and better extensibility #1240

Refactorization of ConversableAgent to unify async and sync code and better extensibility #1240

davorrunje commented Jan 14, 2024 •

edited

Loading

codecov-commenter commented Jan 14, 2024 •

edited

Loading

Tylersuard commented Jan 28, 2024

Tylersuard commented Jan 28, 2024 •

edited

Loading

ekzhu commented Jan 28, 2024

ekzhu commented Jan 28, 2024 •

edited

Loading

ekzhu commented Jan 28, 2024

ekzhu commented Jan 28, 2024 •

edited

Loading

davorrunje commented Jan 28, 2024 •

edited

Loading

tyler-suard-parker commented Jan 29, 2024 •

edited

Loading

Refactorization of ConversableAgent to unify async and sync code and better extensibility #1240

Refactorization of ConversableAgent to unify async and sync code and better extensibility #1240

Conversation

davorrunje commented Jan 14, 2024 • edited Loading

Why are these changes needed?

The current state.

What is this PR for.

What is this PR NOT for:

What can you do to help:

Update 01/16/2024

Update 01/17/2024

More updates 01/17/2024

Update 01/18/2024

Update 01/19/2024

Update 01/20/2024

Update 01/26/2024

Async/sync mixing works now in all cases

Quality improvements

Related issue number

Checks

codecov-commenter commented Jan 14, 2024 • edited Loading

Codecov Report

Tylersuard commented Jan 28, 2024

Tylersuard commented Jan 28, 2024 • edited Loading

ekzhu commented Jan 28, 2024

ekzhu commented Jan 28, 2024 • edited Loading

ekzhu commented Jan 28, 2024

ekzhu commented Jan 28, 2024 • edited Loading

davorrunje commented Jan 28, 2024 • edited Loading

tyler-suard-parker commented Jan 29, 2024 • edited Loading

davorrunje commented Jan 14, 2024 •

edited

Loading

codecov-commenter commented Jan 14, 2024 •

edited

Loading

Tylersuard commented Jan 28, 2024 •

edited

Loading

ekzhu commented Jan 28, 2024 •

edited

Loading

ekzhu commented Jan 28, 2024 •

edited

Loading

davorrunje commented Jan 28, 2024 •

edited

Loading

tyler-suard-parker commented Jan 29, 2024 •

edited

Loading