
Finding solution for select_speaker and agent_by_name functions: Sterilization or ?? #489

Closed
robzsaunders opened this issue Oct 30, 2023 · 11 comments
Labels: group chat/teams (group-chat-related issues)

@robzsaunders

Hey everyone,

I was doing some probing into why the group chat manager just fails to do its job, and I have some questions.

  1. On lines 153 to 156, why are we broadcasting the message to all agents?

def run_chat(
    self,
    messages: Optional[List[Dict]] = None,
    sender: Optional[Agent] = None,
    config: Optional[GroupChat] = None,
) -> Union[str, Dict, None]:
    """Run a group chat."""
    if messages is None:
        messages = self._oai_messages[sender]
    message = messages[-1]
    speaker = sender
    groupchat = config
    for i in range(groupchat.max_round):
        # set the name to speaker's name if the role is not function
        if message["role"] != "function":
            message["name"] = speaker.name
        groupchat.messages.append(message)
        # broadcast the message to all agents except the speaker
        for agent in groupchat.agents:
            if agent != speaker:
                self.send(message, agent, request_reply=False, silent=True)
        if i == groupchat.max_round - 1:
            # the last round
            break
        try:
            # select the next speaker
            speaker = groupchat.select_speaker(speaker, self)
            # let the speaker speak
            reply = speaker.generate_reply(sender=self)
        except KeyboardInterrupt:
            # let the admin agent speak if interrupted
            if groupchat.admin_name in groupchat.agent_names:
                # admin agent is one of the participants
                speaker = groupchat.agent_by_name(groupchat.admin_name)
                reply = speaker.generate_reply(sender=self)
            else:
                # admin agent is not found in the participants
                raise
        if reply is None:
            break
        # The speaker sends the message without requesting a reply
        speaker.send(reply, self, request_reply=False)
        message = self.last_message(speaker)
    return True, None

Instead of just sending the message to the selected speaker?

Many servers queue incoming messages. Personally, I've been using LM Studio and noticed that it notes "running queued message" and runs them one by one, which may be causing some of these local LLM group chat complications despite hacks to force chat order.

@robzsaunders
Author

robzsaunders commented Oct 30, 2023

I'm currently checking to see if there's better performance by commenting out lines 154, 155, and 156

# broadcast the message to all agents except the speaker
for agent in groupchat.agents:
    if agent != speaker:
        self.send(message, agent, request_reply=False, silent=True)

and adding this between lines 162 and 164:
self.send(message, speaker, request_reply=False, silent=True)
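Put together, a minimal sketch of how the relevant lines inside run_chat's try block would read with this change (my illustration of the proposal, not the shipped code):

# select the next speaker
speaker = groupchat.select_speaker(speaker, self)
# send the latest message only to the selected speaker (replacing the broadcast above)
self.send(message, speaker, request_reply=False, silent=True)
# let the speaker speak
reply = speaker.generate_reply(sender=self)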

@afourney
Member

afourney commented Oct 30, 2023

Group Chat is designed to mimic ... well, a group chat (e.g., on your phone or in Slack). We expect every agent to be aware of the shared conversation up to that point. Other workflows are certainly possible (like hub-and-spoke delegation), but they wouldn't be called group chat. Commenting out the broadcast will likely break many of the Group Chat demo scenarios.

When you say that you were investigating why the Group Chat Manager "fails to do its job", can you be more specific? What failures were you observing?

afourney added the group chat/teams (group-chat-related issues) label on Oct 30, 2023
@robzsaunders
Author

Hey @afourney, thanks for the explanation. Makes sense!

Apologies for the long write-up, but I think I found the source of a lot of local LLMs' group chat problems. Since there are no error outputs when failures occur, no one knew that the responses from the manager were incorrect.

Taking a look through the issue board and Discord, there seems to be a common theme over the last week or so of the group chat manager not working in the "correct order" or not "selecting agents correctly".

After doing some debugging with a few local LLMs, I think I found the issue. It is a pair of related problems.


Problem 1

Example: Agent being called is named "Coder"

Similar to my ticket #399, where we needed to sterilize raw code input, the raw output of the manager's role message is not in the correct format for the agent_by_name function in groupchat.py.

The function below is called at the end of the select_speaker function.

def agent_by_name(self, name: str) -> Agent:
    """Find the next speaker based on the message."""
    return self.agents[self.agent_names.index(name)]

Example outputs from the manager:

  • "The role I select is Coder"
  • "Coder:"
  • "```Coder"

The major offense, though, which I keep seeing repeatedly, is the manager responding with:

  • "Coder: >>Writes out all the code<<"

What that function expects:

  • Coder

This causes the ValueError on line 106 to trigger, producing pseudo-correct-looking functionality. My belief is that because the ValueError fallback moves on to the next agent, the manager seems to be working, since I suspect most people order their agents in the group chat in the logical order of operations.
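For illustration, here is a rough sketch of the kind of sterilization helper that could sit in front of agent_by_name (extract_agent_name and the names list are hypothetical, not existing autogen code):

from typing import List, Optional

def extract_agent_name(raw_reply: str, agent_names: List[str]) -> Optional[str]:
    """Return the first known agent name mentioned in the reply, or None."""
    cleaned = raw_reply.strip().strip("`").strip()
    # Exact match is what agent_by_name already expects.
    if cleaned in agent_names:
        return cleaned
    # Otherwise scan the reply for any known name and take the earliest mention.
    mentions = [(cleaned.find(name), name) for name in agent_names if name in cleaned]
    return min(mentions)[1] if mentions else None

names = ["User_Proxy", "Coder", "QA"]
print(extract_agent_name("The role I select is Coder", names))          # Coder
print(extract_agent_name("Coder: >>Writes out all the code<<", names))  # Coder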


Rolling into problem 2

For some reason (I'm not quite sure why), the manager is semi-ignoring the system prompts fed to it from select_speaker_msg and line 96 in groupchat.py.

I don't know 100% how it works yet, but my intuition is that self.messages on line 96 needs to be a user message, and the prompt from the user needs to be a system message for the manager.

Before I finished up today, the last thing I did was swap line 95 from being a system tag to a user tag, and the manager stopped writing out full blocks of code.

It didn't output the correct single-word response of "Coder", but it said "I will choose the coder" or something like that. So there is something weird going on with how local LLMs interact with the group chat manager prompts.
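To make the role-swap experiment concrete, here is roughly what the two message layouts look like (a simplified sketch; the wording of select_speaker_msg is illustrative, not the exact prompt in groupchat.py):

select_speaker_msg = (
    "You are in a role play game. The available roles are: User_Proxy, Coder, QA. "
    "Read the conversation, then select the next role to play. Only return the role."
)
conversation = [{"role": "user", "content": "Hey write me a basic hello world in python"}]

# Default-style layout: the selection instruction is sent as a system message.
messages_system = [{"role": "system", "content": select_speaker_msg}] + conversation

# The experiment described above: the same instruction sent as a user message instead,
# which some local models appear to follow more reliably.
messages_user = conversation + [{"role": "user", "content": select_speaker_msg}]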

I'm not sure of the best way to approach this, but I think this is another roadblock for most local LLM users, and fixing it will help with the group chat problems they're facing.


Current behavior

4 agents, Manager, User_Proxy, Coder, QA

  1. User proxy
    "Hey write me a basic hello world in python" (to manager)

  2. System
    [ "Choose a role, only choose one role" ] (to manager)

  3. Manager
    "I choose Coder: '''Python (insert python code here)"

  4. autogen logic
    [ Manager has finished and I got its response. Checking message. Result: ValueError fail. Last agent: Coder. Next agent is... Coder ]

  5. Coder
    " '''Python (insert python code with incorrect code here) " (replies to autogen logic)

  6. autogen logic
    [ coder is done, manager picks another speaker]

  7. System
    [ "Choose a role, only choose one role" ] (to manager)

  8. Manager
    "I choose QA: "(Does full QA assessment that detects the incorrect code)

  9. autogen logic
    [ Manager has finished and I got its response. Checking message. Result: ValueError fail. Last agent: Coder. Next agent is... QA ]

  10. QA
    [ Does full assessment that detects the incorrect code ] (replies to autogen logic)

  11. autogen logic
    [ QA is done, manager picks another speaker]

  12. System
    [ "Choose a role, only choose one role" ] (to manager)

  13. Manager
    "I choose Coder: '''Python (insert python code here)"

  14. autogen logic
    [ Manager has finished and I got its response. Checking message. Result: ValueError fail. Last agent: QA. Next agent is... User_Proxy ]

  15. User Proxy
    >>>Executes Code

Disclosure: I haven't done any testing using OpenAI's GPT models. This may or may not be a problem for GPT-3.5/4.

@afourney
Member

afourney commented Oct 31, 2023

Thanks for the awesome deep dive. Keep them coming!

We've not done a lot of testing with local models, and I actually have no idea how well we should expect a GroupChatManager to function if backed by such models. One thing to try is to leave the Chat Manager as GPT-4 but use local LLMs everywhere else, and just compare performance.

Selection of the next agent is non-trivial, and frankly I'm surprised it works even with GPT-4. Here's one possible source of confusion: #319

@dogukanustun

Hello,

I am facing a similar issue. When I tried agent_by_name(), I also got a ValueError, but interestingly my example outputs are not like those given below (taken from robzsaunders):

Example outputs from the manager:

"The role I select is Coder"
"Coder:"
"```Coder"

What I get for the name parameter is the whole output of the LLM.

I am in a desperate situation and open to any solution.

Thanks.

@robzsaunders
Author

Yea that's what I was trying to communicate with:

The major offense, though, which I keep seeing repeatedly, is the manager responding with:

"Coder: >>Writes out all the code<<"

The manager does the work first, but it is silent in the background and throws the role prompt for a loop.

robzsaunders changed the title from "Questions about groupchat.py (Local LLM Related)" to "Finding solution for select_speaker and agent_by_name functions: Sterilization or ??" on Nov 1, 2023
@robzsaunders
Author

This PR is related and offers a partial solution:

#500

@afourney
Member

afourney commented Nov 1, 2023

I think there are a few things going on here, and I think we need a piecemeal approach to solving it. The PR #500 would solve the problem if you don't need dynamic orchestration. In other words, if you already know which agents should speak, and in what order, then Group Chat -- as it currently stands -- is not a great solution, and some deterministic alternative would be better.

However, if you still want dynamic orchestration, then what we need to do is improve the GroupChatManager's performance on local models. We can do this in a few ways. First, we can try improving the prompt that it uses, or perhaps use a different prompt altogether, tuned to the local model. Alternatively, we can improve parsing (or recognize the failure and remind the model to output the correct format, similar to TypeChat). This would add some robustness to the selection.
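As a sketch of the "recognize the failure and remind the model" idea (hypothetical helper, not an existing autogen API; ask_llm stands in for the manager's model call):

from typing import Callable, List, Optional

def select_speaker_with_retry(ask_llm: Callable[[str], str], agent_names: List[str], max_retries: int = 2) -> Optional[str]:
    """Ask the model for a role; if the reply is not a valid name, re-ask with a stricter reminder."""
    reminder = ""
    for _ in range(max_retries + 1):
        reply = ask_llm(reminder).strip()
        if reply in agent_names:
            return reply
        reminder = (
            f"Your previous answer {reply!r} is not a valid role. "
            f"Respond with exactly one of: {', '.join(agent_names)} and nothing else."
        )
    return None  # caller can fall back to round-robin or the next agent in the list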

I would, however, openly wonder about effectiveness. Orchestration is a super complex problem, and it resembles planning. If the underlying LLM can't handle the instructions to output the correct format, I naturally wonder how carefully considered its plans are.

@robzsaunders
Author

Yeah, it's why I opened this as an issue instead of knee-jerking a solution with a PR.

The orchestration needs some discussion.

I'm still convinced that something isn't being sent through properly to the local model (I'm using LM Studio), since editing the manager's system prompts doesn't do anything.

@SoheylM

SoheylM commented Nov 3, 2023

Hi there,

I am also working on this, serving Mistral 7B with vLLM using the OpenAI API endpoints.

**** EDIT ****
llm_config gets passed along via kwargs, so the first problem is already addressed. The second problem seems linked to serving Mistral 7B Instruct with vLLM. It may be related to the required prompt template.


[FIXED] The first problem I noticed with respect to running a local LLM using the llm_config dictionary (correct me if I am wrong) is its absence from the GroupChatManager constructor. Unless you overwrite DEFAULT_MODEL to be the local LLM, or point the 'gpt-4', 'gpt-3.5-turbo', etc. model names to the local LLM, I believe two lines of code need to be added: one to accept the llm_config dictionary and one to pass it along to the parent class, ConversableAgent:

class GroupChatManager(ConversableAgent):
    def __init__(
        self,
        groupchat: GroupChat,
        name: Optional[str] = "chat_manager",
        # unlimited consecutive auto reply by default
        max_consecutive_auto_reply: Optional[int] = sys.maxsize,
        human_input_mode: Optional[str] = "NEVER",
        system_message: Optional[str] = "Group chat manager.",
        llm_config: Optional[Union[Dict, bool]] = None,
        # seed: Optional[int] = 4,
        **kwargs,
    ):
        super().__init__(
            name=name,
            max_consecutive_auto_reply=max_consecutive_auto_reply,
            human_input_mode=human_input_mode,
            system_message=system_message,
            llm_config=llm_config,
            **kwargs,
        )
        self.register_reply(Agent, GroupChatManager.run_chat, config=groupchat, reset_config=GroupChat.reset)
        # self._random = random.Random(seed)
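For reference, a rough usage sketch with the manager itself pointed at a local model via llm_config (endpoint URL, model name, and agent setup are placeholders; the exact config keys, e.g. "base_url" vs "api_base", depend on the autogen/openai versions in use):

# Rough usage sketch (not from the thread): passing an explicit llm_config so the
# manager itself talks to the local model. All values below are placeholders.
import autogen

local_llm_config = {
    "config_list": [
        {
            "model": "local-model",                   # placeholder model name
            "base_url": "http://localhost:1234/v1",   # e.g. an LM Studio / vLLM OpenAI-compatible server
            "api_key": "not-needed",
        }
    ],
}

coder = autogen.AssistantAgent("Coder", llm_config=local_llm_config)
qa = autogen.AssistantAgent("QA", llm_config=local_llm_config)
user_proxy = autogen.UserProxyAgent("User_Proxy", human_input_mode="NEVER", code_execution_config=False)

groupchat = autogen.GroupChat(agents=[user_proxy, coder, qa], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=local_llm_config)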

The second problem, specific to my implementation, is in line with @dogukanustun's comment. The GroupChatManager never outputs anything related to a role. Instead, it seems to answer the question directly, even though the role mentioned is Planner, Engineer, etc. The first agent that answers is always the first in the list, followed by the second one, and so on through to the last. Then I swapped the positions of the agents in the group chat. The answers provided were exactly the same, meaning they are position-dependent and not agent-dependent. The first answer, second answer, etc. are always the same; only the name of the agent shown in the terminal changes, and it matches the ordering of the groupchat list.

If my second problem is solved and I reach @robzsaunders' problem, I think I could offer some solutions. One would be prompt engineering to force the local LLM to spit out the correct role based on the groupchat list, along the lines sketched below.
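For example, a stricter selection prompt could look something like this (the wording is hypothetical, not taken from groupchat.py):

AGENT_NAMES = ["User_Proxy", "Coder", "QA"]
STRICT_SELECT_PROMPT = (
    "You are coordinating a group chat. The available roles are: "
    + ", ".join(AGENT_NAMES)
    + ". Reply with exactly one role name from that list, with no punctuation, no explanation, and no code."
)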

@tevslin

tevslin commented Jan 29, 2024

I would like to see whatever is implemented for GroupChatManager be exposed in AutoGen Studio.

gagb closed this as completed Aug 27, 2024
jackgerrits added a commit that referenced this issue Oct 2, 2024
…nc (#489)

* Port docker code executor, make async, make code executor restart async

* add export

* fmt

* fix async file