Code executors #1405

ekzhu · 2024-01-25T22:43:20Z

Why are these changes needed?

The default code execution is done in a command line environment in a docker container. This has following limitations:

It cannot save variables in memory, as each code execution is performed by running a script from disk.
It cannot support easy data in/out the code execution environment. E.g., a lot of people asking where can they find the plots or code script that agents generated.
It only supports limited code runner (shell, bash, sh and python)

This is why we introduce code executors to allow users to select and configure the code execution environment, and at the same time making it easy for people to write their own code executors.

This requires changes to the schema of code_execution_config configuration -- those changes will be backward compatible so existing code will be using the legacy code execution module with no change in behavior.

Here is an example of specifying ipython-embedded code executor for a user proxy.

user_proxy = UserProxyAgent(name="proxy", code_execution_config={"executor": "ipython-embedded", "ipython-embedded": {"output_dir": "coding_output"}})
# Now code messages received by user proxy will be running in an embedded IPython kernel.

Or the local command line code executor:

user_proxy = UserProxyAgent(name="proxy", code_execution_config={"executor": "commandline-local", "commandline-local": {"work_dir": "coding"}})

In some cases, the user of the code executing agent needs to know how to use the code executor. E.g., the agent needs to know that it is interacting with an IPython notebook to make use of the notebook related features such as display, preloaded modules, and ! pip install ..., and expecting rich messages like formatted tables and plots. This requires the user agent to be equipped with a capability. This can be accomplished as following:

agent = ConversableAgent("agent", ...)
user_proxy.code_executor.user_capability.add_to_agent(agent)

Here the user_capability is an AgentCapability type that modifies agent's system message to add instructions related to usage of Ipython code executor.

User-defined code executor.

It is also possible to use user-supplied code executor. So advanced user can use their own executor without modifying the framework. Here is an example of a customized notebook executor that execute LLM generated code within the same notebook it is running on.

from typing import List
from IPython import get_ipython
from autogen.agentchat.agent import LLMAgent
from autogen.agentchat.user_proxy_agent import UserProxyAgent
from autogen.coding import CodeExecutor, MarkdownCodeExtractor, CodeExtractor, CodeBlock, CodeResult

class NotebookExecutor(CodeExecutor):

    class UserCapability:

        def add_to_agent(self, agent: LLMAgent):
            agent.update_system_message(agent.system_message + "\nInstruction on coding.")

    @property
    def code_extractor(self) -> CodeExtractor:
        return MarkdownCodeExtractor()

    @property
    def user_capability(self) -> "NotebookExecutor.UserCapability":
        return NotebookExecutor.UserCapability()

    def __init__(self) -> None:
        self._ipython = get_ipython()

    def execute_code_blocks(self, code_blocks: List[CodeBlock]) -> CodeResult:
        log = ""
        for code_block in code_blocks:
            result = self._ipython.run_cell("%%capture --no-display cap\n" + code_block.code)
            log += self._ipython.ev("cap.stdout")
            log += self._ipython.ev("cap.stderr")
            if result.result is not None:
                log += str(result.result)
            exitcode = 0 if result.success else 1
            if result.error_before_exec is not None:
                log += f"\n{result.error_before_exec}"
                exitcode = 1
            if result.error_in_exec is not None:
                log += f"\n{result.error_in_exec}"
                exitcode = 1
            if exitcode != 0:
                break
        return CodeResult(exit_code=exitcode, output=log)


# Equip the UserProxyAgent with the ExampleExecutor.
proxy = UserProxyAgent("user", code_execution_config={"executor": NotebookExecutor()})

Documentation

Documentation will be in a future PR once the user-defined module work is completed. See #1421 .

Backward compatibility

For backward compatibility, existing code that uses either setting code_execution_config to a dictionary (without the key "executor") will still be using the legacy code execution module, and subclasses that overrides run_code and execute_code_blocks will still have their overriding methods used in those classes.

Once we have finished the other tasks in the code execution roadmap (#1421), a deprecation warning will be displayed when they do that, encouraging the developer to switch from subclassing when it comes to customizing code execution.

To turn off code execution, set code_execution_config=False. This is consistent with the current behavior.

Additional changes

Per PEP544 protocol is for supporting structured sub-typing aka interface in Python. So we don't have to declare subclass of a protocol, rather we can rely on static type checker or @runtime_checkable on the protocol to check of type. e.g.,

@runtime_checkable
class Animal(Protocol):

  def speak() -> str:
    ..

class Duck:
  def speak() -> str:
    return "quack"

isinstance(duck(), Animal)
# This checks whether it implements the Animal protocol.

We make Agent and LLMAgent protocols, this allows external code to create their own agent classes without subclassing our ConversableAgent and inherit all the underlying methods and variables, yet we can use them in our code like GroupChat for example.

Related issue number

#1336
#1396
#1095

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

codecov-commenter · 2024-01-25T22:47:44Z

Codecov Report

Attention: Patch coverage is 88.44985% with 38 lines in your changes missing coverage. Please review.

Project coverage is 69.93%. Comparing base (5d81ed4) to head (2c4ae6f).
Report is 583 commits behind head on main.

Files	Patch %	Lines
autogen/agentchat/conversable_agent.py	65.27%	11 Missing and 14 partials ⚠️
autogen/coding/local_commandline_code_executor.py	91.66%	4 Missing and 1 partial ⚠️
autogen/agentchat/agent.py	80.95%	4 Missing ⚠️
autogen/coding/embedded_ipython_code_executor.py	96.36%	2 Missing and 2 partials ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1405       +/-   ##
===========================================
+ Coverage   35.03%   69.93%   +34.89%     
===========================================
  Files          44       50        +6     
  Lines        5383     5677      +294     
  Branches     1247     1381      +134     
===========================================
+ Hits         1886     3970     +2084     
+ Misses       3342     1342     -2000     
- Partials      155      365      +210

Flag	Coverage Δ
unittests	`69.89% <88.44%> (+34.85%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

afourney · 2024-01-25T22:55:23Z

I like this idea very much.

Given that we use the markdown header to specify language, should be allow executor to be a dictionary?

{
    "executor": {
        "python": notebook_executor,
        "sh": terminal_executor,
        "typescript": ts_executor,
        "c#": c_sharp_executor
   },
   ...
}

rlam3 · 2024-01-25T22:55:59Z

I'm not sure why with the recent 0.2.8 change why we need to default to using docker. I thought the default was to not use docker.

afourney · 2024-01-25T23:01:25Z

I'm not sure why with the recent 0.2.8 change why we need to default to using docker. I thought the default was to not use docker.

Running in Docker was always our recommendation -- it's so much safer when dealing with the arbitrary code these agents write. Previously, we printed a prominent warning to console when Docker wasn't explicitly disabled with use_docker=False -- I wrote that PR myself in #172, which was one of my first contributions to AutoGen.

The change in 0.2.8 is to elevate the recommendation to a default. But, you can easily opt-out setting use_docker to false or setting the global environment variable.

ekzhu · 2024-01-26T01:31:58Z

@afourney H

I like this idea very much.

Given that we use the markdown header to specify language, should be allow executor to be a dictionary?
{
    "executor": {
        "python": notebook_executor,
        "sh": terminal_executor,
        "typescript": ts_executor,
        "c#": c_sharp_executor
   },
   ...
}

Thanks. This is actually an interesting idea that the dictionary entry could be an instance of an executor to achieve customization. Though currently we assume the code executor is supposed to be language agnostic -- as the LLM could produce code in multiple languages and we assume those will be executed in the same environment. So, the code executor is more about the environment in which the code runs. E.g., a command line environment which supports command utilities, an ipython environment that only supports ipython commands (python code and stuff like ! pip install package.)

We can also introduce Google Code Lab environment and .NET interactive (shout out to @LittleLittleCloud @colombod) in the future. For now, I am expecting mostly community contributions on these cases. Each executor can put in their configuration parameters inside the code_execution_config:

{"executor": "ipython",
  "ipython": {
    "timeout": 50,
    "preload_modules": ["numpy", "pandas", ...],
  }
}

afourney · 2024-01-26T01:42:19Z

That makes sense.

Another question: Right now the default assistant prompt is heavily tuned to suggesting sh and python code, and heavily instructed to making sure the codeblocks "stand alone". Are you imagining that the executors might also contain suggested meta-prompts, or descriptions, that can make this a little more integrated?

ekzhu · 2024-01-26T01:57:36Z

Are you imagining that the executors might also contain suggested meta-prompts, or descriptions, that can make this a little more integrated?

You are thinking what I am thinking. I just updated the PR description. In short:

agent = ConversableAgent("agent", ...)
user_proxy.code_executor.user_capability.add_to_agent(agent)

davorrunje · 2024-01-26T05:07:56Z

I'm not sure why with the recent 0.2.8 change why we need to default to using docker. I thought the default was to not use docker.

Running in Docker was always our recommendation -- it's so much safer when dealing with the arbitrary code these agents write. Previously, we printed a prominent warning to console when Docker wasn't explicitly disabled with use_docker=False -- I wrote that PR myself in #172, which was one of my first contributions to AutoGen.

The change in 0.2.8 is to elevate the recommendation to a default. But, you can easily opt-out setting use_docker to false or setting the global environment variable.

This is actually not quite true. Running in docker doesn't mean running in a separate docker container which would be a much safer way of doing it. It also means running in the same docker container if autogen is already running in a docker container. I discovered that yesterday and thought it was a bug (#1396) while implementing some missing tests, but apparently, it is not.

afourney · 2024-01-26T05:48:20Z

Yes. There is a difference between use_docker=True with autogen hosted outside, and running everything in Docker (in which case use_docker is effectively ignored). The former is more secure than the latter. We should definitely distinguish this, and I would be happy to discuss this in another thread.

ekzhu · 2024-02-09T20:58:07Z

@IANTHEREAL I have resolved everything else except regarding the functionality to update producer agent's description field. Please see my comments.

ekzhu · 2024-02-10T00:07:04Z

@abhijithnair1 You can take a look at the PR description about a notebook executor that runs inside a JupyterNote book. There is also an ipython executor that runs in a separate IPython Kernel which is stateful.

IANTHEREAL

LGTM.

* code executor * test * revert to main conversable agent * prepare for pr * kernel * run open ai tests only when it's out of draft status * update workflow file * revert workflow changes * ipython executor * check kernel installed; fix tests * fix tests * fix tests * update system prompt * Update notebook, more tests * notebook * raise instead of return None * allow user provided code executor. * fixing types * wip * refactoring * polishing * fixed failing tests * resolved merge conflict * fixing failing test * wip * local command line executor and embedded ipython executor * revert notebook * fix format * fix merged error * fix lmm test * fix lmm test * move warning * name and description should be part of the agent protocol, reset is not as it is only used for ConversableAgent; removing accidentally commited file * version for dependency * Update autogen/agentchat/conversable_agent.py Co-authored-by: Jack Gerrits <[email protected]> * ordering of protocol * description * fix tests * make ipython executor dependency optional * update document optional dependencies * Remove exclude from Agent protocol * Make ConversableAgent consistent with Agent * fix tests * add doc string * add doc string * fix notebook * fix interface * merge and update agents * disable config usage in reply function * description field setter * customize system message update * update doc --------- Co-authored-by: Davor Runje <[email protected]> Co-authored-by: Jack Gerrits <[email protected]> Co-authored-by: Aaron <[email protected]> Co-authored-by: Chi Wang <[email protected]>

ekzhu added 6 commits January 24, 2024 02:03

code executor

be75414

Merge branch 'main' into coding

2ea759b

test

36fb1be

revert to main conversable agent

93d0fc6

prepare for pr

71d3cfa

Merge branch 'main' into coding

af3c7ae

ekzhu had a problem deploying to openai1 January 25, 2024 22:43 — with GitHub Actions Failure

ekzhu requested review from davorrunje, sonichi and olgavrou January 25, 2024 22:43

ekzhu requested review from afourney and BeibinLi January 26, 2024 00:50

ekzhu had a problem deploying to openai1 February 9, 2024 20:55 — with GitHub Actions Failure

update doc

2c4ae6f

ekzhu temporarily deployed to openai1 February 9, 2024 21:00 — with GitHub Actions Inactive

IANTHEREAL approved these changes Feb 10, 2024

View reviewed changes

sonichi enabled auto-merge February 10, 2024 04:44

sonichi added this pull request to the merge queue Feb 10, 2024

Merged via the queue into microsoft:main with commit 609ba7c Feb 10, 2024
57 checks passed

AaronWard mentioned this pull request Feb 11, 2024

Command line code sanitation #1627

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code executors #1405

Code executors #1405

ekzhu commented Jan 25, 2024 •

edited

Loading

codecov-commenter commented Jan 25, 2024 •

edited

Loading

afourney commented Jan 25, 2024 •

edited

Loading

rlam3 commented Jan 25, 2024

afourney commented Jan 25, 2024

ekzhu commented Jan 26, 2024 •

edited

Loading

afourney commented Jan 26, 2024

ekzhu commented Jan 26, 2024

davorrunje commented Jan 26, 2024

afourney commented Jan 26, 2024 •

edited

Loading

ekzhu commented Feb 9, 2024

ekzhu commented Feb 10, 2024

IANTHEREAL left a comment

Code executors #1405

Code executors #1405

Conversation

ekzhu commented Jan 25, 2024 • edited Loading

Why are these changes needed?

User-defined code executor.

Documentation

Backward compatibility

Additional changes

Related issue number

Checks

codecov-commenter commented Jan 25, 2024 • edited Loading

Codecov Report

afourney commented Jan 25, 2024 • edited Loading

rlam3 commented Jan 25, 2024

afourney commented Jan 25, 2024

ekzhu commented Jan 26, 2024 • edited Loading

afourney commented Jan 26, 2024

ekzhu commented Jan 26, 2024

davorrunje commented Jan 26, 2024

afourney commented Jan 26, 2024 • edited Loading

ekzhu commented Feb 9, 2024

ekzhu commented Feb 10, 2024

IANTHEREAL left a comment

Choose a reason for hiding this comment

ekzhu commented Jan 25, 2024 •

edited

Loading

codecov-commenter commented Jan 25, 2024 •

edited

Loading

afourney commented Jan 25, 2024 •

edited

Loading

ekzhu commented Jan 26, 2024 •

edited

Loading

afourney commented Jan 26, 2024 •

edited

Loading