
Commit

Merge branch 'main' into swe-bench-eval
xingyaoww authored Mar 21, 2024
2 parents 909f2de + b84463f commit deb2af2
Showing 157 changed files with 926 additions and 262 deletions.
4 changes: 2 additions & 2 deletions .gitignore
@@ -14,7 +14,7 @@ dist/
downloads/
eggs/
.eggs/
-lib/
+./lib/
lib64/
parts/
sdist/
@@ -190,4 +190,4 @@ yarn-error.log*

# agent
.envrc
-agent/workspace
+/workspace
9 changes: 0 additions & 9 deletions agent/build-and-run.sh

This file was deleted.

6 changes: 0 additions & 6 deletions agent/lib/actions/__init__.py

This file was deleted.

7 changes: 0 additions & 7 deletions agent/lib/actions/kill.py

This file was deleted.

18 changes: 0 additions & 18 deletions agent/lib/actions/run.py

This file was deleted.

85 changes: 0 additions & 85 deletions agent/lib/agent.py

This file was deleted.

18 changes: 0 additions & 18 deletions agent/lib/controlloop.py

This file was deleted.

83 changes: 0 additions & 83 deletions agent/main.py

This file was deleted.

57 changes: 57 additions & 0 deletions agenthub/README.md
@@ -0,0 +1,57 @@
# Agent Framework Research

In this folder, there may exist multiple implementations of `Agent` that will be used by the

For example, `agenthub/langchain_agent`, `agenthub/metagpt_agent`, `agenthub/codeact_agent`, etc.
Contributors from different backgrounds and interests can choose to contribute to any (or all!) of these directions.

## Constructing an Agent
Your agent must implement the following methods:

### `step`
```
def step(self, cmd_mgr: CommandManager) -> Event:
```
`step` moves the agent forward one step towards its goal. This probably means
sending a prompt to the LLM, then parsing the response into an action `Event`.

Each Event has an `action` and a dict of `args`. Supported Events include:
* `read` - reads the contents of a file. Arguments:
  * `path` - the path of the file to read
* `write` - writes the contents to a file. Arguments:
  * `path` - the path of the file to write
  * `contents` - the contents to write to the file
* `run` - runs a command. Arguments:
  * `command` - the command to run
  * `background` - if true, run the command in the background so that other commands can run concurrently. Useful for e.g. starting a server. You won't be able to see the logs. You don't need to end the command with `&`; just set this to true.
* `kill` - kills a background command. Arguments:
  * `id` - the ID of the background command to kill
* `browse` - opens a web page. Arguments:
  * `url` - the URL to open
* `recall` - recalls a past memory. Arguments:
  * `query` - the query to search for
* `think` - make a plan, set a goal, or record your thoughts. Arguments:
  * `thought` - the thought to record
* `finish` - if you're absolutely certain that you've completed your task and have tested your work, use the finish action to stop working.

For Events like `read` and `run`, a follow-up event will be added via `add_event` with the output.
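
As a rough illustration of the `step` contract described above, the sketch below parses a model reply into an `Event`. Everything here is an assumption made for the example: the `Event` dataclass is a hypothetical stand-in for the project's own class, the JSON reply format is invented, and `ToyAgent` replaces the real LLM call with canned replies.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical stand-in for the project's Event class (not the real definition).
@dataclass
class Event:
    action: str
    args: Dict[str, Any] = field(default_factory=dict)

# Action names taken from the list of supported Events in this README.
SUPPORTED_ACTIONS = {
    "read", "write", "run", "kill", "browse", "recall", "think", "finish",
}

def parse_response(response: str) -> Event:
    """Parse a reply of the assumed form {"action": ..., "args": {...}}."""
    data = json.loads(response)
    action = data["action"]
    if action not in SUPPORTED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return Event(action=action, args=data.get("args", {}))

class ToyAgent:
    """Toy agent: canned replies play the role of the LLM (assumption)."""

    def __init__(self, replies: List[str]):
        self._replies = list(replies)

    def step(self, cmd_mgr: Any = None) -> Event:
        # A real agent would build a prompt and query the LLM here.
        if not self._replies:
            return Event("finish")
        return parse_response(self._replies.pop(0))
```

Validating the action name before building the `Event` keeps malformed model output from reaching whatever executes the action.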

### `add_event`
```
def add_event(self, event: Event) -> None:
```
`add_event` adds an event to the agent's history. This could be a user message,
an action taken by the agent, log output, file contents, or anything else.

You'll probably want to keep a history of events, and use them in your prompts
so that the agent knows what it did recently. You may also want to keep events
in a vector database so the agent can refer back to them.

The output of `step` will automatically be passed to this method.
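
One possible way to keep such a history is sketched below. The `Event` dataclass, the `HistoryAgent` class, the `build_prompt` helper, and the `max_prompt_events` parameter are all hypothetical names introduced for this example, not part of the project's API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical stand-in for the project's Event class.
@dataclass
class Event:
    action: str
    args: Dict[str, Any] = field(default_factory=dict)

class HistoryAgent:
    """Keeps every event, but only shows the most recent ones to the LLM."""

    def __init__(self, max_prompt_events: int = 10):
        self.history: List[Event] = []
        self.max_prompt_events = max_prompt_events

    def add_event(self, event: Event) -> None:
        # Store the full history; trimming happens only when prompting.
        self.history.append(event)

    def build_prompt(self, task: str) -> str:
        # Include only the last N events so the prompt stays bounded.
        recent = self.history[-self.max_prompt_events:]
        lines = [f"Task: {task}", "Recent events:"]
        lines += [f"- {e.action}: {e.args}" for e in recent]
        return "\n".join(lines)
```

Keeping the full list while trimming only at prompt time leaves older events available for a later `recall`.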

### `search_memory`
```
def search_memory(self, query: str) -> List[str]:
```
`search_memory` should return a list of events that match the query. This will be used
for the `recall` action.
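
A very simple `search_memory` could rank stored texts by keyword overlap with the query, as in the sketch below. This is only an illustrative baseline under stated assumptions: the `KeywordMemoryAgent` class, its `memories` list, and the `top_k` parameter are hypothetical, and a real agent might use a vector database instead.

```python
from typing import List

class KeywordMemoryAgent:
    """Toy memory: stores event texts and ranks them by shared words."""

    def __init__(self) -> None:
        self.memories: List[str] = []

    def remember(self, text: str) -> None:
        self.memories.append(text)

    def search_memory(self, query: str, top_k: int = 3) -> List[str]:
        terms = set(query.lower().split())
        scored = []
        for text in self.memories:
            # Score = number of query words that also appear in the memory.
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                scored.append((overlap, text))
        # The sort is stable, so ties keep their insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]
```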
2 changes: 2 additions & 0 deletions agenthub/__init__.py
@@ -0,0 +1,2 @@
from . import langchains_agent
from . import codeact_agent
21 changes: 21 additions & 0 deletions agenthub/codeact_agent/README.md
@@ -0,0 +1,21 @@
# CodeAct-based Agent Framework

This folder implements the [CodeAct idea](https://arxiv.org/abs/2402.13463), which relies on the LLM to autonomously perform actions in a Bash shell. This demands more of the LLM itself: it must be capable enough to carry out the whole task autonomously rather than getting stuck in an infinite loop.

A minimalistic example can be found at [research/codeact/examples/run_flask_server_with_bash.py](./examples/run_flask_server_with_bash.py):

```bash
mkdir workspace
PYTHONPATH=`pwd`:$PYTHONPATH python3 opendevin/main.py -d ./workspace -c CodeActAgent -t "Please write a flask app that returns 'Hello, World\!' at the root URL, then start the app on port 5000. python3 has already been installed for you."
```


Example: prompting `gpt-3.5-turbo-0125` to write a Flask server, install the `flask` library, and start the server.

<img width="951" alt="image" src="https://github.com/OpenDevin/OpenDevin/assets/38853559/325c3115-a343-4cc5-a92b-f1e5d552a077">

<img width="957" alt="image" src="https://github.com/OpenDevin/OpenDevin/assets/38853559/68ad10c1-744a-4e9d-bb29-0f163d665a0a">

Most things work as expected, except that at the end the model did not follow the instruction to stop the interaction by outputting `<execute> exit </execute>`.

**TODO**: This should be fixable by either (1) including a complete in-context example like [this](https://github.com/xingyaoww/mint-bench/blob/main/mint/tasks/in_context_examples/reasoning/with_tool.txt), or (2) collecting interaction data like this and fine-tuning a model (like [this](https://github.com/xingyaoww/code-act), a more complex route).