Feat: Include relevant file names if any #2865

Closed
wants to merge 16 commits into from

Conversation

SmartManoj
Contributor

@SmartManoj SmartManoj commented Jul 9, 2024

What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?
Closes #2838; Included relevant file names if any in step 1 only.

image


  • Regenerated tests

agenthub/codeact_agent/codeact_agent.py (outdated review thread, resolved)
@SmartManoj SmartManoj marked this pull request as draft July 12, 2024 04:20
@rezzie-rich rezzie-rich mentioned this pull request Jul 12, 2024
@tobitege tobitege dismissed their stale review July 12, 2024 05:23

I'd like this to get reviewed by other team member(s).

@SmartManoj SmartManoj changed the title Feat: Include workspace contents if any Feat: Include workspace item names if any Jul 12, 2024
@tobitege
Collaborator

The refactored code looks good to me.
I'll defer the decision whether to use it as is to other team members' review.

@tobitege
Collaborator

tobitege commented Jul 12, 2024

@SmartManoj did you run your PR with LLM being enabled in integration test to confirm the change doesn't cause an issue there?

@SmartManoj
Contributor Author

SmartManoj commented Jul 12, 2024

As mentioned in the description, I have not regenerated the tests yet. I tested directly in live versions.

@tobitege
Collaborator

As mentioned in the description, did live test only.

Ok, please run the integration tests on your end then, too.
Please have a look at the errors in the actions and try to find out why your PR wouldn't pass here.

@SmartManoj
Contributor Author

SmartManoj commented Jul 12, 2024

Failed because of the prompt change. Added that change only in the description image. Will generate once it's approved.

@tobitege
Collaborator

Failed because of the prompt change. Added that change only in the description image. Will generate once it's approved.

Please have a look at this error, it is separate from the prompt change, I think:
https://github.com/OpenDevin/OpenDevin/actions/runs/9902616135/job/27357899808?pr=2865#step:7:444

Did you try your live test with an empty workspace, too?

@SmartManoj
Contributor Author

Same error.

Screenshot_20240712-125703

@tobitege
Collaborator

Same error.

Screenshot_20240712-125703

That is not the location I linked you to; please look more closely!
The mock response not found error is only a subsequent error.

@SmartManoj
Contributor Author

Screenshot_20240712-130245

Line 476 is the error.
Lines 443-475 are its traceback.

@tobitege
Collaborator

Line 476 is the error. Lines 443-475 are its traceback.

So you ignore line 450 then in your error analysis?

Contributor

@ryanhoangt ryanhoangt left a comment


Not sure about others' thoughts, but I am more inclined towards a simpler approach: explicitly letting the agent know that the project is already set up in the prompt. This way, we can minimize manual intervention in the prompt sent to the LLM, which is preferred to keep the agent more general imo.

@@ -217,6 +218,16 @@ def _get_messages(self, state: State) -> list[dict[str, str]]:
{'role': 'user', 'content': self.in_context_example},
]

workspace_contents = ', '.join(list_files(config.workspace_base))
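For readers following the diff, here is a minimal sketch of the idea behind the added line: list the workspace's top-level item names and append them to the first user message only if there are any. The helpers `list_workspace_items` and `build_first_message` are hypothetical (they are not the PR's `list_files`), and the prompt wording is an assumption.

```python
import os


def list_workspace_items(workspace_base: str) -> list[str]:
    """Return sorted top-level item names, or [] if the directory is missing."""
    if not os.path.isdir(workspace_base):
        return []
    return sorted(os.listdir(workspace_base))


def build_first_message(task: str, workspace_base: str) -> str:
    """Append workspace item names to the task text only when there are any."""
    items = list_workspace_items(workspace_base)
    if not items:
        # Empty or missing workspace: leave the prompt untouched.
        return task
    # Hypothetical wording; the real prompt text may differ.
    return f"{task}\n\nWorkspace contents: {', '.join(items)}"
```

With an empty workspace the message is returned unchanged, which is the case raised earlier in this thread.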
Contributor

@ryanhoangt ryanhoangt Jul 12, 2024


This may also require mounting workspace to work when running swe-bench eval similar to RepoMap before? CC: @xingyaoww

Contributor Author


mounting workspace to work

What does "work" mean here?

Contributor Author


codeact_swe_agent.py is only used for swe-bench, right?

Contributor Author


@rezzie-rich

@xingyaoww A UI option that lets the user choose whether or not to mount a workspace could be a simple solution. This way, it's useful when working with existing projects as well as for the swe-bench eval.

@rezzie-rich

IMO, this PR should be tied to a vector-based 'repomap' so that OD knows not only the file names of an existing project but also its content, and can work effectively on old or new projects. Also, IMHO, that should be a priority, because successfully integrating it will make OD capable of working on and improving OD itself, making development supersonic.

@neubig neubig removed their request for review July 15, 2024 17:31
@SmartManoj SmartManoj changed the title Feat: Include workspace item names if any Feat: Include relevant file names if any Jul 16, 2024
@mamoodi
Collaborator

mamoodi commented Jul 29, 2024

@xingyaoww, can you take another quick look at this PR and, if this is not the correct approach, please close it?

@xingyaoww
Collaborator

Let's close this for now and wait to develop a more general solution (e.g., a search agent)

@xingyaoww xingyaoww closed this Jul 29, 2024
@SmartManoj
Contributor Author

a more general solution (e.g., a search agent)

@mamoodi, Could you create a new issue for this to track?

@SmartManoj
Contributor Author

wait to develop a more general solution (e.g., a search agent)

Until then, anyone who needs this quickly can use this PR, if it is kept open.
image

Also, if it is kept open, it will prevent duplicate PRs for anyone who uses a feature like list-prs-for-file (refined-github/refined-github#2197).

image

@SmartManoj
Contributor Author

IMO, this PR should be tied to a vector-based 'repomap' so that OD knows not only the file names of an existing project but also its content, and can work effectively on old or new projects. Also, IMHO, that should be a priority, because successfully integrating it will make OD capable of working on and improving OD itself, making development supersonic.

@rezzie-rich, Does the current commit match your expectation?

@enyst
Collaborator

enyst commented Jul 30, 2024

@SmartManoj You're missing the problems pointed out above and the alternative solution here: #2865 (review). Making embeddings only for this is not a good solution when an ls, or simply telling the agent in a few words, would do.

@rezzie-rich

I'm not sure about the exact technical implementation. However, having all the file names from the project in memory can be useful. It can serve as a skeleton map for the search agents.

The search agent can go through the project and create a small summary of each file, including the key context. This way, the search agent can know about a complete project not through the entire source code but through the small summaries it creates per file name. That helps it stay aware of the complete project without maxing out the context limit, and when a task requires relevant source code from the project, it can quickly navigate the file structure and extract the actual code for completion.

Summarizing the content of a 200-line code file can be done in a single line of natural language. It is a form of compression by contextual meaning.

Going through a project and making summaries will definitely use a lot of tokens, so it should be done through a UI option where users can choose to load a local project/GitHub repo into the knowledge base. By default, OD should assume it's a new project, which I guess will help with eval as well.

@rezzie-rich

@enyst, we might be talking about something similar.

@SmartManoj, I was pro-embedding until mentat-bot turned out useless. A real project can be of any size, and even with embeddings there's a risk of exceeding the context window. Since the context window is the amount of info an LLM can process at a time, it's better for the LLM to have a summarized whole context rather than an embedded/raw incomplete context.

Embedding after summarization could have more potential if it allows squeezing more context in without extending the context window.

@SmartManoj
Contributor Author

@SmartManoj You're missing the problems pointed out above and the alternative solution here: #2865 (review). Making embeddings only for this is not a good solution when an ls, or simply telling the agent in a few words, would do.

@enyst listdir was the first commit c972cd0 (#2865)

@tobitege, could you hide the comments about the integration tests?


@rezzie-rich I think there is no need to send the summaries of the workspace to the LLM too. If the right files are given, it will work on it.
Copilot workspace works like that.

image

image

Workspace link

@rezzie-rich

rezzie-rich commented Jul 30, 2024

@rezzie-rich I think there is no need to send the summaries of the workspace to the LLM too. If the right files are given, it will work on it.
Copilot workspace works like that.

@SmartManoj, when you work on a project, you don't recall every line of code, do you? You just recall the file structure and an abstract idea of what is where. And since you know the abstract content of each file, you can navigate effectively. It's the same with an LLM, as it's designed after how the human mind works.

By file names only, you can't get all the info regarding that file. Codeact will access the raw context of a file when a task requires it, but it will be able to navigate the project effectively and hit all the files related to the task with scattered methods if it has a summary of all the files.

You can't fit a whole project under a 128k window, but you can definitely fit a complete summary of it. AI can generate what it needs to generate as long as it has all the necessary information.
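To make the context-window argument concrete, here is a rough sketch of packing per-file summaries under a token budget. The function `pack_summaries` is hypothetical, and the 4-characters-per-token estimate is only an assumed rule of thumb; a real tokenizer would give different counts.

```python
def pack_summaries(
    summaries: dict[str, str], token_budget: int, chars_per_token: int = 4
) -> list[str]:
    """Keep 'path: summary' lines, in order, while a rough token estimate fits."""
    kept = []
    used = 0
    for path, text in summaries.items():
        line = f"{path}: {text}"
        # Rough estimate only (assumption), not a real tokenizer count.
        cost = max(1, len(line) // chars_per_token)
        if used + cost > token_budget:
            break
        kept.append(line)
        used += cost
    return kept
```

Stopping at the first summary that does not fit keeps the file order stable; a different policy (for example, skipping large entries and continuing) would be an equally valid choice.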

@SmartManoj
Contributor Author

SmartManoj commented Jul 30, 2024

when you work on a project, you don't recall every line of code, do you? You just recall the file structure and an abstract idea of what is where. And since you know the abstract content of each file, you can navigate effectively. It's the same with an LLM, as it's designed after how the human mind works.

Here, the output is the relevant file names, right?

By file names only, you can't get all the info regarding that file.

Why not? For example, if it needs info about an imported method, it can fetch that accordingly.


Could you provide a small example that provides summaries? Isn't the docstring of a file enough?
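As a point of comparison for the docstring question, a minimal sketch using Python's standard `ast` module to take the first line of a module docstring as a one-line summary. The function name `module_docstring_summary` and the 120-character cap are assumptions for illustration.

```python
import ast


def module_docstring_summary(source: str, max_chars: int = 120) -> str:
    """Return the first line of a module's docstring, or '' if it has none."""
    docstring = ast.get_docstring(ast.parse(source))
    if not docstring:
        return ""
    first_line = docstring.strip().splitlines()[0]
    return first_line[:max_chars]
```

This only helps for files that actually have a module docstring; files without one yield an empty summary.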

@rezzie-rich

rezzie-rich commented Jul 30, 2024

Summary of agenthub/dummy_agent/agent.py

This file defines a basic dummy agent within the OpenDevin framework. It includes:

  1. Imports:

    • import os
    • import sys
    • import logging
    • from agenthub.base_agent import BaseAgent
    • from agenthub.utils import some_utility_function
  2. DummyAgent Class: Implements a minimal agent with initialization, execution, and cleanup routines.

    • Initialization (__init__): Configures the agent and initializes logging.
    • Run Method (run): Core logic for agent actions during execution.
    • Helper Methods:
      • setup: Prepares the agent for execution.
      • execute_task: Manages task execution.
      • cleanup: Cleans up after task completion.
  3. Related Files:

    • agenthub/base_agent.py: Contains the BaseAgent class that DummyAgent inherits from.
    • agenthub/utils.py: Includes utility functions used by the agent.

@SmartManoj, as you can see, this is much more info than just the file name and much less compared to the actual file.

This is just a sample. The actual prompt for the search agent for summarization should be optimized to retrieve the core and useful information effectively.
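A structural part of such a summary (imports, classes, method names) could also be produced without an LLM. The following is a minimal sketch using the standard `ast` module; `summarize_module` and its output shape are hypothetical, and it does not produce the natural-language descriptions shown in the sample above.

```python
import ast


def summarize_module(source: str) -> dict:
    """Collect imports plus top-level classes (with method names) and functions."""
    tree = ast.parse(source)
    summary = {"imports": [], "classes": {}, "functions": []}
    for node in tree.body:
        if isinstance(node, ast.Import):
            summary["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"; use '' in that case.
            module = node.module or ""
            summary["imports"].extend(
                f"{module}.{alias.name}" for alias in node.names
            )
        elif isinstance(node, ast.ClassDef):
            summary["classes"][node.name] = [
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            summary["functions"].append(node.name)
    return summary
```

The resulting dictionary could be rendered as text per file and then, if desired, passed to a model to add the short descriptions.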

7 participants