Feat: Include relevant file names if any #2865
Conversation
I'd like this to get reviewed by other team member(s).
The refactored code looks good to me.
@SmartManoj did you run your PR with the LLM enabled in the integration tests to confirm the change doesn't cause an issue there?
As mentioned in the description, I haven't regenerated the tests yet. Tested directly in live versions.
Ok, please run the integration tests on your end then, too.
It failed because of the prompt change. I added that change only in the description image. Will regenerate the tests once it's approved.
Please have a look at this error; it is separate from the prompt change, I think. Did you try your live test with an empty workspace, too?
So you're ignoring line 450 in your error analysis, then?
Not sure about others' thoughts, but I am more inclined towards a simpler approach: explicitly letting the agent know in the prompt that the project is already set up. This way, we can minimize manual intervention in the prompt sent to the LLM, which is preferred to keep the agent more general, imo.
```
@@ -217,6 +218,16 @@ def _get_messages(self, state: State) -> list[dict[str, str]]:
            {'role': 'user', 'content': self.in_context_example},
        ]

        workspace_contents = ', '.join(list_files(config.workspace_base))
```
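For reference, a minimal sketch of what that hunk is doing: list the workspace and, if it is non-empty, fold the file names into the prompt. In OpenDevin, `list_files` and `config.workspace_base` are real; the stand-in helper and the exact hint wording below are assumptions for illustration only.

```python
import os

def list_files(base: str) -> list[str]:
    # Stand-in for OpenDevin's list_files helper (assumption): returns the
    # paths of files under the workspace root, relative to it.
    return [
        os.path.relpath(os.path.join(root, name), base)
        for root, _dirs, files in os.walk(base)
        for name in files
    ]

def workspace_hint(workspace_base: str) -> str:
    """Build the extra prompt text with the relevant file names, if any."""
    workspace_contents = ', '.join(list_files(workspace_base))
    if not workspace_contents:
        # Empty workspace: add nothing, per the empty-workspace concern above.
        return ''
    return f'Files already in the workspace: {workspace_contents}'
```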
This may also require mounting the workspace for this to work when running the swe-bench eval, similar to RepoMap before? CC: @xingyaoww
> mounting workspace to work

What does "work" mean here?
codeact_swe_agent.py is only used for swe-bench, right?
cc: @rezzie-rich
@xingyaoww A UI option for the user to choose whether or not to mount a workspace could be a simple solution. This way, it's useful when working with existing projects as well as for the swe-bench eval.
IMO, this PR should be tied to a vector-based 'repomap' so OD not only knows the file names of an existing project but also their contents, to effectively work on old or new projects. Also, IMHO, that should be a priority, because successful integration of it will make OD capable of working on and improving OD itself, making development supersonic.
@xingyaoww can you take a quick look at this PR again, and if this is not the correct approach, please close it.
Let's close this for now and wait to develop a more general solution (e.g., a search agent) |
@mamoodi, Could you create a new issue for this to track? |
Until then, one can use this PR if one needs this quickly, provided it is kept open. Also, if it is open, it will prevent duplicate PRs if one uses a feature like
@rezzie-rich, does the current commit match your expectations?
@SmartManoj You're missing the problems pointed out above and the alternative solution here: #2865 (review). Making embeddings only for this is not a good solution, when a
I'm not sure about the exact technical implementation. However, having all the file names from the project in memory can be useful: it can serve as a skeleton map for the search agents. The search agent can go through the project and create a small summary of each file, including the key context. This way, the search agent knows a complete project not by its entire source code but by the small summaries it creates per file name. That helps it stay aware of the complete project without maxing out the context limit, and when a task requires relevant source code from the project, it can navigate the file structure quickly and extract the actual code for completion. Summarizing the content of a 200-line code file can be done in a single line of natural language; it is a form of compression by contextual meaning.

Going through a project and making summaries will definitely use a lot of tokens, so it should be done through a UI option where users can choose to load a local project/GitHub repo into the knowledge base. By default, OD should assume it's a new project, which, I guess, will help with eval as well.
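To make that concrete, here is a rough sketch of such a skeleton map, assuming a hypothetical `summarize_file` that stands in for whatever LLM call the search agent would actually use (this is not OD code):

```python
import os

def summarize_file(path: str) -> str:
    # Hypothetical: a one-line natural-language summary produced by an LLM.
    # Here we just fall back to the file's first line as a placeholder.
    with open(path, encoding='utf-8', errors='ignore') as f:
        first_line = f.readline().strip()
    return first_line or '(empty file)'

def build_skeleton_map(workspace_base: str) -> dict[str, str]:
    """Map each file path to a one-line summary, forming the 'skeleton map'."""
    skeleton = {}
    for root, _dirs, files in os.walk(workspace_base):
        for name in files:
            path = os.path.join(root, name)
            skeleton[os.path.relpath(path, workspace_base)] = summarize_file(path)
    return skeleton
```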
@enyst, we might be talking about something similar. @SmartManoj I was pro-embedding until mentat-bot turned out useless. A real project can be of any size, and even with embedding, there's a risk of exceeding the context window. Since the context window is the amount of info the LLM can process at a time, it's better for the LLM to have a summarized whole context rather than an embedded/raw incomplete context. Embedding after summarization could have different potential, if it allows squeezing more context in without extending the context window.
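For what it's worth, a toy sketch of "embedding after summarization": embed the one-line summaries rather than the raw file contents, then retrieve files by similarity. `embed` below is a deliberately trivial stand-in for a real embedding model, just to keep the sketch self-contained:

```python
import math

def embed(text: str) -> list[float]:
    # Trivial stand-in for a real embedding model: a character-frequency
    # vector over a-z. Real systems would call an embedding API here.
    vec = [0.0] * 26
    for ch in text.lower():
        if 'a' <= ch <= 'z':
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_files(query: str, summaries: dict[str, str], k: int = 3) -> list[str]:
    """Rank files by similarity between the query and their summaries."""
    q = embed(query)
    ranked = sorted(summaries, key=lambda f: cosine(q, embed(summaries[f])), reverse=True)
    return ranked[:k]
```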
@enyst @tobitege, could you hide the comments about the integration tests? @rezzie-rich I think there is no need to send summaries of the workspace to the LLM, too. If the right files are given, it will work on them.
@SmartManoj, when you work on a project, you don't recall every line of code, do you? You just recall the file structure and an abstract idea of what is where, and since you know the abstract content of each file, you can navigate effectively. It's the same with an LLM, as it's designed after how the human mind works. From file names alone, you can't get all the info regarding a file. Codeact will access the raw content of a file when a task requires it, but it will only be able to navigate the project effectively and hit all the files related to a task with scattered methods if it has a summary of all the files. You can't fit a whole project under a 128k window, but you can definitely fit a complete summary of it. AI can generate what it needs to generate only if it has all the necessary information.
Here, the output is the relevant file names, right?
Why not? For example, if it needs info about an imported method, it can fetch that accordingly. Could you provide a small example of what these summaries would look like? Isn't the docstring of a file enough?
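As one possible answer to the docstring question, a small sketch that pulls a Python file's module docstring as its summary (an illustration, not part of this PR):

```python
import ast

def docstring_summary(path: str) -> str | None:
    """Return the module-level docstring of a Python file, if it has one."""
    with open(path, encoding='utf-8') as f:
        tree = ast.parse(f.read())
    # ast.get_docstring returns None when the module has no docstring.
    return ast.get_docstring(tree)
```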
Summary of
What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?
Closes #2838; includes the relevant file names, if any, in step 1 only.