Test Execution Environment for SweBench tasks #73

Open
345ishaan opened this issue Oct 1, 2024 · 3 comments

Comments

@345ishaan

I have recently been working on swebench, where we built a distributed eval on top of Modal for faster eval cycles. As a next step, I was hoping to use that setup to execute the patches generated by LLMs after the localization stage. I was wondering whether this is possible via the commit0 project.

Test-execution feedback and search can improve quality over Best-of-N or majority-voting approaches. As part of this idea, we need to either predict the relevant unit tests that exercise the localized files or generate unit tests using LLMs.
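
To make the comparison concrete, here is a minimal sketch of selecting among sampled patches by test-execution feedback versus plain majority voting. It is an illustration only, not code from commit0 or swebench: `run_tests` is a hypothetical callable that applies a patch in a sandbox (for example, a remote worker) and returns a mapping from test name to pass/fail, and the patch strings are placeholders.

```python
from collections import Counter
from typing import Callable, Dict, Sequence


def select_by_majority(patches: Sequence[str]) -> str:
    """Baseline: return the most frequently sampled patch."""
    return Counter(patches).most_common(1)[0][0]


def select_by_execution(
    patches: Sequence[str],
    run_tests: Callable[[str], Dict[str, bool]],  # hypothetical sandboxed test runner
) -> str:
    """Pick the patch with the highest test pass rate.

    Ties on pass rate are broken by how often the patch was sampled,
    so the method falls back to majority voting when tests cannot
    distinguish the candidates.
    """
    votes = Counter(patches)

    def score(patch: str):
        results = run_tests(patch)
        pass_rate = sum(results.values()) / len(results) if results else 0.0
        return (pass_rate, votes[patch])

    # Each distinct patch is executed once, regardless of sample count.
    return max(votes, key=score)


if __name__ == "__main__":
    # Placeholder results standing in for real sandboxed test runs.
    fake_results = {
        "patch_a": {"test_x": True, "test_y": False},
        "patch_b": {"test_x": True, "test_y": True},
    }
    samples = ["patch_a", "patch_a", "patch_b"]
    print(select_by_majority(samples))
    print(select_by_execution(samples, fake_results.__getitem__))
```

In this toy case the majority vote returns the more frequently sampled `patch_a`, while execution feedback prefers `patch_b` because it passes more tests. Which tests to run per patch is the open question noted above (predicted relevant tests or LLM-generated ones).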

cc @wenting-zhao

@wenting-zhao
Collaborator

Thank you for your interest! Our current setup does something similar: we copy the patch to Modal and run eval there. However, it is not possible to use our code directly on swebench; some code modifications are needed. If you already have a plan, I'm happy to help out, but supporting commit0 on swebench is not our priority at the moment, and I am not sure when we will get to it.

@wenting-zhao
Collaborator

Update: the integration is in progress. We will have a release next week.

@345ishaan
Author

@wenting-zhao Thanks a lot. Looking forward to trying it out and contributing.
