Test Execution Environment for SweBench tasks #73

345ishaan · 2024-10-01T21:06:25Z

I have recently been working on SWE-bench, where we built a distributed eval on top of Modal for faster eval cycles. As a next step, I was hoping to use that setup to execute the patches generated by LLMs after the localization stage. I was wondering whether this is possible via the commit0 project.

Test execution feedback and search can improve quality over Best-of-N or majority-voting approaches. As part of this idea, we also need to either predict the relevant unit tests that affect the localized files or generate unit tests using LLMs.
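To make the comparison concrete, here is a minimal sketch of how selection by test execution could differ from plain majority voting over candidate patches. Everything in it is an assumption for illustration: `run_tests` is a hypothetical callback that returns the number of passing tests for a patch, and it is not part of commit0, SWE-bench, or Modal.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(patches: Sequence[str]) -> str:
    """Pick the most frequent patch; ties go to the patch seen first."""
    return Counter(patches).most_common(1)[0][0]


def select_by_tests(patches: Sequence[str], run_tests: Callable[[str], int]) -> str:
    """Pick the patch with the most passing tests.

    `run_tests` is a hypothetical callback (patch -> number of passing tests).
    Each distinct patch is executed once; ties on the best score are broken
    by majority vote among the tied candidates.
    """
    scores = {p: run_tests(p) for p in dict.fromkeys(patches)}
    best = max(scores.values())
    top = [p for p in patches if scores[p] == best]
    return majority_vote(top)


if __name__ == "__main__":
    candidates = ["A", "A", "B", "C"]
    # Made-up pass counts standing in for real test execution.
    fake_results = {"A": 1, "B": 3, "C": 3}
    print(majority_vote(candidates))
    print(select_by_tests(candidates, fake_results.__getitem__))
```

With these made-up pass counts, majority voting picks the most common patch even though it passes fewer tests, while the test-based selector prefers a patch with the highest pass count.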

cc @wenting-zhao

wenting-zhao · 2024-10-02T20:08:10Z

Thank you for your interest! Our current setup does something similar: we copy the patch to Modal and run the eval there. However, our code cannot be used directly on SWE-bench; some code modifications are needed. If you already have a plan, I'm happy to help out, but supporting commit0 on SWE-bench is not our priority at the moment, and I am not sure when we will get to it.

wenting-zhao · 2024-10-26T21:50:51Z

Update: the integration is in progress. We will have a release next week.

345ishaan · 2024-11-01T16:43:47Z

@wenting-zhao Thanks a lot. Looking forward to trying it out and contributing.
